Merge pdf files using C#

Recently I had to create an assembly component using C# to merge multiple PDF files into one file. The specification was pretty straightforward:
1) Merge two or more PDF documents into a single output PDF file.
2) Usable on an ASP.NET page

After a few minutes of Googling, I came across iTextSharp.

iText# (iTextSharp) is a port of the iText open source Java library written entirely in C# for the .NET platform. iText# is a library that allows you to generate PDF files on the fly. It is implemented as an assembly.

Note that the assembly is coded and compiled against the .NET Framework 1.1. You might want to migrate it to version 2.0 or 3.5 of the .NET Framework.

With some more research on PDF merging, I was able to create a class that makes use of the iTextSharp assembly and performs the PDF merge operation as needed.

Here is the code:
[csharp]using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public class MergeEx
{
    #region Fields
    private string sourcefolder;
    private string destinationfile;
    private readonly List<string> fileList = new List<string>();
    #endregion

    #region Public Methods
    /// <summary>
    /// Adds a file name to the list of files to merge
    /// </summary>
    public void AddFile(string pathname)
    {
        fileList.Add(pathname);
    }

    /// <summary>
    /// Generates the merged PDF
    /// </summary>
    public void Execute()
    {
        MergeDocs();
    }
    #endregion

    #region Private Methods
    /// <summary>
    /// Merges the docs and renders the destination file
    /// </summary>
    private void MergeDocs()
    {
        //Step 1: Create a Document object
        Document document = new Document();
        try
        {
            //Step 2: Create a writer that listens to the document
            PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(destinationfile, FileMode.Create));

            //Step 3: Open the document
            document.Open();

            PdfContentByte cb = writer.DirectContent;
            PdfImportedPage page;

            int n = 0;
            int rotation = 0;

            //Loop over each file that has been listed
            foreach (string filename in fileList)
            {
                //The current file path (works with or without a trailing separator on the folder)
                string filePath = Path.Combine(sourcefolder ?? string.Empty, filename);

                //Create a reader for the document
                PdfReader reader = new PdfReader(filePath);

                //Get the number of pages to process
                n = reader.NumberOfPages;

                int i = 0;
                while (i < n)
                {
                    i++;
                    //Note: the size of page 1 is used for every page of this file
                    document.SetPageSize(reader.GetPageSizeWithRotation(1));
                    document.NewPage();

                    //Insert a named destination on the first page of each file
                    if (i == 1)
                    {
                        Chunk fileRef = new Chunk(" ");
                        fileRef.SetLocalDestination(filename);
                        document.Add(fileRef);
                    }

                    page = writer.GetImportedPage(reader, i);
                    rotation = reader.GetPageRotation(i);
                    if (rotation == 90 || rotation == 270)
                    {
                        //Rotated pages are placed with a rotation matrix
                        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
                    }
                    else
                    {
                        //Unrotated pages are placed with the identity matrix
                        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                    }
                }
            }
        }
        finally { document.Close(); }
    }
    #endregion

    #region Properties
    /// <summary>
    /// Gets or sets the SourceFolder
    /// </summary>
    public string SourceFolder
    {
        get { return sourcefolder; }
        set { sourcefolder = value; }
    }

    /// <summary>
    /// Gets or sets the DestinationFile
    /// </summary>
    public string DestinationFile
    {
        get { return destinationfile; }
        set { destinationfile = value; }
    }
    #endregion
}
[/csharp]
To use the MergeEx class:
1) Initialize the class
2) Set the SourceFolder and DestinationFile properties
3) Using the AddFile method, add the source file names that need to be merged (Filename only since the SourceFolder has already been set)
4) Call the Execute Method

If everything works fine you will find your Merged PDF Document at your stated destination file.
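
As a rough illustration, the four steps could look like the snippet below. The folder and file names are placeholders made up for this example, and MergeEx is assumed to be compiled into the same project together with the iTextSharp assembly.

[csharp]using System;

public static class MergeExUsage
{
    public static void Main()
    {
        // 1) Initialize the class
        MergeEx merger = new MergeEx();

        // 2) Set the SourceFolder and DestinationFile properties
        //    (hypothetical paths; the folder ends with a backslash)
        merger.SourceFolder = @"C:\Temp\Pdf\";
        merger.DestinationFile = @"C:\Temp\Pdf\merged.pdf";

        // 3) Add the source file names (file name only)
        merger.AddFile("first.pdf");
        merger.AddFile("second.pdf");

        // 4) Call the Execute method to write the merged file
        merger.Execute();
        Console.WriteLine("Merged PDF written to " + merger.DestinationFile);
    }
}
[/csharp]

On an ASP.NET page, the account the site runs under needs read access to the source folder and write access to the destination folder.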

The code is pretty much self-explanatory. If there are any questions, I can always be contacted via this post.

27 comments on “Merge pdf files using C#”

  1. re: http://www.wacdesigns.com/2008/10/03/merge-pdf-files-using-c/

    Thought I’d share a variation of your code I created. Given my code (below under ‘Extension method code’), you can do things like:

    var di = new DirectoryInfo( @"F:DocGenPDF" );

    using( var fs = new FileStream( @"F:DocGenbatch.pdf", FileMode.Create ) )
    {
        di.MergePDFs( fs );
    }

    using( var fs = new FileStream( @"F:DocGenbatch two.pdf", FileMode.Create ) )
    {
        di.GetFiles().Where( f => f.Name.Contains( "122" ) ).MergePDFs( fs );
    }

    var pdfFileContents = (IEnumerable<byte[]>)null; // get content files from db

    using ( var zip = new ZipOutputStream( target ) ) // use any stream type you want
    {
        var entry = new ZipEntry( "Batch.pdf" );
        zip.PutNextEntry( entry );
        pdfFileContents.MergePDFs( zip );
    }

    Extension method code:

    // Required namespaces: System.Collections.Generic, System.IO, System.Linq, iTextSharp.text.pdf

    class PdfConvertFile
    {
        public string FileName { get; set; }
        public byte[] Content { get; set; }
    }

    public static class ExtensionMethods
    {
        // Merges every *.pdf file found in the directory
        public static void MergePDFs( this DirectoryInfo source, Stream outputStream )
        {
            MergePDFs( source.GetFiles( "*.pdf" ), outputStream );
        }

        // Merges PDF files given as FileInfo objects
        public static void MergePDFs( this IEnumerable<FileInfo> files, Stream outputStream )
        {
            MergePDFs( files.Select( f => new PdfConvertFile { FileName = f.FullName } ), outputStream );
        }

        // Merges PDF documents given as in-memory byte arrays
        public static void MergePDFs( this IEnumerable<byte[]> files, Stream outputStream )
        {
            MergePDFs( files.Select( f => new PdfConvertFile { Content = f } ), outputStream );
        }

        private static void MergePDFs( this IEnumerable<PdfConvertFile> files, Stream outputStream )
        {
            var document = new iTextSharp.text.Document();
            try
            {
                //Create a writer that listens to the document
                PdfWriter writer = PdfWriter.GetInstance( document, outputStream );

                //Open the document
                document.Open();

                var cb = writer.DirectContent;
                PdfImportedPage page;

                int n = 0;
                int rotation = 0;

                //Loop over each file that has been listed
                foreach ( var file in files )
                {
                    //Create a reader from the file name if there is one, otherwise from the bytes
                    PdfReader reader = !string.IsNullOrEmpty( file.FileName )
                        ? new PdfReader( file.FileName )
                        : new PdfReader( file.Content );

                    //Get the number of pages to process
                    n = reader.NumberOfPages;

                    int i = 0;
                    while ( i < n )
                    {
                        i++;
                        document.SetPageSize( reader.GetPageSizeWithRotation( 1 ) );
                        document.NewPage();

                        page = writer.GetImportedPage( reader, i );
                        rotation = reader.GetPageRotation( i );

                        if ( rotation == 90 || rotation == 270 )
                        {
                            cb.AddTemplate( page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation( i ).Height );
                        }
                        else
                        {
                            cb.AddTemplate( page, 1f, 0, 0, 1f, 0, 0 );
                        }
                    }
                }
            }
            finally { document.Close(); }
        }
    }

  2. I have tried merging the PDFs to letter size.
    The first was merged exactly to letter size, but the second document, which was smaller than letter size, didn’t get merged to letter size.
    Can you please let me know how to merge PDFs to letter size irrespective of their size and content? Please reply. Thanks in advance.

  3. I am trying to merge 2 PDF files so that the result is a new PDF. But the content of the 1st PDF file gets copied onto the second PDF file, and the content of the second PDF file gets removed.
    Can anyone help?
    Thanks in advance.
    Praveen Kumar
    E-Bix software Pvt. Ltd.

  4. Good work, but when I run it I keep getting an unauthorized permission exception. Also, is this the right way to use the AddFile method?
    string sourcefolder= _txtEnterSource.Text;

    string destinationfile= _txtdestinationfile.Text;

    string pathnname = _txtPathName.Text;
    string pathname2 = _txtPathname2.Text;

    MergeEx m = new MergeEx();
    m.SourceFolder = sourcefolder;
    m.DestinationFile = destinationfile;

    m.AddFile(pathnname);
    m.AddFile(pathname2);

    m.Execute();

  5. @Praveen: can you provide a sample of your code and the PDF you are trying to merge? I’ll give it a try.

    @Christopher: Please make sure that the source/destination folder has the proper rights.

  6. @Daniel, I ran into some issues trying to do that too, with some big images in a PDF file. I haven’t looked into it, since the PDFs I had to merge were all standard formats. I think you will need to determine the size and then do the merging.
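
    A rough sketch of what “determine the size” could look like, assuming the goal is to scale each source page uniformly so that it fits on a Letter page (612 x 792 points) and to centre it. The LetterFit class and its member names are made up for this example and are not part of iTextSharp or MergeEx. The three resulting numbers would then be passed as cb.AddTemplate(page, scale, 0, 0, scale, offsetX, offsetY) after setting the document page size to Letter. This is untested against real documents, and rotated pages would need extra handling.

    [csharp]using System;

    // Hypothetical helper, not part of iTextSharp or MergeEx.
    public static class LetterFit
    {
        public const float LetterWidth = 612f;   // 8.5 in * 72 pt
        public const float LetterHeight = 792f;  // 11 in * 72 pt

        // Returns { scale, offsetX, offsetY } for a source page of the given size (in points).
        public static float[] Compute(float pageWidth, float pageHeight)
        {
            // Uniform scale: the largest factor for which both dimensions still fit.
            float scale = Math.Min(LetterWidth / pageWidth, LetterHeight / pageHeight);

            // Centre the scaled page on the Letter page.
            float offsetX = (LetterWidth - pageWidth * scale) / 2f;
            float offsetY = (LetterHeight - pageHeight * scale) / 2f;

            return new float[] { scale, offsetX, offsetY };
        }

        public static void Main()
        {
            // A roughly A5-sized page (420 x 595 points) as an example input.
            float[] fit = Compute(420f, 595f);
            Console.WriteLine("scale={0}, offsetX={1}, offsetY={2}", fit[0], fit[1], fit[2]);
        }
    }
    [/csharp]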

  7. I was able to sort the rotation problem…
    page = writer.GetImportedPage(rdr, i);
    rotation = rdr.GetPageRotation(i);
    if (rotation == 90)
    {
    cb.AddTemplate(page, 0, -1f, 1f, 0, 0, rdr.GetPageSizeWithRotation(i).Height);
    }
    else if (rotation == 270)
    {
    cb.AddTemplate(page, 0, 1f, -1f, 0, rdr.GetPageSizeWithRotation(i).Width, 0);
    }
    else
    {
    cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
    }

    But now fixing one problem ends up with another problem… the PDF that is generated is 3 times larger than merging the files using Adobe Reader. Any solution or reasons why?
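
    For reference, the six numbers passed to AddTemplate in the snippet above form a PDF transformation matrix [a b c d e f], which maps a point (x, y) to (a*x + c*y + e, b*x + d*y + f). The small program below is only an illustration: it uses plain arithmetic, no iTextSharp, and a made-up Letter page (612 x 792 points) to show that the 90 and 270 degree matrices both keep all four page corners inside the rotated 792 x 612 page. The class and method names are invented for this example.

    [csharp]using System;

    // Illustration only: plain arithmetic, no iTextSharp types involved.
    public static class TemplateMatrixDemo
    {
        // Applies the PDF matrix [a b c d e f] to the point (x, y).
        public static float[] Apply(float a, float b, float c, float d, float e, float f, float x, float y)
        {
            return new float[] { a * x + c * y + e, b * x + d * y + f };
        }

        public static void Main()
        {
            // Hypothetical unrotated page of 612 x 792 points; with a rotation of
            // 90 or 270 degrees the visible page is 792 wide and 612 high.
            float w = 612f, h = 792f;
            float rotatedWidth = h, rotatedHeight = w;

            float[][] corners =
            {
                new float[] { 0f, 0f }, new float[] { w, 0f },
                new float[] { 0f, h }, new float[] { w, h }
            };

            foreach (float[] corner in corners)
            {
                // Matrix used for rotation == 90
                float[] p90 = Apply(0, -1f, 1f, 0, 0, rotatedHeight, corner[0], corner[1]);
                // Matrix used for rotation == 270
                float[] p270 = Apply(0, 1f, -1f, 0, rotatedWidth, 0, corner[0], corner[1]);

                Console.WriteLine("({0}, {1}) -> 90: ({2}, {3})  270: ({4}, {5})",
                    corner[0], corner[1], p90[0], p90[1], p270[0], p270[1]);
            }
        }
    }
    [/csharp]

    The two matrices send the same corner to opposite corners of the rotated page, which is why using the 90 degree matrix for a 270 degree page leaves the content upside down.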

  8. Hi, love your code, but I am having a problem: I can’t find a place where I can write the page number and total pages,
    like this
    1/10
    Can you help me please?

  9. First of all, great tool!! I just have come across one simple issue that I can’t seem to resolve.

    I’ve got a set of documents where the page orientation changes within the individual documents.
    Doc1 page 1 – Portrait
    Doc1 page 2 – Portrait
    Doc2 page 1 – Portrait
    Doc2 page 2 – Landscape
    Doc2 page 3 – Landscape
    Doc2 page 4 – Landscape
    Doc2 page 5 – Portrait
    Doc3 page 1 – Portrait
    When I merge these, the 3 pages that are landscape end up with the page oriented as portrait, but the contents oriented as landscape. This causes the text to get cut off on the pages that should have been landscape. I can force the text to orient correctly (so printing would be correct), but that means that all of the pages are portrait, just with rotated text, which is lousy for reading on the screen. Any suggestions?

    1. Hello Franck, it’s been a while since I worked on this project.

      I’ve taken a quick look at the issue, but was unable to come up with a working solution for the time being. The thing is that the module takes the first page of each document as the page orientation. That is:

      If Doc2 – Page 1 is Portrait, it will be Portrait all throughout. If you want it to be Landscape, the first page should be Landscape. This is really just a tweak.

      Did you use the latest version of iTextSharp available? If yes, then the code in this post needs to be modified to account for the page orientation differently.
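
      One possible shape for that modification, as an untested sketch: ask the reader for the size of page i instead of page 1 before each NewPage() call, so that every page keeps its own orientation. PerPageSizeMerge and its parameter names are hypothetical; the iTextSharp calls are the ones already used in the post and in comment 7, and their behaviour may differ between iTextSharp versions.

      [csharp]using System.Collections.Generic;
      using System.IO;
      using iTextSharp.text;
      using iTextSharp.text.pdf;

      // Hypothetical helper, not part of the original MergeEx class.
      public static class PerPageSizeMerge
      {
          public static void Merge(IEnumerable<string> inputPaths, string outputPath)
          {
              Document document = new Document();
              try
              {
                  PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(outputPath, FileMode.Create));
                  document.Open();
                  PdfContentByte cb = writer.DirectContent;

                  foreach (string path in inputPaths)
                  {
                      PdfReader reader = new PdfReader(path);
                      for (int i = 1; i <= reader.NumberOfPages; i++)
                      {
                          // Use the size of the current page, not of page 1
                          Rectangle size = reader.GetPageSizeWithRotation(i);
                          document.SetPageSize(size);
                          document.NewPage();

                          PdfImportedPage page = writer.GetImportedPage(reader, i);
                          int rotation = reader.GetPageRotation(i);
                          if (rotation == 90)
                              cb.AddTemplate(page, 0, -1f, 1f, 0, 0, size.Height);
                          else if (rotation == 270)
                              cb.AddTemplate(page, 0, 1f, -1f, 0, size.Width, 0);
                          else
                              cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                      }
                  }
              }
              finally { document.Close(); }
          }
      }
      [/csharp]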

  10. Looking for a way to merge two PDFs based on content.
    E.g. 1. If the first PDF has one page with one line of content and the second PDF has some 20 rows, and a page is able to hold, let’s say, 24 lines, then the merged PDF should have one page.

    2. If the first PDF has one page with 20 lines and the second PDF has 2 paragraphs of 4 lines each, then 4 lines should get merged onto the page of the first PDF, and the merged PDF will have two pages.

  11. This library, though open source, is not free for commercial web site usage (see: http://itextpdf.com/terms-of-use/index.php). It is not a big deal to pay for something that is well intended for developers to use (and of course it costs money). Does anyone here know of other free PDF resources that can be used for merging PDF documents?

    1. Thank you Tiron for your comment. At the time the post was written in 2008, the software was still open source and free to use for commercial and personal usage. I may be wrong here, but I don’t think the new license applies to the source code from this version?

  12. In the Chunk fileRef, when it does the document.Add(fileRef) content is empty, is that supposed to be that way?

    It’s not working for me, but I’m not getting errors and from stepping through it, that’s all I can find.
    Thanks.

    1. Hello Michelle,

      Can you provide the code you are using and the type of file you are trying to merge, to see if there are any issues ?
