News

Combining PDF Files using Python

I was surprised by how easy it was to combine separate PDFs into a single file using Python. I was scanning a large document, and my scanner could not handle the entire document at once, so I had to perform multiple scans, which created multiple PDF files. I was looking for free PDF combining software when I decided to search for Python code instead.

I found sample code on Stack Overflow that uses the PyPDF2 library. The code isn’t fancy and uses listdir to get all PDF files in a common directory. I added the input directory but didn’t program a user interface, so the file names in the directory need to sort in the order you want to merge. I had 17 files, so I renamed them to 01.pdf, 02.pdf, …, 10.pdf, 11.pdf, …, 16.pdf, 17.pdf.
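The zero-padded names matter because file names sort as text, not as numbers. Here is a minimal sketch of the difference, using made-up file names rather than my actual scans:

```python
# Hypothetical file names, for illustration only.
unpadded = ['1.pdf', '2.pdf', '10.pdf', '17.pdf']
padded = ['01.pdf', '02.pdf', '10.pdf', '17.pdf']

# Text sorting compares character by character, so '10.pdf' lands before '2.pdf'.
unpadded_sorted = sorted(unpadded)   # ['1.pdf', '10.pdf', '17.pdf', '2.pdf']

# With a leading zero, the text order matches the numeric order.
padded_sorted = sorted(padded)       # ['01.pdf', '02.pdf', '10.pdf', '17.pdf']
```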

The code below was developed under Python 3.8.2 on the Windows 10 platform using the PyPDF2 library. To install the PyPDF2 library, use pip install PyPDF2.

Code

import os
import sys                          # system interface (argvs)
from PyPDF2 import PdfFileMerger    # pdf library (PyPDF2 3.x renames this class to PdfMerger)

def mergePDFs(pdfdir):
    if os.path.exists(pdfdir) :
        # os.listdir returns names in arbitrary order, so sort to get the merge order
        x = sorted(a for a in os.listdir(pdfdir) if a.endswith('.pdf'))
    
        merger = PdfFileMerger()
        for pdf in x:
            merger.append(open(os.path.join(pdfdir, pdf), 'rb'))
    
        if not os.path.isfile(os.path.join(pdfdir, 'result.pdf')):
            with open(os.path.join(pdfdir, 'result.pdf'), 'wb') as fout:
                merger.write(fout)
        else:
            print("Output file result.pdf exists. Exit without saving.")

    else:
        print('Directory "' + pdfdir + '" does not exist.')
        
if __name__ == "__main__": 
    # get pdf input directory
    pdfdir = ''
    if len(sys.argv) == 2:
        pdfdir = sys.argv[1]
        
    if pdfdir == '':
        pdfdir = input('Input pdf input file directory: ')
    
    # call  function 
    mergePDFs(pdfdir) 

I like using command line arguments, so the “main” startup code checks for an input argument that gives the PDF directory used for both input and output. If the argument is missing, the script prompts the user for it.
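The argument-or-prompt logic can also be pulled into a small function, which makes it easier to try out without running the whole script. This is only a sketch: the name get_pdfdir and its ask parameter are my own inventions for illustration.

```python
def get_pdfdir(argv, ask=input):
    """Return the PDF directory from argv, or ask the user if it is missing.

    get_pdfdir and the ask parameter are hypothetical names; ask defaults
    to the built-in input so the prompt can be swapped out when testing.
    """
    pdfdir = ''
    if len(argv) == 2:          # script name plus exactly one argument
        pdfdir = argv[1]
    if pdfdir == '':            # nothing usable on the command line
        pdfdir = ask('Input pdf input file directory: ')
    return pdfdir
```

In the script this would be called as get_pdfdir(sys.argv), with sys imported as before.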

The list x is built from os.listdir, keeping only the file names that end with .pdf. Because x holds only file names, the directory path is added when each file is passed to the append method. Once all the files have been appended, the result file is written.
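One thing to watch is that os.listdir does not promise any particular order, so it is safer to sort the names explicitly before merging. A minimal sketch of building the list with full paths, assuming a helper I am calling list_pdf_paths (not part of the script above):

```python
import os

def list_pdf_paths(pdfdir):
    """Return full paths of the .pdf files in pdfdir, sorted by file name.

    list_pdf_paths is a hypothetical helper name used for illustration.
    """
    # Keep only names ending in .pdf and sort them as text.
    names = sorted(n for n in os.listdir(pdfdir) if n.endswith('.pdf'))
    # os.path.join picks the right separator for the current platform.
    return [os.path.join(pdfdir, n) for n in names]
```

Each path in the returned list could then be opened and handed to the merger in turn.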

Error checking ensures that the input directory exists and that the output file is not already present.
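Both checks could also be run up front, before any files are opened. The sketch below assumes a helper I am calling check_merge_paths, which returns a message instead of printing it. It uses os.path.isdir, which is slightly stricter than the os.path.exists test in the script.

```python
import os

def check_merge_paths(pdfdir, outname='result.pdf'):
    """Return an error message, or None when it is safe to merge.

    check_merge_paths and outname are hypothetical names for illustration.
    """
    # The input location must exist and be a directory.
    if not os.path.isdir(pdfdir):
        return 'Directory "' + pdfdir + '" does not exist.'
    # Refuse to overwrite an earlier result.
    if os.path.isfile(os.path.join(pdfdir, outname)):
        return 'Output file ' + outname + ' exists. Exit without saving.'
    return None
```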

The next code improvement is to add a user interface where files are selected and ordered for merging.

Announcing Anyware, LLC

On May 7, 2020, Anyware, LLC, a Maryland limited liability company, was formed as a single-member business. Anyware, LLC was created to provide cost-effective computer engineering hardware, software, and firmware development services for a range of applications and platforms in today’s gig-based economy.

Anyware, LLC is also a producer of consumer-based app products. With the formation of the LLC, four apps are scheduled for release on the Fitbit App Gallery for the Fitbit Versa™, Versa 2™, and Ionic™ fitness tracker watches. These apps include:

  • Simple poker hand rankings list
  • Moon phase and illumination for any date
  • Sunrise and sunset for any date
  • Current preliminary air quality index (AQI) from closest monitoring station

Mr. Paul Shultz, Managing Member, has over 35 years of design engineering, product development, manufacturing, technical management, and executive leadership experience with commercial and military customers.