Python Sudoku Puzzle Solver using Backtracking Algorithm

When learning a new programming language, it is always beneficial to tackle a problem that keeps your interest. I always found writing a game usually does it for me. Writing computer games typically involves lots of programming aspects such as user interfaces, data structures, and algorithms.

I really enjoy sudoku puzzles. So while I was learning Python I decided to create a sudoku puzzle solver. There is a lot of Internet literature written on sudoku puzzle solving using the backtracking algorithm.

Sudoku Puzzle

Sudoku is a number placement puzzle that has become popular in the last 10 years. In a “classic” sudoku puzzle there is a 9×9 grid with 9, 3×3 sub-grids. The object is to place a number 1 through 9 in each square. Each number in a row and column in the 9×9 grid must be unique. Also each number in a 3×3 sub-grid must also be unique.

To begin, a partially solved puzzle is provided. Typically the fewer already placed numbers, the harder the puzzle. Each sudoku puzzle is designed to have only a single solution.

Backtracking Algorithm

The backtracking algorithm is a method for solving problems recursively by testing incremental solutions. If the solutions fails, then you “backtrack” the previous solution and attempt another solution.

The backtracking algorithm has been used to solve many problems including the Knight’s Tour Problem, Rat in a Maze, N Queen Problem, Subset Sum, Hamiltonian Cycle, and many others. A cool website, GeeksforGeeks, discusses the backtracking algorithm and application.

To solve a sudoku puzzle using the backtracking algorithm you place a value in the first empty square and test if that solution is valid. If the value is not valid, then the next value is tried. If the test solution is valid, then the next empty square is filled in with a value and then tested until a valid solution is found. This continues until all empty squares have a value.

Python BAcktracking Algorithm

The interface uses Tkinter buttons and creates a 9×9 grid. Code is dedicated for puzzle setup by clicking on a button to enter a known value. Some support functions were also created to check if the placed value is valid for row, column, and sub-grid (square).

The sudoku puzzle is represented as a 2-D array of integer values with 0 being empty. My Python backtracking works a row at a time. A number (1-9) is place in an empty cell and is tested for a valid entry. If the entry is not valid, then the next number is tested. When a number can be placed, the foreground color is changed to blue and the number added as text to the button (cell).

def solvePuzzle(sudoku):
    emptyFound = False
    for row in range(0,9):
        for column in range(0,9):
            if(sudoku[row][column] == 0):
                emptyFound = True
                for num in range(1,10):
                    sudoku[row][column] = num
                    if okRow(sudoku, row):
                        if okColumn(sudoku, column):
                            if okSquare(sudoku, row, column):
                                buttons[(row * 9) + column].configure(fg = 'blue')
                                buttons[(row * 9) + column].configure(text = num)
                                root.update()
                                if solvePuzzle(sudoku):
                                    return True
                                else:
                                    sudoku[row][column] = 0
                            else:
                                sudoku[row][column] = 0
                        else:
                            sudoku[row][column] = 0
                    else:
                        sudoku[row][column] = 0

                # no solution, tried all numbers but failed
                return False
                
    if not emptyFound:
        return True

The code segment below shows testing a row for valid entries. A simple array is used to accumulate the number of times a number is entered in the row. If any entry is greater than 1, then that number was duplicated and the row is not valid.

# check if row is good, number already in row
def okRow(sudoku, row):
    # check for duplicate 1-9 in this row
    check = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # count number is row
    for i in range(0, 9):
        check[sudoku[row][i]] += 1
    
    for i in range(1, 10):
        if(check[i] > 1):   # multiples of the same numbers have sum > 1
            return False
    return True

For me, most medium difficult sudoku puzzles are solved in about 10 minutes. Many of the puzzles I have solved using Python take less than 1 second using the backtracking algorithm. The longest solution for an extremely hard puzzle was about 5 minutes.

Combining PDF Files using Python

I was surprise how easy it was to combine separate PDFs into a single file using Python. I was scanning a large document and my scanner could not handle the entire document. So I had to perform multiple scans that created multiple PDF files. I was looking for free PDF combining software when I decided to search for Python code.

I found sample code on Stack Overflow that uses PyPDF2 library. The code isn’t fancy and uses listdir to get all PDF files in a common directory. I added the input directory but didn’t program a user interface so the directory files names need to sort in the order you want to merge. I had 17 files so I renamed them to 01.pdf, 02.pdf, … 10.pdf, 11.pdf, …, 16.pdf, 17.pdf.

The code below was developed under Python 3.8.2 on Windows 10 platform using the PyPDF2 library. To install the PyPDF2 library use pip install PyPDF2.

Code

import os
import sys                          # system interface (argvs)
from PyPDF2 import PdfFileMerger    # pdf library

def mergePDFs(pdfdir):
    if os.path.exists(pdfdir) :
        x = [a for a in os.listdir(pdfdir) if a.endswith('.pdf')]
    
        merger = PdfFileMerger()
        for pdf in x:
            merger.append(open(pdfdir + '\\' + pdf, 'rb'))
    
        if os.path.isfile(pdfdir + '\\' + 'result.pdf') is not True:
            with open(pdfdir + '\\' + 'result.pdf', 'wb') as fout:
                merger.write(fout)
        else:
            print("Output file result.pdf exists. Exit without saving.")

    else:
        print('Directory " + pdfdir + " does not exist.')
        
if __name__ == "__main__": 
    # get pdf input directory
    pdfdir = ''
    if len(sys.argv) == 2:
        pdfdir = sys.argv[1]
        
    if pdfdir == '':
        pdfdir = input('Input pdf input file directory: ')
    
    # call  function 
    mergePDFs(pdfdir) 

I like using command line arguments so the “main” startup function checks for an input argument that is the PDF directory for input and output. If the argument is missing then prompt the user.

A list x is created using os.listdir and files ending with .pdf. The list x only includes file names so the path is added when performing the append method. Once all the files have been appended then result file is written.

Error checking is performed to ensure the input directory exists and the output file is not present.

The next code improvement is to add a user interface where files are selected and ordered for merging.