Oracle PL/SQL Sudoku Solver

Overview

Links to Sudoku; or algorithms 1, 2, 3, or 4, or 5, userguide or download

There comes a time when the monotony of work needs a little refreshing diversion. Some of us like to relax (or frustrate) the time a way with a Sudoku puzzle which is fast becoming pandemic craze.

I enjoy solving these puzzles myself, but with a lust for a challenge, I could not resist the opportunity to develop a Sudoku puzzle solver written entirely in Oracle SQL and PL/SQL.

By developing this solver I have also demonstrated how it is possible to develop a relatively simple solution to solve a reasonably difficult problem.

Others have now joined in to help enhance the usability of the solver, with special thanks to Patrick Barel who has developed a very simple but effective Windows front-end in Delphi using Qnxo, which is now included in the download.

The solver will solve most of The Times Fiendish puzzles. The solver application does not support guessing, so you may find puzzles that it will not solve, though there is an argument  that a Sudoku puzzle should be solvable without guessing.

I must point out that I am not a Sudoku fanatic and not trying to present myself as someone who knows much about Sudoku puzzles. There are plenty of websites that do that with a huge amount of detail and attention. I am merely demonstrating how SQL can be used to solve an everyday problem.

The algorithms I have used in my solution were inspired by a very interesting article on Sudoku puzzle solving written by Paul Stephens, a UK Computer Journalist.

The Concepts

The approach used to solve a Sudoku puzzle with a computer is not as complex as you might think. However, it is probably not one you would necessarily contemplate using to solve the puzzle manually, or at least until you get stuck.
The approach is based on determining all possible candidates for a particular cell and then applying some basic rules and pattern matching to eliminate redundant candidates. As candidates get removed, remaining candidates become certainties, either by being:

The remaining candidate in a cell.
The candidate with the last remaining value in a box, row or column.

So the algorithms are doing two jobs; identifying certainties; and eliminating candidates. It is a just a case of iterating through these two steps applying the available algorithms, until no more candidates remain.
It is worth stressing that the solver does not support guessing, so some tricky puzzles may not be solvable, though there is an argument that a Sudoku puzzle should be solvable without guessing.
Solution Design

I have chosen SQL and Oracle PL/SQL to solve the puzzle, firstly because it is the programming language I use every day, and secondly SQL is probably the most suited language for implementing set theory solutions. The solver is implemented into a single stored procedure supported by a number internal procedures and functions. In hind sight, it would have been probably more correct to have developed it as a package, but since it is not really loosing out on the benefits gained by packages, I have kept it as a stored procedure, unless it evolves into something more sophisticated.

The algorithms are written purely in SQL, and PL/SQL is only used to control the iterations and solution output. Any other procedural language could be used to implement this solution, such as java, vb, php, Delphi, etc, with the same SQL embedded. However, the SQL itself is Oracle specific (mainly because of the DECODE function), but may adapt quite easily to other flavours of SQL if they can support the same degree of sub-querying.

Being a SQL solution the design consists of some database tables. It is a very simple design with one table holding dynamic data i.e. the answers table, and the others hold static reference data such as cells and combinations.

Puzzle Numbers - This table holds a list of all the possible numbers in a cell. For a standard Sudoku puzzle, this is only 9 rows holding values 1 to 9.

Cells - This table holds an entry for each cell in the puzzle using row id (horizontal line of 9 cells) and column id (vertical line of 9 cells) as the primary key, and box id (3x3 cells) as an attribute, e.g. cell 2,3 is in box 1; cell 6,6 is box 5, etc.

The table only holds 81 rows of data for a standard Sudoku puzzle. The table definition is:

Name                    Null?    Type
----------------------- -------- ----------------
ROW_ID                  NOT NULL NUMBER
COL_ID                  NOT NULL NUMBER
BOX_ID                  NOT NULL NUMBER

Combinations - This table only contains static data and used for the set based algorithm. It lists every member number of a given combination set. A combination set is a group of unique numbers that form a set with n members, e.g. numbers 3, 5, and 7 belong to the set "357", and 5, 6, and 8 all belong to the "568" set, 4 and 6 belong to the "46" set and so on. Since these are not permutations, i.e. where the order of the numbers is significant, set "234" is the same as "432", so in order to keep set identifier consistent, it always represented with numbers in sequence, i.e. use "234" rather than "432".

Name                    Null?    Type
----------------------- -------- ----------------
SET_ID                  NOT NULL NUMBER
PUZZLE_NUMBER           NOT NULL NUMBER SET_SIZE                NOT NULL NUMBER
 

The primary key of this table is the combination of set id and member number. The set size is there to filter on combination sets of a particular size. These are some examples of the combinations table entries:

 

  Set Id Puzzle Number Set Size  
  34 3 2  
  34 4 2  
  123 1 3  
  123 2 3  
  123 3 3  
  124 1 3  
  124 2 3  
  124 4 3  

There is an entry for every possible set of combined numbers, from sets of 2 numbers through to 5 numbers, i.e. 12 through to 56789.

Answers - This table is the only dynamic table and holds the starting numbers to the puzzle plus all initial candidates to any cells not containing a solved number. 

Name                    Null?    Type
----------------------- -------- ----------------
PUZZLE_ID               NOT NULL NUMBER
STEP_NO                 NOT NULL NUMBER
ROW_ID                  NOT NULL NUMBER
COL_ID                  NOT NULL NUMBER
PENCIL_MARK_IND         NOT NULL NUMBER
ANSWER                  NOT NULL NUMBER
BOX_ID                           NUMBER

Candidates are referred to as pencil marks, where a pencil mark goes through several states as indicated by the pencil_mark_ind flag, which has the following values:

  -1 The pencil mark has been rubbed out.  
  0 The pencil mark is now a certainty (all starting numbers are set to this value).  
1 Normal pencil mark setting. i.e. candidate.
  2 A temporary pencil mark setting identified by an algorithm step. This can then be used to identify other candidates to be rubbed out.  

 

 


The initial puzzle clues (which have a pencil mark = 0) are loaded via an Excel spreadsheet that I have put together which generates a series of insert statements from numbers input into a grid. The data is inserted into a staging table called puzzle_load, and copied to the answers table during the execution of the solver.

Algorithms

The solver only uses four different algorithms.  A typical "The Times Fiendish" Sudoku puzzle usually needs to use all of the algorithms to be solved, however "The Times Difficult" can be solved usually with just using the first two.

The algorithms are split into two categories - those that identify certainties in cells (first two) and those that eliminate candidates (last two).

The algorithms are explained in more detail in the following pages:

1. Singles - Cell

2. Singles -Box

3. Cross Hatching

4. Partial Members Set

5. Full Member Set

The iterations of these algorithms are not applied strictly in the sequence shown above. The Singles algorithms are always applied when any candidates have been removed by the other algorithms. Since the singles algorithms can remove candidates themselves, the algorithm is reiterated until no more certainties are identified. Then the next algorithm in sequence can be applied and iterated continuously until no more candidates are eliminated, followed by the singles algorithms again.

Conclusion

This Sudoku Solver solution has resulted in a fairly simple design and very little coding, and demonstrates how powerful SQL is in solving complex problems. It is far too easy for a programmer to go off and develop a solution using a procedural based programming language, one that they may be more familiar with or one they consider more in fashion with the latest technological preachings, than attempting to push the programming logic into SQL. I am not saying all problems are better solved with SQL, but I often find solutions far more complex than they need to be, mainly because of the reluctance to use SQL efficiently and effectively, or reluctance to accept that SQL is a superior programming option.
 


webmaster@db-innovations.co.uk : Copyright 2005 - 2010 Database Innovations Ltd : Last modified: 27/08/10