Generating Sudoku Variations

Generating and solving Sudoku puzzles often uses a programming technique developed in the 1960s known as ‘dancing links’. Conceptually, the technique creates a large set of ‘constraints. As an example, ‘the top row contains the digit one in one of those nine cells’ is one constraint, ‘the top row contains the digit two in one of those nine cells’ and so on for each digit in each row, column and 3x3 box. The final set of constraints is that each cell contains a digit. The program then makes an assumption and sees if that leads to each constraint being true when all the digits are filled in. If the assumption leads to a problem, found when a constraint is no longer possible, then that assumption is discarded and another one tries.

Dancing links worked fine for Trividokus since we just had to add another constraint for the Sudoku cells. Sudoku Islands also worked since each island was also just another set of nine constraints.

With Stars & Stripes Sudoku, we went from 9x9 to 12x12, where you have the digits 1-9 and three stars. This was a little trickier, but not too bad. The new constraints were now ‘the top row contains THREE stars’, a slight change from a single digit, but it still was within the framework.

In Sudoku Atlantis, we had to switch to a new technique. Some of the new cell requirements – even, odd, greater than 6, less than 5 and so on could have been processed. We just had to remove the invalid digits, but our ‘magic 5’ requirement made it ugly. With the magic 5, any space with a five filled in did not ‘allow’ larger numbers immediately above it in the grid. Since this requirement was not static, it only came into being when a ‘5’ was filled in, there was sensible constraint for it.

Keeping the concept of constraints, we inverted the process. Each cell in the grid was now attached to a list of constraints. For example, the upper left cell is attached to three ‘Standard9’ constraints. Each of these constraints is tied to nine grid cells and when asked to fill in a digit, goes into the grid and ‘writes in’ the digit, then erases that as a possibility for the remaining eight spaces. If while doing this, it finds a space can no longer have any digits, then we know there’s a problem with this assumption and we undo our changes and try something else. If erasing a digit leaves exactly one possibility, then we recursively treat that as a new assumption and continue from there.

In addition to the standard nine digit constraints, we then add in our variants. A space that only allows even digits has a very simple constraint. If you try to fill in an odd digit it returns a fail. It’s actually a little more efficient than that, but that’s the gist of it. For a constraint like the ‘magic 5’ we put in for Atlantis, every cell has that constraint which ‘know’ the adjacent spaces and, if a 5 is filled in, we check above; likewise, if a 6-9 is filled in, we check below.

From our perspective, generating, solving and validating puzzles are all based on the same process. When generating, first set up a grid and say that it’s possible for each cell to have the digits 1-9 in it. Next, we apply the constraints for each cell. For variants, we use an app to lay out the constraints for the puzzle. Maybe we want these three spaces to have only even digits and these three to have only odd. These constraints, when applied, remove the odd/even digits as possibilities from the cells they’re in.

Generating a puzzle always starts by picking a random spot and randomly selecting one of the possible digits. When we put that digit in, we then remove that digit as a possibility from the other cells in that row, column and box, as well as anything else the constraints for that cell may require.

We then pick another cell and try one of the possible digits there. If it leads to a contradiction, that is some cell now has all possible digits erased, we know that digit can’t be placed there so we try another. If we run out of digits to try, we then know the previous assumption was wrong. This process continues until we’ve placed digits in every space in the grid. Normally, this process is very fast, although there have been times when we’ve messed things up in the layout and the puzzle is impossible to generate. As an example, if we accidentally put five ‘even only’ constraints in a column, the puzzle is impossible to generate.

Once we have a puzzle generated, it’s time to start erasing. The goal here is to erase enough digits so that the puzzle still has exactly one solution – and isn’t too incredibly hard. A normal Sudoku can go down to about sixteen ‘givens’ (the digits pre-printed in the puzzle) before it becomes ambiguous. Our puzzles can get down to eight digits, although they’re likely to be too difficult to be any fun to solve.

The process for erasing is always the same. We randomly pick a space and erase it, then feed the resultant puzzle into our solver. The solver then uses the constraints and finds ALL possible solutions for the puzzle. If it finds more than one solution, then we know that erasing the cell made the puzzle ambiguous. Ambiguous is bad – when you solve a puzzle the solution you get should always agree with the one in the back of the book.

As a side effect of the solving, we also determine how hard the puzzle is. When erased, we go through and reduce the cells to having exactly the possible set of digits that can go into the cell. We apply a formula to these cells. A cell that has only one possibility (even though it’s erased) has a very small weight towards the difficulty; a cell that has nine possibilities puts in a very large weight. Since we prefer to have the ‘easier’ puzzles first in the book, ending up with the toughest, this gives us a way to measure that.

Once the puzzle is done and saved into our database, it’s time to publish. Our constraints play another part here, since the other effect the interesting variants have – such as even and odd – is that we usually drop a graphic into the cell effected. The ‘even’ constraint drops a faded “E” into the even cells, ‘odd’ drops a faded “O” into the cell.

Before we publish, though, we run a validation step. While it’s ‘impossible’ for bad puzzles to be generated, we still do another check of every puzzle. Why? Because sometimes coding changes are made here and there that have unexpected side effects.

Some people call them ‘bugs’, but side effects just sounds better.