A couple of years ago I was introduced to functional programming. At first it sounded bizarre and totally impractical, how could you ever program anything without using variables and state!
But the more I read the more intriguing the concept sounded. In a purely functional language there are no side effects. That means that if you call a function with the same parameters it will always return the same result. So coding in this style would reduce bugs and the need for unit tests because functions don’t depend on state. And that’s the reason I started learning Haskell.
I found this Learn you a haskell to be a great introduction to Haskell. And I also bought Real World Haskell a book which focuses on solving real world problems instead being purely academic. Which what some claim is the problem of functional programming.
Anyhow I wanted to get my hands dirty and see if I could do a little project in Haskell using my current knowledge. And a Sudoku solver seemed like the natural choice because there is a lot of lists involved. And lists are the default data structures in FP so its good practice.
But first a word of warning: I’m not a good Haskell programmer, and even though my solver works I don’t think this is how a professional Haskell programmer would do it. If you read this post and you can improve my code please leave me a message. Your help is appreciated.
To solve Sudoku’s we must first define what the puzzle is. Sudoku’s are 81 cells and each cell belongs to 3 elements: a column, row and block.
The image above depicts these elements. Blue represents block 0, Red column 4 and Green row 4. Note that I start counting at 0 you’ll see why later. The goal of Sudoku is to put the numbers 1 to 9 in each element. Human’s have developed different strategies such as Hatching, Scanning and nishio. But they are all variants on the same solution.
To solve a cell we need all of the cells possibles. We get the possibles by first getting the impossibles: Lets take cell 5 of the Sudoku puzzle in the image above. Its the empty cell in row 0 next to the 7. The impossibles of a cell are the numbers already filled in in that cells row, column and block. Lets take the block impossibles first: these are [1, 5, 7, 9]. The impossibles for its row are [3, 5, 7] and for its column [3, 5, 9]. If we combine all the impossibles of cell 5 we get [1, 3, 5, 7, 9] duplicates excluded.
Now that we have the impossibles we can get the possibles. The possibles are all numbers from 1 to 9 that are not in the impossibles list. That means the possibles for cell 5 are [2, 4, 6, 8]. All we need to do now is pick the right number from our possibles list. We do this by comparing each possibility to all the possibilities of each of the cells elements. If we find that a possibility of the cell we are trying to solve is unique in one of its elements. Then we have found the correct number. I know that this sounds confusing so I’ll show you a detailed example.
Lets take cell 5 again as our example, we know its possibles are [2, 4, 6, 8] lets see if any of these numbers are unique possibilities in the Block, Row or Column.
The column of cell 5 is column 5 and column 5 has the following positions: [5, 14, 23, 32, 41, 50, 59, 68, 79]. But because we are solving cell 5 we don’t need to get cell 5′s possibles. So we must check the possibles of cell [14, 23, 32, 41, 50, 59, 68, 79]. And if any of these cells have the same possibles as cell 5. That possible is not a unique solution anymore so we can’t say without certainty that that number is the solution.
So lets see if any of the possibles are unique possibles in the column. Our possibles are [2, 4, 6, 8] the possibles of cell 14 are [2, 4]. That means our [2, 4] are no longer unique so there out. That leaves use with [6, 8]. So lets check them against the possibles of cell 32, those are [1, 4, 7] this time there not in cell 5′s possible list so nothing drops.
Now on to cell 50 because 41 is not an empty cell. 50′s possibles are [1, 4] again nothing drops. Cell 59 is  so nothing drops again but cell 77 is different those possibles are [2, 6]. That means that cell 5′s [6, 8] drops number 6 leaving only 8. And that means 8 is the only possibility left. So 8 must be the number belonging in cell 5. Logical deduction is what they call it.
Now if the number 8 had not been a unique possibility in the column the same process would have been repeated in the block and row of cell 5. Had there been no unique possibility then no solution could be found. Then we would leave the cell open and try the next cell. As the grid got fuller the number of possibles would have been reduced. And then it could be solved at a later point in time.
And that’s how we solve a Sudoku using logic alone. This process is far to tedious for us humans because we can see within a second that the answer is 8. This is because our eyes cross hatch the number 8 . But because my computer has no eyes and I would have no clue on how to program a computer to do this. This method will have to do.
And as a side note my solver won’t ever guess an answer by using random numbers. The upside is that will never have to check if the puzzle is still integer. The downside is that it can only solve satisfactory puzzles. But since Sudoku is a game of logic using random numbers is a blasphemy anyway.
Converting the solution to Haskell
The sudoku grid
First of all we are going to need some way to get all the locations from a block, column and row. We are going to need them to get all the possibles and impossibles for a cell.
And remember that the grid has 80 cells starting at 0 and that each cell is a member of a column, block and a row.
Lets start with the column:
Int -> [Int]
column n = rest n 0
where rest _ 9 = 
rest n i = n : rest (n + 9) (i + 1)
Column takes an Int and returns a list of Int’s. The function is pretty simple just add n to a list an then add (n+9) to the list and repeat this 8 times. So if you evaluate column 0 it evaluates to 0:(9:(:18:(27:(36:(45:(54:(63:(72:())))))))) which is an alternative for [0, 9, 18, 27, 36, 45, 54, 63, 72].
Row is pretty easy to:
row :: Int -> [Int]
row n = startpoint : rest startpoint 1
where startpoint = (n * 9)
rest _ 9 = 
rest n i = n + i : rest n (i + 1)
The formula for row is: the row you want times 9 as the first item in the list. And then add (n + i) to the list 8 times. Lets take row 7 as an example. 7 * 9 = 72 so 72 will be the start point of the list. Then it adds 71 + 1, 72 + 1, 73 + 1 etc to the list. And this results in [71, 72, 73, 74, 75, 76, 77, 78, 79, 80].
Block is a bit harder:
block :: Int -> [Int]
block n = rest startpoint 0 0 0
where startpoint | n <= 2 = n * 3
| n <= 5 = 27 + (n * 3 - 9)
| n <= 8 = 54 + (n * 3 - 18)
rest _ _ _ 9 = 
rest startpoint x n i = number : rest startpoint nextX nextN (i + 1)
where number = startpoint + n + x
nextX = if n == 2 then x + 9 else x
nextN = if n == 2 then 0 else n + 1
To calculate the cells in a block we need the blocks startpoint. The startpoint is the up most left cell in the block. The startpoint for block 0 is 0 and for block 1 its 3 and block 3 is 6. So the first 3 blocks has the formula: n*3 for the startpoint. The blocks more than 2 but less than 5 need a adjustment of plus 27. So does more than 5 but less than 8. I chose to use guards to implement this. It keeps things clean in my opinion. Now that we have the startpoints for each block we need to get the other cells. We need to get (startpoint + 1) (startpoint + 2) (startpoint + 3) for the first row of the block. For the second row we need (startpoint + 9) (startpoint + 9 + 1) and (startpoint + 9 + 2). The third row is (startpoint + 18) (startpoint + 18 +1) and (startpoint + 18 +2). But instead of describing it overly verbose as I just did I used recursion. Each time number is (startpoint + n + x) where x is the adjustment for the row and n is the column of the block. When n == 2 the row moves up by adding 9 and n is reset to 0. And this a total of 9 times to get every number in the block. Now that we can query for every location of a block, row and column. We need to be able to get the column, row and block for a cell. So I made 3 functions to search for a cell in a block, column or row. These functions take a cell number and return the element the cell belongs to. Lets start with searchBlock:
searchBlock :: Int -> [Int]
searchBlock x = search x 0
where search _ 9 = error "Could not find block"
search x n = if x `elem` blk then blk else search x (n + 1)
where blk = block n
SearchBlock takes an Int and returns a list of Ints. The list of Ints is the block of the cell. The function again is recursive and tries to find the cell number in each block. The block function we wrote before is used by this function to generate the blocks. If the cell is an element of the current block than it will return that block. If its not an element it calls itself with (n + 1) so the next block is checked for the cell. And because there are only 8 blocks. The function will fail if n equals 9 and raises an error.
Note that I use the infix notation by using back ticks on the function elem. The function elem checks if the first parameter is part of the second parameter and returns a Bool.
searchRow is almost identical:
searchRow :: Int -> [Int]
searchRow x = search x 0
where search _ 9 = error "Could not find row"
search x n = if x `elem` rw then rw else search x (n + 1)
where rw = row n
And so is searchColumn:
searchColumn :: Int -> [Int]
searchColumn x = search x 0
where search _ 9 = error "Could not find column"
search x n = if x `elem` col then col else search x (n + 1)
where col = column n
You could argue that I should have written a function that would take an element as an argument. That way we’d only have 1 function in stead of 3. The reason I did not do this is clarity, like I said before I’m not a good Haskell programmer. This way I’ll understand my own code after a year.
Impossibles and Possibles
The next step is to be able to calculate all the possibles for a cell. And in order to do that we need all the impossibles. But first we need a way to get a cells block, row and column. We’ll create a function that uses searchRow, searchColumn and searchBlock and combines their results.
This function is named searchCell:
searchCell :: Int -> [Int]
searchCell n = sort $ nub $ rw ++ col ++ blk
where rw = searchRow n
col = searchColumn n
blk = searchBlock n
As you can see the function takes n and uses that to add (++) the results of the search functions. You’ll also see sort and nub. Nub apparently means “essence” and it removes duplicates. Sort is used to make the result more appealing for us humans. For some reasons sorted list are easier to debug in my case.
Now that we can get the locations of importance for solving a cell. We can use these locations to find all impossibles. Remember that the impossibles are all the values that are already filled in in a cells row, column and block.
But before I explain “impossibles” I must add that to test my code I created a variable called grid. Grid is a list of 80 elements representing the puzzle:
grid = [5, 3, 0, 0, 7, 0, 0, 0, 0,
6, 0, 0, 1, 9, 5, 0, 0, 0,
0, 9, 8, 0, 0, 0, 0, 6, 0,
8, 0, 0, 0, 6, 0, 0, 0, 3,
4, 0, 0, 8, 0, 3, 0, 0, 1,
7, 0, 0, 0, 2, 0, 0, 0, 6,
0, 6, 0, 0, 0, 0, 2, 8, 0,
0, 0, 0, 4, 1, 9, 0, 0, 5,
0, 0, 0, 0, 8, 0, 0, 7, 9]
The 0′s in grid represent empty cells that need to be solved.
The function impossibles:
impossibles :: Int -> [Int] -> [Int]
impossibles n puzzle = if zero then getImp else [1..9]
where zero = isZero n puzzle
getImp = sort $ nub $ filter (/= 0) (getValues puzzle (searchCell n) 0)
where getValues  _ _ = 
getValues (x:xs) cells i = if i `elem` cells then x : nextVal else nextVal
where nextVal = getValues xs cells (i + 1)
This is the first function that actually does something with the puzzle. The puzzle parameter is a reference to the puzzle in its current state. The impossibles function also checks if the cell is 0 or not. If the cell is zero it will get its impossibles. If the value of the cell is not zero then its already filled in and that means every number from 1 to 9 is impossible.
The getImp function does the actual work. It takes the locations from searchCell and finds the values for these locations in the puzzle. This is done by looping through the puzzle with a counter: i that increments each time we loop. If i is part of cells the x should be included in the list. Otherwise just ignore it and skip x. If the puzzle is empty will end the list by appending .
The function also sorts and nub’s the list that is returned.
The isZero function is almost the same as getValues from impossibles:
isZero :: Int -> [Int] -> Bool
isZero n puzzle = zero puzzle 0
where zero _ 81 = error "out of bounds"
zero  _ = True
zero (x:xs) i = if i == n then x == 0 else zero xs (i + 1)
Again we loop through the puzzle and if i == n where n is the cell. We check if that cell is 0 by checking x. If i is 81 we’re out of bounds and it will throw an error.
Now that we have the impossibles its time to get the possibles for a cell:
possibles :: Int -> [Int] -> [Int]
possibles n puzzle = delete [1..9] (impossibles n puzzle)
where delete  _ = 
delete (x:xs) imp = if not (elem x imp) then x : nextDel else nextDel
where nextDel = delete xs imp
If x from [1..9] is not in the impossibles list its a possible so add it to the list. The function continues until it runs out of numbers. If there are no impossibles every number gets added. If nothing is possible nothing gets added and it returns an empty list.
Now that we have the possibles for a cell we can start solving cells:
solveCell :: Int -> [Int] -> Int
solveCell n puzzle = solve (possibles n puzzle)
where solve  = 0
solve (v:vs) | not (v `elem` (pos searchBlock)) = v
| not (v `elem` (pos searchRow)) = v
| not (v `elem` (pos searchColumn)) = v
| otherwise = solve vs
pos f = pos (filter (/= n) (f n))
where pos  = 
pos (x:xs) = possibles x puzzle ++ pos xs
This function gets all the possibles for a cell and loops through each possible. When a possible is not a element of the possibles of a block, row or column then its a unique possible. And that means that the solution to the cell is that unique possible.
We get the possibles of an element by getting all the possibles from each cell of that element. Excluding the cell we are currently solving. Thats what the filtering on (/= n) in pos is for. The pos function adds every possible of the element together in a list.
We can now solve cells lets solve the entire puzzle:
solveSudoku :: [Int]
solveSudoku = solve grid 0
where solve puzzle 5000 = puzzle
solve puzzle i = if all (/= 0) puzzle
else solve (loop puzzle puzzle 0) (i + 1)
loop _  _ = 
loop puzzle (0:xs) n = solveCell n puzzle : loop puzzle xs (n + 1)
loop puzzle (x:xs) n = x : loop puzzle xs (n + 1)
I cheated a little here because I’ll always enter the grid value in this function. To remind you: grid is the list representing the puzzle. The solve function tries to solve the puzzle at least 5000 times. The solver can only solve satisfactory puzzles meaning without randomly guessing. So if the solver can’t solve it in under 5000 times it must be a broken puzzle. If that is the case we’ll just display how far we could get. The puzzle stops solving if all the elements are not 0. Because that means there are no empty places and therefore we solved the puzzle. If its not solved we’ll call loop: a function that loops through each cell and tries to solve it.
Loop has two puzzle’s as parameters. One to keep track of the whole puzzle which the solveCell function needs. And the other to loop recursively through. If the value of the cell is 0 we’ll try and solve it and append the solution to the list. Remember that the solution can also be 0. If the value of the cell is not 0 in the function loop. Then that cell is solved already so we’ll just add it to the list. If the whole puzzle is looped through will return the list to the solve function.
solveSudoku returns a plain old list. And that is not very readable. So I decided to create a prettyPrint function to address the problem. Now this function is very hairy and I’m the first person to admit it, but it works.
prettyPrint :: IO ()
prettyPrint = do
myprint solveSudoku 0
where myprint  _ = do
myprint grid 3 = do
myprint grid 0
myprint grid n = do
print (line (take 9 grid) "|" 0)
myprint (drop 9 grid) (n + 1)
line  str _ = str ++ " |"
line x str 3 = line x (str ++ " |") 0
line (x:xs) str n = line xs (str ++ " " ++ (show x)) (n + 1)
verticalLine = replicate 25 '-'
The function returns IO () so its not a pure function. It starts by printing a verticalLine of 25 ‘-’ characters. Then it will call myprint, a function that needs a solved sudoku and a number to keep track of the iterations. If the puzzle is empty it prints a vertical line to “close” the box. If the iteration is 3 will draw another vertical line and reset the iterator. If the iterator is not 3 will print a line.
The line function takes a list and prints each element of that list. But every 3 prints it will add a “|” character creating the blocks of the sudoku puzzle. At the end of the line functions recursion it adds one final pip.
Note that “myprint grid n” uses take and drop to iterate through the sudoku puzzle.
And the result of prettyPrint is:
"| 5 3 4 | 6 7 8 | 9 1 2 |"
"| 6 7 2 | 1 9 5 | 3 4 8 |"
"| 1 9 8 | 3 4 2 | 5 6 7 |"
"| 8 5 9 | 7 6 1 | 4 2 3 |"
"| 4 2 6 | 8 5 3 | 7 9 1 |"
"| 7 1 3 | 9 2 4 | 8 5 6 |"
"| 9 6 1 | 5 3 7 | 2 8 4 |"
"| 2 8 7 | 4 1 9 | 6 3 5 |"
"| 3 4 5 | 2 8 6 | 1 7 9 |"
And thats how I made a Sudoku solver in Haskell. There are probably a million ways to speed up my solver, so if you spot any please let me know.