## Spell checking with F#

Some months ago I have written an introductory article on F# for the "IoProgrammo" magazine (sorry, Italian only!) and now I have published a second article on the latest issue of the same magazine.

This new article covers more advanced topics and is focused on writing a basic spell checker that mixes together the functional and object-oriented programming paradigms.

The spell checking algorithm is implemented in functional F# and is based on the Jaro-Winkler similarity distance while the UI is WPF-based and written with OO code.

I hope you will appreciate the article and I'll be very happy to get any feedback from the readers.

## F# introductory article on IoProgrammo magazine

If you understand Italian, you may be interested in reading an introductory article on F# I wrote for the most important Italian programming magazine called "*IoProgrammo*" and that was published a few days ago in the August issue.

The article covers the very first steps with F#, from installation to writing a first working sample application, which can be used to encrypt/decrypt messages using the Vigenere cipher.

I hope you can find useful, if you have any comments I'll be glad to hear from you.

## Il mio articolo su IoProgrammo

E' uscito da pochi giorni il numero di Agosto di IoProgrammo, la rivista di informatica più importante in Italia, e al suo interno, per la prima volta, è possibile leggere **un articolo del sottoscritto dedicato ad F#**, il nuovo linguaggio di programmazione funzionale di Microsoft.

L'articolo è lungo 5 pagine ed ha un taglio introduttivo, allo scopo di diffondere il **paradigma di programmazione funzionale**. Non manca comunque una prima applicazione di esempio, che si occupa di cifrare e decifrare dei messaggi sfruttando il Cifrario di Vigenere.

L'argomento è sicuramente attuale ed interessante, tanto che sto già lavorando ad **un secondo articolo** che tratti dei concetti più avanzati.

Mi auguro che lo possiate apprezzare e sarò lieto di ricevere i vostri commenti.

## Project Euler in F# – Problem 28

Sometimes, when I find a problem like Project Euler problem 28, I remember that human beings are smarter than computers.

Let's read the problem statement, it doesn't seem easy at first glance:

Starting with the number 1 and moving to the right in a clockwise direction a 5 by 5 spiral is formed as follows:

21 22 23 24 25

20 7 8 9 10

19 6 1 2 11

18 5 4 3 12

17 16 15 14 13

It can be verified that the sum of both diagonals is 101.

What is the sum of both diagonals in a 1001 by 1001 spiral formed in the same way?

If we don't think of anything *creative*, we are forced to write an algorithm to draw that 1001x1001 number spiral and then sum both diagonals together.

However, we can start noticing that the number in the top-right position is always *n*^2, where *n* is the length of one side of the spiral. In the example, the spiral is 5 by 5 and the top-right value is 25. In a 3 by 3 spiral it would be 9, 49 in a 7 by 7 spiral and so on.

If we concentrate on the outer ring of the spiral, then the inner sum can be obtained by recursion, so we only need to find a trick for the top-left, bottom-left and bottom-right numbers.

They are 21, 17 and 13 in a 5x5 spiral, so it seems like they can be computed starting from the top-right number and repeatedly subtracting 4 (i.e. *n - 1*).

Let's check this theory with the 3x3 spiral: the top-right number is 9, the other corners are 7, 5 and 3. We are indeed subtracting 2 (i.e. *n - 1*) each time!

Let's summarize: given a *n* x *n* spiral we have to sum *n^2*, *n^2 - (n-1)*, *n^2 - 2(n-1)* and *n^2 - 3(n-1)*.

Some elementary math and we get *4n^2 - 6n + 6* for each ring of the spiral.

The base case of the recursion is the 1x1 spiral and in that case the (degenerate) diagonal sum is 1.

Given these premises it is very easy to write the F# code to implement the algorithm:

let rec euler28 n = match n with | 1 -> 1 | _ -> euler28 (n-2) + (4 * n * n) - (6 * n) + 6

An improvement of the code above would be using *tail recursion*, which means that the last operation of the function should be a recursive call. This leads to a reduction of the stack space used from *O(n)* to *O(1)* and let us avoid stack overflow exceptions:

let tail_euler28 n = let rec spiral n sum = match n with | 1 -> sum + 1 | _ -> spiral (n - 2) (sum + (4 * n * n) - (6 * n) + 6) spiral n 0

As it is usually done in *tail recursion*, we define an *accumulator* parameter (*sum* in the example) which is used to store the partial result of the computation, eliminating the need to keep the state on the stack.

In both solutions I've left a bug that can lead to an infinite loop. Can you spot it?

## Luhn algorithm in F#

The Luhn algorithm is a simple error detection formula based on the modulus operator which is widely used to validate credit card numbers or Canadian Social Insurance Numbers.

This checksum function can detect any single-digit error and operates verifying the number against its included check digit.

The algorithm is made of three steps:

1) starting from the rightmost digit and moving left, double the value of every second digit (or digits in even positions);

2) sum all resulting digits together with the original ones that were untouched;

3) if the result is a multiple of 10 (i.e. the result modulus 10 is zero) then the input value is valid.

This simple algorithm can be implemented with a few F# lines of code:

#light open System let double_digit n = let double = n * 2 if (double > 9) then double - 9 else double let rec luhn_loop isEven acc input = match input with | [] -> acc % 10 = 0 | x :: xs -> let num = Int32.Parse(x.ToString()) if (isEven) then luhn_loop (not isEven) (acc + (double_digit num)) xs else luhn_loop (not isEven) (acc + num) xs let luhn n = n.ToString() |> Seq.to_array |> Array.rev |> Seq.to_list |> luhn_loop false 0

I separated from the main loop the *double_digit* function, which takes a digit and multiplies it by two. If the result has two digits (i.e. it is greater than 9) with sum together the two digits (i.e. subtract 9 from the number), otherwise we simply return it.

The *luhn_loop* function iterates over the list of digit that is obtained by taking the original number, converting it into an array of digits and reversing it.

Inside each iteration we take into account a single digit, double it if it is placed in an even position and sum it to the accumulator.

When the list of digits to be examined is empty we are done with the loop and we just have to check if the value stored in the accumulator is evenly divisible by 10.

## Project Euler in F# – Problem 52

After a long break, let's resume with the **Project Euler** series.

The next one to be solved is Problem 52:

It can be seen that the number, 125874, and its double, 251748, contain exactly the same digits, but in a different order.

Find the smallest positive integer, x, such that 2x, 3x, 4x, 5x, and 6x, contain the same digits.

As usual, we have to break the problem into small pieces and solve these pieces one by one.

As in many other Project Euler problems we have to extract the digits of a number, and this is done by converting the number into a string (i.e. an array of characters) and then casting the characters to integers.

We also need a function to check if two numbers are made of the same digits. For each number we create a set from the sequence of its digits, in order to ignore the order of the items and discard duplicates. Then we apply the standard set comparison operator to get the result.

We have to multiply each positive integer by 2, 3, 4, 5 and 6, so we can define the *multiples* function, which takes an integer *x* and returns a list containing five elements, corresponding to *2x*, *3x*, *4x*, *5x* and *6x*.

Thanks to these three functions, checking if a number and its multiples contain exactly the same digits is very easy. In fact, we generate the list of multiples and test all of them with the *same_digits* function.

The algorithm to get the answer to the problem is the following: generate an infinite list of integers (remember *seq_unfold*?), test each number with the *check* function and return the first value that passes the test.

#light let digits n = n.ToString() |> Seq.map (string >> int) let same_digits a b = Set.equal (Set.of_seq (digits a)) (Set.of_seq (digits b)) let multiples n = [2 .. 6] |> Seq.map (fun x -> n * x) let check n = n |> multiples |> Seq.for_all (same_digits n) let answer = 1 |> Seq.unfold (fun i -> Some (i, i + 1)) |> Seq.first (fun x -> if check x then Some (x) else None)

There is also a well-known (and tricky) solution for this problem, which was presented in the book "*The man who calculated*". Did you know it?

## Brainfuck interpreter in F#

Brainfuck is an *esoteric programming language* whose grammar consists of only eight commands, written as a single character each.

Besides this minimalistic structure, the language is *Turing-complete* but writing (and reading) Brainfuck code is very hard.

The Brainfuck machine uses a finite-length tape (usually 30000 byte cells long) and two pointers: an instruction pointer and a data pointer.

The former scans the source code and executes one instruction at a time, while the latter is used to increase or decrease the value of the cell that it is pointing.

This structure can be represented by the following F# type, called *BFState*:

type BFState = { mutable program : string ; // program being interpreted mutable memory : int[] ; // memory mutable pc : int ; // current program counter mutable pos : int } // current pointer position

We define then a function to initialize our Brainfuck machine given the size of the memory tape and the source code, that basically zeroes all memory cells and the two pointers:

let initState memSize code = { program = code; memory = Array.zero_create memSize; pc = 0; pos = 0; }

We reach the end of the program when the program counter steps beyond the end of the code (i.e. the length of the string containing the source):

let isEnd (state : BFState) = state.pc >= state.program.Length

The following two functions are used to move the program counter one step towards the end or the beginning of the source code:

let nextCommand (state : BFState) = { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos } let previousCommand (state : BFState) = { program = state.program ; memory = state.memory ; pc = state.pc - 1 ; pos = state.pos }

*getMem* and *setMem* are two auxiliary functions that can be used to retrieve or set the value of the memory cell pointed by the data pointer:

let getMem (state : BFState) = state.memory.[state.pos] let setMem (state : BFState) value = state.memory.[state.pos] <- value state.memory

Similarly, we have two functions to read and write from the console, which is the behavior of the ',' and '.' commands respectively:

let outputByte (state : BFState) = Console.Write (Convert.ToChar(state.memory.[state.pos])) nextCommand state let readByte (state : BFState) = state.memory.[state.pos] <- Console.Read() nextCommand state

When the command to perform is '[', if the byte at the data pointer is zero we have to jump to the first matching ']' character following the current position:

let rec moveToForwardMatch (state : BFState) = match state.program.[state.pc] with | ']' -> (nextCommand state) | _ -> moveToForwardMatch (nextCommand state)

The complementary command is ']', that goes back to the matching '[' character when the byte at the data pointer is non-zero:

let rec moveToPreviousMatch (state : BFState) = match state.program.[state.pc] with | '[' -> (nextCommand state) | _ -> moveToPreviousMatch (previousCommand state)

We have now all the tools required to parse Brainfuck code, so we only need to write the logic to select the action to perform according to the command analyzed in a single step.

Each command takes a *BFState* and returns a new *BFState*. Any character different from the eight allowed will be simply ignored:

let step (state : BFState) = match state.program.[state.pc] with | '+' -> { program = state.program ; memory = setMem state (getMem state + 1) ; pc = state.pc + 1 ; pos = state.pos } | '-' -> { program = state.program ; memory = setMem state (getMem state - 1) ; pc = state.pc + 1 ; pos = state.pos } | '<' -> { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos - 1} | '>' -> { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos + 1} | '[' -> if (state.memory.[state.pos] = 0) then moveToForwardMatch state else nextCommand state | ']' -> if (state.memory.[state.pos] <> 0) then moveToPreviousMatch state else nextCommand state | '.' -> outputByte state | ',' -> readByte state | _ -> nextCommand state // ignore any non-command character

The parser, which could be the only publicly available function, will take care of initializing the state and recursively step through the code, one instruction at a time, until the end of the source:

let parse memSize code = let rec run (state : BFState) = if (not (isEnd state)) then run (step state) initState memSize code |> run

You can find a repository of Brainfuck sample programs at this site: http://esoteric.sange.fi/brainfuck/bf-source/prog/, the following code contains the classic *Hello World!* and how to parse it using the interpreter we just wrote:

let code = ">+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+." parse 30000 code

The whole Brainfuck interpreter code can be found just below, feel free to take it and dissect it:

#light open System type BFState = { mutable program : string ; // program being interpreted mutable memory : int[] ; // memory mutable pc : int ; // current program counter mutable pos : int } // current pointer position let initState memSize code = { program = code; memory = Array.zero_create memSize; pc = 0; pos = 0; } let isEnd (state : BFState) = state.pc >= state.program.Length let nextCommand (state : BFState) = { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos } let previousCommand (state : BFState) = { program = state.program ; memory = state.memory ; pc = state.pc - 1 ; pos = state.pos } let getMem (state : BFState) = state.memory.[state.pos] let setMem (state : BFState) value = state.memory.[state.pos] <- value state.memory let rec moveToForwardMatch (state : BFState) = match state.program.[state.pc] with | ']' -> (nextCommand state) | _ -> moveToForwardMatch (nextCommand state) let rec moveToPreviousMatch (state : BFState) = match state.program.[state.pc] with | '[' -> (nextCommand state) | _ -> moveToPreviousMatch (previousCommand state) let outputByte (state : BFState) = Console.Write (Convert.ToChar(state.memory.[state.pos])) nextCommand state let readByte (state : BFState) = state.memory.[state.pos] <- Console.Read() nextCommand state let step (state : BFState) = match state.program.[state.pc] with | '+' -> { program = state.program ; memory = setMem state (getMem state + 1) ; pc = state.pc + 1 ; pos = state.pos } | '-' -> { program = state.program ; memory = setMem state (getMem state - 1) ; pc = state.pc + 1 ; pos = state.pos } | '<' -> { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos - 1} | '>' -> { program = state.program ; memory = state.memory ; pc = state.pc + 1 ; pos = state.pos + 1} | '[' -> if (state.memory.[state.pos] = 0) then moveToForwardMatch state else nextCommand state | ']' -> if (state.memory.[state.pos] <> 0) then moveToPreviousMatch state else nextCommand state | '.' -> outputByte state | ',' -> readByte state | _ -> nextCommand state // ignore any non-command character let parse memSize code = let rec run (state : BFState) = if (not (isEnd state)) then run (step state) initState memSize code |> run

## Facebook FizzBuzz

Have you ever tried looking for "**FizzBuzz**" on a search engine?

If you do, you'll surely land on this page on Coding Horror, the blog written by **Jeff Atwood**.

To make a long story short, Jeff states that the vast majority of the developers is unable to write a tiny program that should take no more than 10 minutes to code.

This is the *famous* FizzBuzz problem:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

You should be outraged now, if you aren't I hope you don't make a living as a developer!

I found out that Facebook also considers FizzBuzz as a good starting test for job applicants. There is a programming puzzles page, where you can find a set of good problems, ranging from blindingly easy to very hard.

The first one is called HoppityHop! and it is just a variation of the good old FizzBuzz:

The program should iterate over all integers (inclusive) from 1 to the number expressed by the input file. For example, if the file contained the number 10, the submission should iterate over 1 through 10. At each integer value in this range, the program may possibly (based upon the following rules) output a single string terminating with a newline.

* For integers that are evenly divisible by three, output the exact string Hoppity, followed by a newline.

* For integers that are evenly divisible by five, output the exact string Hophop, followed by a newline.

* For integers that are evenly divisble by both three and five, do not do any of the above, but instead output the exact string Hop, followed by a newline.

Being a developer myself I couldn't resist writing a solution for this problem, obviously in F#:

#light let HoppityHop n = let printHop x = match x with | x when (x % 15 = 0)-> printfn "Hop" | x when (x % 3 = 0) -> printfn "Hoppity" | x when (x % 5 = 0) -> printfn "Hophop" | _ -> () [1 .. n] |> Seq.iter printHop

I know many of you will feel the urge to suggest a better solution, you're welcome, the comment area is yours to use!

## Random testing in F# with FsCheck

One of the emerging trends in software development is **TDD** or **Test-Driven Development**, a methodology based on writing tests first and then coding in order to pass the tests.

Besides unit testing libraries inherited by the .Net Framework, F# can now count on FsCheck, a random testing framework cloned from Haskell's QuickCheck.

A random testing library generates a set of test cases and attempts to falsify the properties defined by the developer.

This approach is not exhaustive but if all the tests of large enough test suite are passed we can safely assume that the code is correct.

Let's see how to get started using FsCheck to validate the RSA implementation written some time ago.

The first step is to download the latest release (currently 0.3) of FsCheck from the Download page.

You can either get the binaries or the source code, that also contains an example console application.

Assuming that you downloaded the binaries, you have then to open/create an F# application and add a Reference (*Project - Add Reference*) to the FsCheck.dll library file:

In order to use the methods provided by FsCheck we have to include its namescope into our code by adding a *open FsCheck* clause at the beginning of our program.

For the sake of example, let's fix the two distinct random prime numbers *p* and *q* and the public exponent *e*.

We have then to define our first property, that I'm going to call *prop_rsa*, which basically asserts that decrypting a message that was previously encrypted we get back the original message.

In order to run a property *prop_myproperty* we just have to run *quickCheck myproperty*, so in this case we'll run *quickCheck prop_rsa*.

#light open System open FsCheck // RSA sample data let p = 61 let q = 53 let e = 17 let n = p * q let d = private_exponent e p q let prop_rsa message = let encrypted = encrypt message e n decrypt encrypted d n = message quickCheck prop_rsa Console.ReadKey() |> ignore

If everything goes well, we should see a console window like the following one, with the number of tests passed (by default, FsCheck generates 100 test cases).

If a test fails, FsCheck will stop the execution and show which test failed. If you want to see all the generated test cases, you can run *verboseCheck* instead of *quickCheck*.

The next step should be writing a custom generator in order to generate not only the message but also the prime numbers and the public exponent, but I think we can cover that in a future post.

You can **download the complete solution** here.

## Project Euler in F# – Problem 55

As in the last post, in the description of Project Euler problem 55 there is a detailed step-by-step solution of the problem itself and we just have to write the F# code to perform the tasks.

The only thing that I really had to understand is the definition of *Lychrel numbers*:

If we take 47, reverse and add, 47 + 74 = 121, which is palindromic.

Not all numbers produce palindromes so quickly. For example,

349 + 943 = 1292,

1292 + 2921 = 4213

4213 + 3124 = 7337That is, 349 took three iterations to arrive at a palindrome.

Although no one has proved it yet, it is thought that some numbers, like 196, never produce a palindrome. A number that never forms a palindrome through the reverse and add process is called a Lychrel number. Due to the theoretical nature of these numbers, and for the purpose of this problem, we shall assume that a number is Lychrel until proven otherwise. In addition you are given that for every number below ten-thousand, it will either (i) become a palindrome in less than fifty iterations, or, (ii) no one, with all the computing power that exists, has managed so far to map it to a palindrome. In fact, 10677 is the first number to be shown to require over fifty iterations before producing a palindrome: 4668731596684224866951378664 (53 iterations, 28-digits).

Surprisingly, there are palindromic numbers that are themselves Lychrel numbers; the first example is 4994.

How many Lychrel numbers are there below ten-thousand?

As usual, it is suggested to break down the problem into little pieces that can be easily solved with a few lines of code each.

First, we need a function to reverse an integer and another one to check whether a number is palindromic, i.e. it is equal to its reverse.

Then we define a function that will be used during each iteration to take a number and sum it with its reverse. This new number is the one that will be checked next for being a palindrome inside the *isLychrel* function.

If we found a palindrome then the number under testing is not a *Lychrel*, otherwise we can start the next iteration up to a maximum of 50 times. If we hit this limit then we can conclude the number is a *Lychrel*.

We have now all the pieces required to answer the original question. Just take all numbers smaller then 10000, filter the *Lychrel* ones and count them:

#light let reverse n = n.ToString().ToCharArray() |> Array.rev |> Array.map (fun x -> x.ToString()) |> Array.reduce_left (^) |> bigint.Parse let isPalindromic n = n = reverse n let reverseAndAdd n = n + reverse n let isLychrel n = let rec isLychrelIter n i = match i with | 50 -> true | _ -> let r = reverseAndAdd n if isPalindromic r then false else isLychrelIter r (i+1) isLychrelIter n 1 let answer = [1I .. 10000I] |> Seq.filter (fun x -> isLychrel x) |> Seq.length