Claudio Cherubino's blog Life of a Googler

26Aug/091

Functional programming interview question

I think that examining the hiring process of a company you can understand a lot of what would be working there.

As Joel Spolsky wrote, you should only hire people who are Smart and Get Things Done and a good way to be sure that a candidate belongs to this category is testing his/her skills with a good programming exercise, one easy enough to be solved in 15 minutes but that requires the use of brain.

Starling Software clearly describes its interview process on a page of its website and proposes a couple of sample programs for the potential applicants. In the first problem you are asked to use Haskell to process a given file (obviously we will use F#):

The file input.txt contains lists of words, one per line, in two categories, NUMBERS and ANIMALS. A line containing just a category name indicates that the words on the following lines, until the next category name, belong to that category. Read this file as input (on stdin) and print out a) a sorted list of the unique animal names encountered, and b) a list of the number words encountered, along with the count of each. Feel free to chose your output format.

The algorithm to solve the exercise is easy: we read the file line by line and remember the current category (NUMBERS or ANIMALS) in order to add the next words to the appropriate list. When the file is over, we filter duplicates and sort the list of animals and group the numbers together with their counts.

The only problem is knowing how to manage the concept of state of the application in a functional way. In the imperative paradigm you define a variable to keep the state and change its value when you find a new category in the input file. In functional programming you don't use state variables instead you use function parameters and recursive calls:

open System
open System.IO

let animals_and_number filename =
  let rec process_line lines category animals numbers =
    match lines with
    | [] -> (animals, numbers)
    | x::xs -> match x with
               | "NUMBERS" -> process_line xs "NUMBERS" animals numbers
               | "ANIMALS" -> process_line xs "ANIMALS" animals numbers
               | x -> match category with
                      | "NUMBERS" -> process_line xs category animals (x :: numbers)
                      | "ANIMALS" -> process_line xs category (x :: animals) numbers
                      | _ -> process_line xs category animals numbers
  let all_lines = File.ReadAllLines(filename) |> Seq.to_list
  process_line all_lines "" [] []

let filename = "input.txt"
let (animals, numbers) = animals_and_number filename
let sorted_animals = animals |> Seq.distinct |> Seq.sort |> Seq.to_list
let counted_words = numbers |> Seq.countBy (fun x -> x) |> Seq.to_list

printf "Animals: %A\n" sorted_animals
printf "Numbers: %A" counted_words

The recursive process_line function has four parameters: the list of lines to be processed, the current category (initially an empty string) and the two lists of animals and numbers found so far.

For each new line we first check if it represents one of the categories. In this case we have to change state, i.e. discard the element and recursively call the same function with the correct category parameter.

If the element processed is not a category we only have to add it to the animals or number list, according to the value of the category parameter.

At the end of the animals_and_number function (when lines is empty) we return a tuple made of the two lists created.
The rest of the job is calling some standard library functions to filter duplicates, sort and count the elements of the sequences.

28Dec/080

Random testing in F# with FsCheck

One of the emerging trends in software development is TDD or Test-Driven Development, a methodology based on writing tests first and then coding in order to pass the tests.

Besides unit testing libraries inherited by the .Net Framework, F# can now count on FsCheck, a random testing framework cloned from Haskell's QuickCheck.

A random testing library generates a set of test cases and attempts to falsify the properties defined by the developer.

This approach is not exhaustive but if all the tests of large enough test suite are passed we can safely assume that the code is correct.

Let's see how to get started using FsCheck to validate the RSA implementation written some time ago.

The first step is to download the latest release (currently 0.3) of FsCheck from the Download page.

You can either get the binaries or the source code, that also contains an example console application.

Assuming that you downloaded the binaries, you have then to open/create an F# application and add a Reference (Project - Add Reference) to the FsCheck.dll library file:

Project referencing FsCheck library

Project referencing FsCheck library

In order to use the methods provided by FsCheck we have to include its namescope into our code by adding a open FsCheck clause at the beginning of our program.

For the sake of example, let's fix the two distinct random prime numbers p and q and the public exponent e.

We have then to define our first property, that I'm going to call prop_rsa, which basically asserts that decrypting a message that was previously encrypted we get back the original message.

In order to run a property prop_myproperty we just have to run quickCheck myproperty, so in this case we'll run quickCheck prop_rsa.

#light
open System
open FsCheck

// RSA sample data
let p = 61
let q = 53
let e = 17
let n = p * q
let d = private_exponent e p q

let prop_rsa message =
  let encrypted = encrypt message e n
  decrypt encrypted d n = message

quickCheck prop_rsa

Console.ReadKey() |> ignore

If everything goes well, we should see a console window like the following one, with the number of tests passed (by default, FsCheck generates 100 test cases).

FsCheck shows the number of passed tests

FsCheck shows the number of passed tests

If a test fails, FsCheck will stop the execution and show which test failed. If you want to see all the generated test cases, you can run verboseCheck instead of quickCheck.

The next step should be writing a custom generator in order to generate not only the message but also the prime numbers and the public exponent, but I think we can cover that in a future post.

You can download the complete solution here.

23Apr/087

Remove duplicate values from a List in F#

Once in a while I give a look at the web server log to find out how people landed in this blog and yesterday there was someone that entered the following search string into Google:

Given a random List of integers, write a function that removes all the duplicate values

This site appears as the first result, even if there is no solution for that problem, but I want to help you anyway, unknown visitor!

In fact, we can make use of the Set data type to easily solve the problem.

According to the definition, a Set represents a collection of elements without duplicates.
So, if we create a Set from the given list, all duplicate values will be automatically filtered.

This can be done in a single line, I'll call the function nub because this is the name of the corresponding Haskell one:

let nub xs = xs |> Set.of_list |> Set.to_list

As I said, to return a List without duplicate values we just need to create a Set from it and then create a List from the new Set.

Do you think it is fair to act this way or should I say that I'm cheating? :)

7Feb/086

Project Euler in F# – Problem 5

The exercise we are going to face today is ranked as one of the easiest of Project Euler's, so I don't think you will have any problem in solving it.

However, it allows me to discuss a little bit on algorithm optimization and how to refine a working solution to obtain a better one.

The text of the exercise says:

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?

Some programming languages (for instance Haskell) provide a native function for evaluating the least common multiple (LCM) of two numbers, i.e. the smallest positive integer that is multiple of both numbers.

In F# we don't have this function, hence we have to write our own code to perform this calculation, and we need it to be able to find the LCM of a set of numbers and not only two.

Let's see the implementation of my algorithm:

#light
open System

let nums = 1 |> Seq.unfold (fun i -> Some (i, i + 1))

let divisibleByList list n =
    Seq.for_all (fun x -> n % x = 0) list

let answer = nums |> Seq.first (fun i -> if divisibleByList [1 .. 20] i then Some (i) else None)

My solution will simply take the integers starting from 1 and check if they are evenly divisible by all numbers from 1 to 20.

We don't know when to stop, so at line 4 we have to create an infinite list of integers.
To do so, we use the unfold function, which allows us to create a lazy list by passing another function to be repeatedly evaluated to provide the elements of the list.

In our case we start from 1 and for each iteration we add one to the previous value, without an upper limit.

At line 6 we have the definition of the divisibleByList function, which takes a list of numbers and an integer n and checks if n is evenly divisible by all the elements of the given list.

This is done by using Seq.for_all, which tests if all the elements of the sequence satisfy the given predicate.

We have now everything we need to find our solution.
Let's just feed the divisibleByList function with the elements of the infinite list of integers and stop when we find the first that satisfy our conditions, thanks to Seq.first.

The solution is complete, but it is not very fast, since it takes about 8 minutes to run in my computer.

We can improve this time by using some very simple math.
First of all, we are sure that the number we are looking for will be a multiple of 2, so we can skip all odd numbers, by changing line 4 with:

let nums = 2 |> Seq.unfold (fun i -> Some (i, i + 2))

With this simple change the execution time drops to 4 minutes, as the amount of integers to check is is actually halved.

I want to go a little further and instead of checking all even numbers we can just test the multiples of 20.
To do so, we change again line 4 with this new one:

let nums = 20 |> Seq.unfold (fun i -> Some (i, i + 20))

Now I can be proud of myself as the execution time is reduced to 25 seconds, twenty times less than the original solution!

Obviously this is not the only solution and there are smarter approaches, maybe we can see them in a future post.

18Jan/080

fsharp.it, il nuovo portale su F#

Qualche tempo fa anticipavo la mia intenzione di avviare un nuovo blog più tecnico e adesso quel momento è arrivato!

Ho aperto un portale tematico dedicato ad un tema abbastanza specialistico nel mondo dell'informatica ma che sta acquisendo sempre più popolarità.

Si tratta della programmazione funzionale ed in particolare di F#, il linguaggio basato su questo paradigma sviluppato da Microsoft per contrastare le alternative più diffuse, principalmente Erlang e Haskell.

So che per molti di voi quello che sto scrivendo non ha assolutamente senso, ma se vi occupate di informatica vi consiglio almeno di dare un'occhiata a questo nuovo mondo, magari partendo proprio dagli articoli del portale.

L'indirizzo è www.fsharp.it ed i contenuti sono scritti in lingua inglese, ma questo non può essere un freno per chi lavora nel settore.

Dategli un'occhiata e poi magari lasciate un commento, chissà che un giorno non mi ringrazierete! :wink:

15Jan/080

let title = "Hello World"

When I was attending the first year of my university course, a teacher of mine used Haskell to teach us the basics of software development.

It was amazing, functional programming makes you think differently about programming.

In the functional paradigm functions are used in their real mathematical sense.
Hence, they are only computation objects and there is no information about state or mutable data.

Functional programming languages exist since more than 50 years ago, LISP is one of them, but they have never been seriously adopted outside the academia.

Now, the computer science world is gradually moving toward functional programming.

There is a lot of hype surrounding Erlang, a functional language originally developed by Ericsson, and in 2007 Microsoft presented F#, which is a multi-paradigm language targeting .Net and largely based on OCaml.

In this blog I'll write down my progresses in learning this language and I hope that you can profit from my experience.

I'll come back to you soon with the first article, if you want in the meanwhile you can start from the links on the right side of the page.

Bye!