Claudio Cherubino's blog Life of a Googler

1Dec/090

A new beginning (and a new WordPress plugin)

It is never easy to write the first post of a blog, fortunately for me it is already the third time and actually the second for this very blog.
After three years of my first encounter with the blogging world I decided to move a step forward and reboot, but with much more experience by my side.

In this reborn blog I will focus on my passion, that is software development, and I want to start presenting the WordPress plugin that I wrote for this event.
It is called Front Page Exclude By Date and is a very simple plugin that I needed to hide my old posts from appearing on the front page without deleting them and losing all links pointing to them.

You can still read the old posts by searching for them or clicking on your bookmarks and now you can also find on this blog the series of posts on functional programming and the F# language taken from the FSharp.it website, which is no longer updated and merged with this blog.

In the next days I'll talk about my Ruby client library for Project Voldemort, stay connected and follow me on Twitter!

26Aug/091

Functional programming interview question

I think that examining the hiring process of a company you can understand a lot of what would be working there.

As Joel Spolsky wrote, you should only hire people who are Smart and Get Things Done and a good way to be sure that a candidate belongs to this category is testing his/her skills with a good programming exercise, one easy enough to be solved in 15 minutes but that requires the use of brain.

Starling Software clearly describes its interview process on a page of its website and proposes a couple of sample programs for the potential applicants. In the first problem you are asked to use Haskell to process a given file (obviously we will use F#):

The file input.txt contains lists of words, one per line, in two categories, NUMBERS and ANIMALS. A line containing just a category name indicates that the words on the following lines, until the next category name, belong to that category. Read this file as input (on stdin) and print out a) a sorted list of the unique animal names encountered, and b) a list of the number words encountered, along with the count of each. Feel free to chose your output format.

The algorithm to solve the exercise is easy: we read the file line by line and remember the current category (NUMBERS or ANIMALS) in order to add the next words to the appropriate list. When the file is over, we filter duplicates and sort the list of animals and group the numbers together with their counts.

The only problem is knowing how to manage the concept of state of the application in a functional way. In the imperative paradigm you define a variable to keep the state and change its value when you find a new category in the input file. In functional programming you don't use state variables instead you use function parameters and recursive calls:

open System
open System.IO

let animals_and_number filename =
  let rec process_line lines category animals numbers =
    match lines with
    | [] -> (animals, numbers)
    | x::xs -> match x with
               | "NUMBERS" -> process_line xs "NUMBERS" animals numbers
               | "ANIMALS" -> process_line xs "ANIMALS" animals numbers
               | x -> match category with
                      | "NUMBERS" -> process_line xs category animals (x :: numbers)
                      | "ANIMALS" -> process_line xs category (x :: animals) numbers
                      | _ -> process_line xs category animals numbers
  let all_lines = File.ReadAllLines(filename) |> Seq.to_list
  process_line all_lines "" [] []

let filename = "input.txt"
let (animals, numbers) = animals_and_number filename
let sorted_animals = animals |> Seq.distinct |> Seq.sort |> Seq.to_list
let counted_words = numbers |> Seq.countBy (fun x -> x) |> Seq.to_list

printf "Animals: %A\n" sorted_animals
printf "Numbers: %A" counted_words

The recursive process_line function has four parameters: the list of lines to be processed, the current category (initially an empty string) and the two lists of animals and numbers found so far.

For each new line we first check if it represents one of the categories. In this case we have to change state, i.e. discard the element and recursively call the same function with the correct category parameter.

If the element processed is not a category we only have to add it to the animals or number list, according to the value of the category parameter.

At the end of the animals_and_number function (when lines is empty) we return a tuple made of the two lists created.
The rest of the job is calling some standard library functions to filter duplicates, sort and count the elements of the sequences.

29Jan/080

Project Euler in F# – Problem 20

Project Euler's Problem 20 was trickier than I expected.

I'm quite sure that there is a smart solution, but there is also a "dumb" one and this is the one I'm going to present you.

The problem says:

n! means n × (n ? 1) × ... × 3 × 2 × 1

Find the sum of the digits in the number 100!

Evaluating the factorial is straightforward in any functional language, but it is even easier in F# since the language already provides us with the factorial function.

As usual, let's show the code and then analyze it:

#light
open Microsoft.FSharp.Math.BigInt

let rec digits n =
    match n with
    | n when n < 10I -> n
    | n ->
        let x, y = divmod n 10I
        y + digits x

let answer = digits (factorial 100I)

The open directive at line 2 permits the use of the BigInt namespace, which is needed when working with very large numbers.

Then we have the code to sum the digits of a number, which is easily implemented with pattern matching.

The algorithm is based on the modulus operator, since taking the last digit of a number is equivalent to taking the remainder of the number itself divided by 10 (in a base 10 system).

Hence, we take the last digit and recursively apply the same function to the rest of the number, until there is only one digit left, i.e. the number is less than 10.

This operation is done by the divmod function, which returns the integer quotient and the remainder as a couple (x, y).

You may have noticed that there is a capital "I" after each number. That letter stands for Big Integer and is used to distinguish them from normal integers.

It's done, we just have to apply the digits function to the output of factorial 100I.

Why did I call this solution "dumb"?

Because I had to compute 100! (which is a number with 158 digits) and then sum its digits, but this solution is not feasible if the number becomes much larger.

I think that there should be some mathematical trick to do the same job, but I don't know it.

Does anybody have a different approach to suggest?

25Jan/080

Project Euler in F# – Problem 9

Today's exercise is Project Euler Problem 9, that says:

A Pythagorean triplet is a set of three natural numbers, a < b < c, for which,
a² + b² = c²

For example, 3² + 4² = 9 + 16 = 25 = 5².

There exists exactly one Pythagorean triplet for which a + b + c = 1000.
Find the product abc.

Therefore this time we have to find a triplet (a, b, c), but a brute-force approach would require a billion combinations, since the three numbers can range from 1 to 1000.

However, we can reduce the number of tests exploiting a little algebra.

First of all, c = 1000 - a - b, so only two variables are in play, leading us to a million choices.

Then we go down to half a million, thanks to the constraint that a < b.

According to these premises, here is the F# code:

#light
let sq x = x * x

let ab =
    { for a in 1 .. 1000
      for b in a .. 1000 -> a,b } |> Seq.filter (fun (a,b) -> sq a + sq b = sq (1000 - a - b) ) |> Seq.hd

let answer =
    let a = fst ab
    let b = snd ab
    let c = 1000 - a - b
    a * b * c

At line 5 we enumerate all items of a collection created with list comprehension, i.e. the numbers between 1 and 1000.

Inside this loop we have another loop, but in this case we are interested in all numbers bigger than the current item of the outer collection, i.e. all numbers between a and 1000.

Then, we filter all couples (a, b) keeping only those for which we have: a² + b² = (1000 - a - b)².

According to the text of the exercise, there exists exactly one Pythagorean triplet that satisfies the equation a + b + c = 1000, so we can apply the Seq.hd (head) function to take the first (and only) element of the generated sequence.

At lines 9, 10 and 11 we evaluate a, b and c and finally we multiply them to get the answer.

18Jan/080

Project Euler in F# – Problem 1

I'm going through two parallel roads to learn F#: I read the "Foundations of F#" book written by Robert Pickering for the theoretical aspects and then I practice writing some code challenging myself with the problems proposed by Project Euler.

There are currently 177 mathematical exercises that can be ordered by difficulty, so I think that as soon as I solved them I'll master all aspects of F#.

Maybe my solutions are not the best at all, but I'll show them to you and I'll try to use them to explain F# programming while I'm still learning it.

Feel free to ask questions, comment or propose alternative solutions, I'll be glad to compare mine with yours.

So, let's start with the first exercise, which says:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

This is a very good introductory exercise and can be used to understand the power of the functional paradigm.

Here is the first version of my code, the explanation will be just below:

#light
let rec sum_mul xs =
  match xs with
  | []    -> 0
  | y::ys when y % 3 = 0 || y % 5 = 0 -> y + sum_mul ys
  | y::ys -> sum_mul ys

let sum = sum_mul [1 .. 999]
print_any sum

The #light clause at line 1 allows us to write code with a simpler syntax, without all the symbols and keywords, such as begin and end.

At line 2 we have the definition of a recursive function called sum_mul that accepts an argument called xs.

To define functions and literals (we don't call them variable, since their value never changes) you need to use the let keyword.

Then, in lines 3-6 we have an example of pattern matching, that is a common practice in functional programming similar to the switch statement of the imperative paradigm.

It compares xs with some patterns, which are written one per line, with a | (pipe) in front of each of them.
The first case is matched when xs is an empty list, and in that case sum_mul simply returns 0.

In the other cases xs is matched against a list, with the first element (the head of the list) being called y and the rest of the list (the tail) called ys.

If y is evenly divided by 3 or 5 (using the modulus operator: %) we add the number to our result and then recursively apply the same function to the tail.
Otherwise we just skip the head and call sum_mul on the tail.

At line 8 we actually call the sum_mul function passing a list as the argument. This list is composed by all integers between 1 and 999.

Line 9 is just used to print the output to screen.

If you want to run this code you can use Visual Studio with the F# plugin installed or simply run the F# interpreter, which is a command line tool called fsi.exe and is installed with the F# package.

I hope the explanation was clear and you started to appreciate the power of functional programming.
In the next article I'll show you an alternative (and much faster) solution to the same problem.

15Jan/080

let title = "Hello World"

When I was attending the first year of my university course, a teacher of mine used Haskell to teach us the basics of software development.

It was amazing, functional programming makes you think differently about programming.

In the functional paradigm functions are used in their real mathematical sense.
Hence, they are only computation objects and there is no information about state or mutable data.

Functional programming languages exist since more than 50 years ago, LISP is one of them, but they have never been seriously adopted outside the academia.

Now, the computer science world is gradually moving toward functional programming.

There is a lot of hype surrounding Erlang, a functional language originally developed by Ericsson, and in 2007 Microsoft presented F#, which is a multi-paradigm language targeting .Net and largely based on OCaml.

In this blog I'll write down my progresses in learning this language and I hope that you can profit from my experience.

I'll come back to you soon with the first article, if you want in the meanwhile you can start from the links on the right side of the page.

Bye!