Balanced parenthesis
Checking whether a series of parenthesis is balanced and valid is a classic puzzle in computer science and can be easily met during a job interview.
In my opinion, there is no better solution than using a regular expression and solve the problem with a single (but almost unreadable) line of code, however I'd like to describe an algorithm which can be easily implemented in any functional or imperative programming language.
The idea is simple: start from the first character and check if it is an opening parenthesis. In this case we have to push it on a LIFO data structure such as a stack and continue with the next character.
If the character is a closing parenthesis, we take (pop) the top element from the stack and check if it makes a valid couple of brackets with the examined element.
If this is not the case, the series is not balanced, otherwise we can discard both parenthesis and go on with the next character, until the series is over.
When there are no parenthesis left to examine, if the stack is empty our series is valid; if there are still elements on the stack we can conclude that the series is unbalanced.
Lets give a look at the F# implementation I wrote to answer Dev102's challenge:
#light
let brackets = [('(',')'); ('[',']'); ('{','}'); ('<','>')]
let is_opening bracket = List.exists ( fun (a,b) -> a = bracket) brackets
let is_closing bracket = List.exists ( fun (a,b) -> b = bracket) brackets
let is_valid (opening, closing) = List.exists (fun (a,b) -> (a,b) = (opening, closing)) brackets
let rec check_valid stack char_array =
match char_array with
| [] -> stack = List.Empty
| c::cs when is_opening c -> check_valid (c::stack) cs
| c::cs ->
match stack with
| [] -> false
| x::xs when is_valid (x, c) -> check_valid xs cs
| x::xs -> false
At the beginning I defined the list of valid couples and three functions to check respectively if a char is an opening or closing bracket or if a couple is valid.
After that there is a function implementing the algorithm explained above. The stack is just a normal list, where items are always added and removed from the front.
The missing number
There is an interesting series of programming job interview challenges proposed by Dev102.com, which is now at its tenth puzzle:
This week question is pretty easy. Your input is an unsorted list of n numbers ranging from 1 to n+1, all of the numbers are unique, meaning that a number can’t appear twice in that list. According to those rules, one of the numbers is missing and you are asked to provide the most efficient method to find that missing number. Notice that you can’t allocate another helper list, the amount of memory that you are allowed to allocate is O(1). Don’t forget to mention the complexity of your algorithm…
I'm not sure I understood correctly the constraint related to the memory allocation.
In my opinion, when they say we are limited to O(1), they mean that we can only allocate a single numeric variable and not any other data structure.
According to this interpretation, the solution is quite easy.
First of all, we take our only variable and store into it the sum of all the numbers between 1 and n + 1, which can be easily computed remembering that 1 + 2 + ... + n = n (n + 1) / 2.
Then, we subtract each element of the array from this value, eventually the result is actually our missing number.
The functional implementation of this imperative algorithm is straightforward:
#light let sum n = ((n + 1) * (n + 2)) / 2 let answer nums = (nums |> List.length |> sum) - (List.reduce_left (+) nums)
Let's talk about the complexity of the algorithm.
If the length of the input list list is known, evaluating the sum of the first n natural numbers takes O(1), otherwise we have to scan the entire list, i.e. O(n).
Then we have to subtract each element of the list, and this takes another iteration, so another O(n).
Therefore we have O(n) + O(n), which is definitely O(n), and can be easily optimized a little by scanning the list only once.
The only problem left is: "am I following the instructions"?
Google Interview Question: Product of other Elements in an Array in O(n)
Last time I was interviewed for a software development engineer position, the recruiter asked me some of the classical Microsoft interview questions, such as "How Would You Move Mount Fuji?" or "How many gas station are there in your country?".
It was the first time for me to be asked such questions but having obtained the job I think my answers were good enough.
After that day, I looked for other well-known interview questions and I discovered that Google has a totally different approach, focusing on algorithms, data structures and complexity.
For instance, one of Google interview questions says:
There is an array A[N] of N integers. You have to compose an array Output[N+1] such that Output[i] will be equal to the product of all the elements of A[] except A[i].
Example:
INPUT:[4, 3, 2, 1, 2]
OUTPUT:[12, 16, 24, 48, 24]Solve it without division operator and in O(n).
Without the two constraints at the end of the puzzle it would have been straightforward, but now we have to be smart.
Actually, the product of all elements of A[] except A[i] is equal to the product of all elements before A[i] and those after A[i].
We can traverse A[] twice, once from left to right and once in the opposite way and multiply all the elements we find before A[i].
We'll pretend to have a new array called Output[] to store the output of the first pass, assigning Output[i] the product of all elements preceding A[i]:
let rec firstpass product input =
match input with
| [] -> []
| x::xs -> product :: firstpass (product * x) xs
For the second pass we need to move from right to left, but this can be done by reversing the input arrays and moving as usual:
let secondpass product input arr =
let rev_input = List.rev input
let rev_arr = List.rev arr
let rec rev_secondpass product (input:list<int>) arr =
match arr with
| [] -> []
| x::xs -> rev_secondpass (product * input.Head) input.Tail xs @ [(x * product)]
rev_secondpass product rev_input rev_arr
Both firstpass and secondpass expect an integer argument called product, which will be always be 1 at the beginning and will be used to store the intermediate products during the recursive calls.
With these functions we can just define an input array and apply them to get the result.
The following is the complete F# code:
#light
let input = [ 4; 3; 2; 1; 2 ]
let answer values =
let rec firstpass product input =
match input with
| [] -> []
| x::xs -> product :: firstpass (product * x) xs
let secondpass product input arr =
let rev_input = List.rev input
let rev_arr = List.rev arr
let rec rev_secondpass product (input:list<int>) arr =
match arr with
| [] -> []
| x::xs -> rev_secondpass (product * input.Head) input.Tail xs @ [(x * product)]
rev_secondpass product rev_input rev_arr
values |> firstpass 1 |> secondpass 1 values
Google Treasure Hunt 2008, second puzzle in F#
Unlike the first puzzle, which required some maths knowledge, in the second one we have to prove to be able to recursively process the filesystem.
I think that this puzzle should be quite easy to solve for the system administrators and in general those used to scripting languages.
Here is the text of the second Google Treasure Hunt 2008 problem:
Here is a random zip archive for you to download:
GoogleTreasureHunt08_1767743498252291641.zipUnzip the archive, then process the resulting files to obtain a numeric result. You'll be taking the sum of lines from files matching a certain description, and multiplying those sums together to obtain a final result. Note that files have many different extensions, like '.pdf' and '.js', but all are plain text files containing a small number of lines of text.
Sum of line 4 for all files with path or name containing jkl and ending in .txt
Sum of line 1 for all files with path or name containing zzz and ending in .xml
Hint: If the requested line does not exist, do not increment the sum.Multiply all the above sums together and enter the product below.
As usual, the puzzle may seem challenging at a first glance, so it is important to break it into smaller (and simpler) pieces.
First of all, we have to get the list of files ending with a given extension and containing a specific substring in their name.
There is an overloaded version of the Directory.GetFiles() function which takes two parameters, the directory to be searched and a pattern to be found in the files.
Then we have to filter the results checking if the complete filenames (path + filename) contain the given substring.
This first sub-problem can be solved with the following F# code:
#light
open System.IO
let file_list dir pattern content =
let rec allFiles dir pattern =
seq
{ for file in Directory.GetFiles(dir, pattern) do
yield file
for subdir in Directory.GetDirectories(dir) do
for file in allFiles subdir pattern do
yield file }
let elems = allFiles dir pattern
let filt (str : string) = str.Contains(content)
Seq.filter filt elems
Now we have the list of all files that satisfy the requirements of the exercise.
The next step is to open them (they are plain text files, despite their extensions) and take the numeric value from the right line, if present.
Let's write a function to accomplish this task for a single file and then map it to the entire list:
let getvalue numline filename =
let lines = File.ReadAllLines(filename)
let linevalue (lines : string[]) numline =
let realindex = numline - 1
if lines.Length > realindex then
System.Int32.Parse(lines.[realindex])
else
0
linevalue lines numline
The File.ReadAllLines(filename) function returns an array of strings, one for each line of the source file.
Arrays in F# have indexes starting from 0, so we have to subtract 1 from the line number supplied by the user to get the real index.
We also have to be sure that the searched line exists. In this case we return its content, otherwise we return zero.
We have now all the elements we need to compute one of the numbers requested by the exercise, we should only supply the appropriate parameters in this way:
let treasurehunt2 dir pattern content numline = file_list dir pattern content |> Seq.map (getvalue numline) |> Seq.fold (+) 0
For instance, to solve the first line of my puzzle we have to run the following command, assuming that the archive was unzipped in C:\google:
treasurehunt2 "C:\google" "*.txt" "jkl" 5
I hope you don't need my help to run this command twice with different parameters and multiply the results to get your answer to the quiz!
Google Treasure Hunt 2008, first puzzle in F#
A couple of days ago Google launched a contest called Treasure Hunt 2008, which will be composed of 4 puzzles.
Only the first of these quizzes is already available at http://treasurehunt.appspot.com/, and the new ones will be posted once per week.
At the end of the four puzzles there will be some prizes, even if nothing has been revealed yet.
Here is the description of the first puzzle:
A robot is located at the top-left corner of a n x k grid.
The robot can only move either down or right at any point in time. The robot is trying to reach the bottom-right corner of the grid.
How many possible unique paths are there?
(Note: Answer must be an exact, decimal representation of the number. Image not to scale.)
If the number of rows is r and the number of columns c, the robot can move r-1 steps down and c-1 steps to the right to get to the bottom-right corner of the grid.
The number we want to find is the combination of r1 and c1, and this can be easily computed using the binomial coeficient:
![]()
Let's assume r1 = r-1 and c1 = c-1, and write some F# code to get the solution to our problem:
#light open Microsoft.FSharp.Math.BigInt let robot rows columns = let r1 = rows - 1I let c1 = columns - 1I (factorial (r1 + c1)) / ((factorial r1) * (factorial c1))
This first puzzle was very easy to solve, if you go to the Treasure Hunt website now, you'll find the next exercise.
I'll give it a try, if you are interested just drop a comment and we'll discuss it here.
Even Digits Multiple Of Nine
The mathematical exercise we are going to solve comes from mathschallenge.net and can be summarized in a single sentence:
Find the smallest multiple of nine containing only even digits.
We just need a function to check if a number contains only even digits and then pass to it the multiples of nine.
This algorithm can be translated into the following code:
#light
open System
let isEven n = (n % 2 = 0)
let rec allEvenDigits n =
match n with
| n when n < 10 -> isEven n
| n -> (isEven (n % 10)) && (allEvenDigits (Int32.div n 10))
let nums = 9 |> Seq.unfold (fun i -> Some (i, i + 9))
let answer = Seq.find allEvenDigits nums
The isEven function is just a "shortcut" for the modulus operator and it is used inside allEvenDigits, where it is applied to each digit of the given number.
Then, we have to generate the multiples of nine, but we don't know when to stop, so we need an infinite sequence, that can be produced by the Seq.unfold function.
At the end we use Seq.find, which takes a boolean function (allEvenDigits) and a sequence (nums), and returns the first element of the sequence for which the given function returns "true", i.e. the smallest multiple of nine containing only even digits, that is exactly what we had to find.
Decoding RLE in F#
Encoding techniques are useless if you don't have a way to restore the original message from the compressed data.
A couple of days ago, we saw how to implement the Run-Length Encoding (RLE) algorithm to encode an array of elements and now we are going to describe how to restore the original array.
To decode an array encoded with RLE we take the (length, element) couples, replicate each element by its length and then concatenate the output.
Let's see the F# implementation:
#light
let decode src =
let rec decode_block (num, elem) =
match num with
| 0 -> []
| _ -> elem :: decode_block (num-1, elem)
in List.concat (List.map decode_block src)
At line 8 we concatenate the output of the decode_block function applied to each couple of the src array.
Decode_block is a recursive inner function, which is very easy to understand. In fact, it just repeats elem the number of times indicated by num.
This is everything you need to know to encode and decode arrays (or anything else) by using the RLE algorithm. There are, of course, better compression techniques, if you want we can try to look at them in the future.
Run-Length Encoding in F#
According to Wikipedia:
Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs: for example, relatively simple graphic images such as icons, line drawings, and animations.
Run-length encoding is used in fax machines. It is relatively efficient because most faxed documents are mostly white space, with occasional interruptions of black.
We will implement now the RLE algorithm in F#, writing a function that will take a list as argument and return a new list of couples (N,E), where N is the number of duplicates of the element E.
Let's show the code:
#light
let pack src =
let rec pack2 src tmp out =
match src with
| [] ->
if (tmp = [])
then out
else (List.append out [tmp])
| h::t ->
if (tmp = [])
then (pack2 t [h] out)
else
if (h = (List.hd tmp))
then (pack2 t (h::tmp) out)
else (pack2 t [h] (List.append out [tmp]))
in (pack2 src [] [])
let encode src = List.map (fun b -> (List.length b, List.hd b)) (pack src)
let source = ['a'; 'a'; 'a'; 'a'; 'b'; 'c'; 'c'; 'a'; 'a'; 'd'; 'e'; 'e'; 'e'; 'e']
let answer = source |> encode
First of all we need to pack consecutive duplicates of list elements into sublists, and this is done by the pack function.
Inside pack we define an inner function, called pack2, which is not accessible from outside this block. This also allows us to transform each call like "pack src" to "pack2 src [] []", adding two more arguments to the call.
This inner function moves elements from src to tmp while they are duplicates, and when a different element appears, the list contained in tmp is appended to out.
This process continues until the source list is exhausted, then the out list contains the packed output.
If you want to follow this procedure step-by-step, in the following image you can see the content of src, tmp and out at the beginning of each iteration, using ['a'; 'a'; 'a'; 'a'; 'b'; 'c'; 'c'; 'a'; 'a'; 'd'; 'e'; 'e'; 'e'; 'e'] as input.

Now, the actual run-length encoding is trivial. We just have to pack the source list and then apply a function on each element that takes a list of duplicates and returns a couple (N,E), where N is the number of elements and E is the first element (they are all the same).
In our case the final result will be:
[(4, 'a'); (1, 'b'); (2, 'c'); (2, 'a'); (1, 'd'); (4, 'e')]
In the next article I'll present the decoding algorithm, even if I'm quite sure you can easily write it on your own.



