Merging arrays
Thanks to interviewpattern.com I discovered that one of the classical Amazon interview questions is writing a snippet of code to merge two sorted arrays:
"Suppose we have two sorted arrays A[] of m elements and B[] of n elements. Write a function merge which would merge this two arrays into new sorted array C[] in O(n) time as shown on the picture".

This problem is also a classical exercise for functional programming learners that shows the conciseness of functional code in comparison with imperative one.
The solution presented on the original page is written in C# and is longer than 40 lines of code, while we can solve the same problem in F# with less than 10 lines:
let rec merge_arrays a b =
match (a, b) with
| (a, []) -> a
| ([], b) -> b
| (x::xs, y::ys) -> if (x < y) then
(x :: (merge_arrays xs (y::ys)))
else
(y :: (merge_arrays (x::xs) ys))
Besides being shorter, I also find the functional code to be much easier to understand. Do you agree with me?



June 13th, 2009 - 17:40
While not specifically stated in the problem statement, I’d expect that the result should be an array and the inputs are stated to be arrays, which adds a couple of lines to the solution:
June 13th, 2009 - 18:15
nice solution for haskell(because of its laziness) but not so good for F# as it is not tail recursive and would have stack overflow error when the list is up to certain size.
Either it needs to be transformed to a tail call format or use the Lazy library or the ‘yield’ construct.
June 13th, 2009 - 18:42
@James: arrays can be used instead of lists, only a few changes are required. Part of your code is missing, can you resend it?
@gary: a tail recursive version follows, as usual I find the original version easier to understand and therefore better suited for a didactic post.
let tail_merge_arrays a b = let rec merge c d acc = match (c, d) with | ([], []) -> acc | (c, []) -> acc @ c | ([], d) -> acc @ d | (x::xs, y::ys) -> if (x < y) then merge xs (y::ys) (acc @ [x]) else merge (x::xs) ys (acc @ [y]) merge a b []June 13th, 2009 - 19:16
that is the problem of FP. that nice and simple to reason solution has severe real world limitation.
I would suggest do a reverse(may be double reverse) and stick to ‘::’, as ‘@’ is a very expensive operation.
June 13th, 2009 - 21:01
Here’s an example that works on arrays, does not create a ton of garbage (space efficient), is tail-recursive, and still just 15 lines. (Now if I can just make it appear with the proper formatting.)
let MergeSortedArrays (a:array<_>) (b:array<_>) = let al, bl = a.Length, b.Length let r = Array.zeroCreate (al+bl) let ri = ref 0 let inline Yield x = r.[!ri] <- x; incr ri let rec Merge ai bi = if not(ai >= al && bi >= bl) then if bi >= bl || ai < al && a.[ai] < b.[bi] then Yield a.[ai] Merge (ai+1) bi else Yield b.[bi] Merge ai (bi+1) Merge 0 0 r // some tests printfn "%A" ([|1..6|] = MergeSortedArrays [|1..4|] [|5;6|]) printfn "%A" ([|1..6|] = MergeSortedArrays [|5;6|] [|1..4|]) let N = 40000 let a1 = Array.init N (fun x -> 2 * x + 1) // 1, 3, 5, ... let a2 = Array.init N (fun x -> 2 * x) // 0, 2, 4, ... let r = MergeSortedArrays a1 a2 let expected = [| 0..2*N-1 |] printfn "%A" (r = expected)June 13th, 2009 - 22:56
In other words, ‘@’ is not O(n)… Not that we’re trying to pick this apart, mind you
[sorry]
June 14th, 2009 - 02:43
@Gary
“that is the problem of FP.”
Quite the contrary. This is a problem of non-FP. Specifically, the lack of controlled side-effects introduces the need for strict evaluation, which is the problem. If F# were more functional, then the problem of evaluation could be eschewed more easily.
June 14th, 2009 - 03:05
This code uses active patterns to approximate the list based implementation and is tail recursive:
how do you post code nicely here?
June 14th, 2009 - 15:36
To format F# code nicely you just have to wrap it between ‘[ fsharp]‘ and ‘[ /fsharp]‘ (without quotes and spaces after the opening bracket).
I added the tags in your comments for you.
June 14th, 2009 - 18:03
This is based on claudio’s second verion but not using ‘@’. It does unfortunately create lots of intermediate garbage(characteristic of FP, the original Haskell friendly solution is very time and space efficient in Haskell because of the laziness) and cheat a bit using the List.rev rather than writing my own(which would add about 10 lines of code) and because of the use of ‘reverse’, I am not sure if it fits the O(n) requirement. The input as array IMO is an implementation detail and should not be a requirement as for imperative language, array is usually the fundamental construct whereas in things like F# or Haskell, list is.
let ms (a1:'a[]) (a2:'a[]) = let rec _ms l1 l2 acc = match (l1,l2) with | ([],[]) -> acc | (x::[],[]) -> x::acc | (x::xs,[]) -> _ms xs [] (x::acc) | ([],y::[]) -> y::acc | ([],y::ys) -> _ms [] ys (y::acc) | (x::xs,y::ys) -> if x < y then _ms xs l2 (x::acc) else _ms l1 ys (y::acc) List.to_array (List.rev (_ms (Array.to_list a1) (Array.to_list a2) []))June 16th, 2009 - 05:12
Here’s my solution using LazyLists. It has the virtue of taking anything that’s a sequence (which means anything that implements IEnumerable, so arrays, lists, etc all work out of the box.
type MergeWithLazyLists() = static member merge(f, x: LazyList, y: LazyList) = seq { match x, y with | LazyList.Cons(xh, xt), LazyList.Cons(yh, yt) when f xh yh -> yield xh yield! MergeWithLazyLists.merge(f, xt, y) | LazyList.Cons(xh, xt), LazyList.Cons(yh, yt) -> yield yh yield! MergeWithLazyLists.merge(f, x, yt) | LazyList.Nil, LazyList.Cons(_, _) -> yield! y | LazyList.Cons(_, _), LazyList.Nil -> yield! x | LazyList.Nil, LazyList.Nil -> () } [] static member merge(f, x: seq, y: seq) = MergeWithLazyLists.merge(f, (LazyList.of_seq x), (LazyList.of_seq y)) let result = MergeWithLazyLists.merge((fun x y -> x < y), (seq {2..6}), (seq {1..7..15})) printfn "%A" (Seq.to_array result) // [|1; 2; 3; 4; 5; 6; 8; 15|]