Claudio Cherubino's blog Life of a Googler

13Jun/0911

Merging arrays

Thanks to interviewpattern.com I discovered that one of the classical Amazon interview questions is writing a snippet of code to merge two sorted arrays:

"Suppose we have two sorted arrays A[] of m elements and B[] of n elements. Write a function merge which would merge this two arrays into new sorted array C[] in O(n) time as shown on the picture".

MergeArrays

This problem is also a classical exercise for functional programming learners that shows the conciseness of functional code in comparison with imperative one.

The solution presented on the original page is written in C# and is longer than 40 lines of code, while we can solve the same problem in F# with less than 10 lines:

let rec merge_arrays a b =
  match (a, b) with
  | (a, []) -> a
  | ([], b) -> b
  | (x::xs, y::ys) -> if (x < y) then
                        (x :: (merge_arrays xs (y::ys)))
                      else
                        (y :: (merge_arrays (x::xs) ys))

Besides being shorter, I also find the functional code to be much easier to understand. Do you agree with me?

Comments (11) Trackbacks (0)
  1. While not specifically stated in the problem statement, I’d expect that the result should be an array and the inputs are stated to be arrays, which adds a couple of lines to the solution:

    let merge_arrays a b =
        let rec loop a' b' =
            match (a', b') with
            | (a', []) -> a'
            | ([], b') -> b'
            | (x::xs, y::ys) -> if (x  Array.of_list
    
  2. nice solution for haskell(because of its laziness) but not so good for F# as it is not tail recursive and would have stack overflow error when the list is up to certain size.

    Either it needs to be transformed to a tail call format or use the Lazy library or the ‘yield’ construct.

  3. @James: arrays can be used instead of lists, only a few changes are required. Part of your code is missing, can you resend it?

    @gary: a tail recursive version follows, as usual I find the original version easier to understand and therefore better suited for a didactic post.

    let tail_merge_arrays a b =
      let rec merge c d acc =
        match (c, d) with
        | ([], []) -> acc
        | (c, []) -> acc @ c
        | ([], d) -> acc @ d
        | (x::xs, y::ys) ->  if (x < y) then
                                merge xs (y::ys) (acc @ [x])
                             else
                                merge (x::xs) ys (acc @ [y])
      merge a b []
    
  4. that is the problem of FP. that nice and simple to reason solution has severe real world limitation.

    I would suggest do a reverse(may be double reverse) and stick to ‘::’, as ‘@’ is a very expensive operation.

  5. Here’s an example that works on arrays, does not create a ton of garbage (space efficient), is tail-recursive, and still just 15 lines. (Now if I can just make it appear with the proper formatting.)

    let MergeSortedArrays (a:array<_>) (b:array<_>) =
        let al, bl = a.Length, b.Length
        let r = Array.zeroCreate (al+bl)
        let ri = ref 0
        let inline Yield x = r.[!ri] <- x; incr ri
        let rec Merge ai bi =
            if not(ai >= al && bi >= bl) then
                if bi >= bl || ai < al && a.[ai] < b.[bi] then
                    Yield a.[ai]
                    Merge (ai+1) bi
                else
                    Yield b.[bi]
                    Merge ai (bi+1)
        Merge 0 0
        r
    
    // some tests
    printfn "%A" ([|1..6|] = MergeSortedArrays [|1..4|] [|5;6|])
    printfn "%A" ([|1..6|] = MergeSortedArrays [|5;6|] [|1..4|])
    
    let N = 40000
    let a1 = Array.init N (fun x -> 2 * x + 1)  // 1, 3, 5, ...
    let a2 = Array.init N (fun x -> 2 * x)  // 0, 2, 4, ...
    let r = MergeSortedArrays a1 a2
    let expected = [| 0..2*N-1 |]
    printfn "%A" (r = expected)
    
  6. In other words, ‘@’ is not O(n)… Not that we’re trying to pick this apart, mind you :-) [sorry]

  7. @Gary
    “that is the problem of FP.”

    Quite the contrary. This is a problem of non-FP. Specifically, the lack of controlled side-effects introduces the need for strict evaluation, which is the problem. If F# were more functional, then the problem of evaluation could be eschewed more easily.

  8. This code uses active patterns to approximate the list based implementation and is tail recursive:

    let merge_array (a:'a array) b =
    ....let (|Empty|Head_Rest|) (a:'a array,ai) =
    ........if ai <a> b
    ........| _ ->
    ............System.Array.Copy(a,ai,b,bi,a.Length - ai)
    ............b
    ....let append a (c:'a array,ci) =
    ........c.[ci]   copyToEnd b c
    ........| a,Empty -> copyToEnd a c
    ........| Head_Rest(ah,ar),Head_Rest(bh,br) ->
    ............if ah  append ah |> merge ar b
    ............else
    ................c |> append bh |> merge a br
    ....merge (a,0) (b,0) ((Array.create (a.Length+b.Length) Microsoft.FSharp.Core.Operators.Unchecked.defaultof),0)
    

    how do you post code nicely here?

  9. To format F# code nicely you just have to wrap it between ‘[ fsharp]‘ and ‘[ /fsharp]‘ (without quotes and spaces after the opening bracket).

    I added the tags in your comments for you.

  10. This is based on claudio’s second verion but not using ‘@’. It does unfortunately create lots of intermediate garbage(characteristic of FP, the original Haskell friendly solution is very time and space efficient in Haskell because of the laziness) and cheat a bit using the List.rev rather than writing my own(which would add about 10 lines of code) and because of the use of ‘reverse’, I am not sure if it fits the O(n) requirement. The input as array IMO is an implementation detail and should not be a requirement as for imperative language, array is usually the fundamental construct whereas in things like F# or Haskell, list is.

    let ms (a1:'a[]) (a2:'a[]) =
     let rec _ms l1 l2 acc =
      match (l1,l2) with
       | ([],[]) -> acc
       | (x::[],[]) -> x::acc
       | (x::xs,[]) -> _ms xs [] (x::acc)
       | ([],y::[]) -> y::acc
       | ([],y::ys) -> _ms [] ys (y::acc)
       | (x::xs,y::ys) ->
         if x < y then
          _ms xs l2 (x::acc)
         else
          _ms l1 ys (y::acc)
    
     List.to_array (List.rev (_ms (Array.to_list a1) (Array.to_list a2) []))
    
  11. Here’s my solution using LazyLists. It has the virtue of taking anything that’s a sequence (which means anything that implements IEnumerable, so arrays, lists, etc all work out of the box.

    type MergeWithLazyLists() =
      static member merge(f, x: LazyList, y: LazyList) =
        seq {
          match x, y with
          | LazyList.Cons(xh, xt), LazyList.Cons(yh, yt) when f xh yh ->
            yield xh
            yield! MergeWithLazyLists.merge(f, xt, y)
          | LazyList.Cons(xh, xt), LazyList.Cons(yh, yt) ->
            yield yh
            yield! MergeWithLazyLists.merge(f, x, yt)
          | LazyList.Nil, LazyList.Cons(_, _) -> yield! y
          | LazyList.Cons(_, _), LazyList.Nil -> yield! x
          | LazyList.Nil, LazyList.Nil -> ()
        }
    
      []
      static member merge(f, x: seq, y: seq) =
        MergeWithLazyLists.merge(f, (LazyList.of_seq x), (LazyList.of_seq y))
    
    let result = MergeWithLazyLists.merge((fun x y -> x < y), (seq {2..6}), (seq {1..7..15}))
    
    printfn "%A" (Seq.to_array result)
    // [|1; 2; 3; 4; 5; 6; 8; 15|]
    

Leave a comment

(required)

No trackbacks yet.