Claudio Cherubino's blog Life of a Googler

23Apr/087

Remove duplicate values from a List in F#

Once in a while I give a look at the web server log to find out how people landed in this blog and yesterday there was someone that entered the following search string into Google:

Given a random List of integers, write a function that removes all the duplicate values

This site appears as the first result, even if there is no solution for that problem, but I want to help you anyway, unknown visitor!

In fact, we can make use of the Set data type to easily solve the problem.

According to the definition, a Set represents a collection of elements without duplicates.
So, if we create a Set from the given list, all duplicate values will be automatically filtered.

This can be done in a single line, I'll call the function nub because this is the name of the corresponding Haskell one:

let nub xs = xs |> Set.of_list |> Set.to_list

As I said, to return a List without duplicate values we just need to create a Set from it and then create a List from the new Set.

Do you think it is fair to act this way or should I say that I'm cheating? :)

Comments (7) Trackbacks (1)
  1. It’s precisely this kind of thing that makes me love F# more and more each day.

  2. I totally agree with you…

  3. This may seem elegant, but I’m afraid this would be to slow on large lists. Another elegant solution uses the fold operator.
    First define a function that adds an element to a list if it does not already belong to it :

    let add_elem elem list =
    if List.mem elem list then list else elem :: list

    Then you use it to accumulate on the list to build one without duplicate values :
    let nub list = List.fold_right add_elem list []

    Fold power !
    PS : Actually, that’s ocaml code, but I think it should work on F#.

  4. Well, I did not think it through : you algorithm has a O(n*log(n)) complexity while mine is O(n²)…

  5. Leaving performance aside, the version that anocka proposed can be adapted to make sure elements in the list do not change order, which is important in some cases.

    Here’s what I am using:

    let remove_duplicates list =
    let add_elem acc_list elem =
    if List.mem elem acc_list then
    acc_list
    else
    acc_list @ [elem]
    List.fold_left add_elem [] list

  6. You could make it shorter:

    let nub = Set.of_list >> Set.to_list
    :)

  7. I think anocka’s comment is wrong. Your algorithm is O(n log n) whereas his is O(n^2).

    Also, you can write:

    let nub = set >> List.of_seq


Leave a comment

(required)