Remove duplicate values from a List in F#

Once in a while I give a look at the web server log to find out how people landed in this blog and yesterday there was someone that entered the following search string into Google:

Given a random List of integers, write a function that removes all the duplicate values

This site appears as the first result, even if there is no solution for that problem, but I want to help you anyway, unknown visitor!

In fact, we can make use of the Set data type to easily solve the problem.

According to the definition, a Set represents a collection of elements without duplicates.
So, if we create a Set from the given list, all duplicate values will be automatically filtered.

This can be done in a single line, I’ll call the function nub because this is the name of the corresponding Haskell one:

let nub xs = xs |> Set.of_list |> Set.to_list

As I said, to return a List without duplicate values we just need to create a Set from it and then create a List from the new Set.

Do you think it is fair to act this way or should I say that I’m cheating? :)

8 Comments »

claudio on April 23rd 2008 in Functional programming

8 Responses to “Remove duplicate values from a List in F#”

  1. Akiva responded on 24 Apr 2008 at 9:29 pm #

    It’s precisely this kind of thing that makes me love F# more and more each day.

  2. claudio responded on 24 Apr 2008 at 9:33 pm #

    I totally agree with you…

  3. anocka responded on 27 Apr 2008 at 2:40 pm #

    This may seem elegant, but I’m afraid this would be to slow on large lists. Another elegant solution uses the fold operator.
    First define a function that adds an element to a list if it does not already belong to it :

    let add_elem elem list =
    if List.mem elem list then list else elem :: list

    Then you use it to accumulate on the list to build one without duplicate values :
    let nub list = List.fold_right add_elem list []

    Fold power !
    PS : Actually, that’s ocaml code, but I think it should work on F#.

  4. anocka responded on 27 Apr 2008 at 3:04 pm #

    Well, I did not think it through : you algorithm has a O(n*log(n)) complexity while mine is O(n²)…

  5. Daniel Furrer responded on 30 Jun 2008 at 12:40 pm #

    Leaving performance aside, the version that anocka proposed can be adapted to make sure elements in the list do not change order, which is important in some cases.

    Here’s what I am using:

    let remove_duplicates list =
    let add_elem acc_list elem =
    if List.mem elem acc_list then
    acc_list
    else
    acc_list @ [elem]
    List.fold_left add_elem [] list

  6. OJ responded on 18 Jul 2008 at 7:12 am #

    You could make it shorter:

    let nub = Set.of_list >> Set.to_list

    :)

  7. Jon Harrop responded on 17 Jul 2009 at 8:31 am #

    I think anocka’s comment is wrong. Your algorithm is O(n log n) whereas his is O(n^2).

    Also, you can write:

    let nub = set >> List.of_seq

  8. FSharp.it » Another interview question responded on 26 Aug 2009 at 2:24 pm #

    [...] made of the two lists created. The rest of the job is calling some standard library functions to filter duplicates, sort and count the elements of the [...]

Trackback URI | Comments RSS

Leave a Reply