Remove duplicate values from a List in F#
Once in a while I give a look at the web server log to find out how people landed in this blog and yesterday there was someone that entered the following search string into Google:
Given a random List of integers, write a function that removes all the duplicate values
This site appears as the first result, even if there is no solution for that problem, but I want to help you anyway, unknown visitor!
In fact, we can make use of the Set data type to easily solve the problem.
According to the definition, a Set represents a collection of elements without duplicates.
So, if we create a Set from the given list, all duplicate values will be automatically filtered.
This can be done in a single line, I’ll call the function nub because this is the name of the corresponding Haskell one:
let nub xs = xs |> Set.of_list |> Set.to_list
As I said, to return a List without duplicate values we just need to create a Set from it and then create a List from the new Set.
Do you think it is fair to act this way or should I say that I’m cheating?
claudio on April 23rd 2008 in Functional programming
Akiva responded on 24 Apr 2008 at 9:29 pm #
It’s precisely this kind of thing that makes me love F# more and more each day.
claudio responded on 24 Apr 2008 at 9:33 pm #
I totally agree with you…
anocka responded on 27 Apr 2008 at 2:40 pm #
This may seem elegant, but I’m afraid this would be to slow on large lists. Another elegant solution uses the fold operator.
First define a function that adds an element to a list if it does not already belong to it :
let add_elem elem list =
if List.mem elem list then list else elem :: list
Then you use it to accumulate on the list to build one without duplicate values :
let nub list = List.fold_right add_elem list []
Fold power !
PS : Actually, that’s ocaml code, but I think it should work on F#.
anocka responded on 27 Apr 2008 at 3:04 pm #
Well, I did not think it through : you algorithm has a O(n*log(n)) complexity while mine is O(n²)…
Daniel Furrer responded on 30 Jun 2008 at 12:40 pm #
Leaving performance aside, the version that anocka proposed can be adapted to make sure elements in the list do not change order, which is important in some cases.
Here’s what I am using:
let remove_duplicates list =
let add_elem acc_list elem =
if List.mem elem acc_list then
acc_list
else
acc_list @ [elem]
List.fold_left add_elem [] list
OJ responded on 18 Jul 2008 at 7:12 am #
You could make it shorter:
let nub = Set.of_list >> Set.to_list
Jon Harrop responded on 17 Jul 2009 at 8:31 am #
I think anocka’s comment is wrong. Your algorithm is O(n log n) whereas his is O(n^2).
Also, you can write:
let nub = set >> List.of_seq
FSharp.it » Another interview question responded on 26 Aug 2009 at 2:24 pm #
[...] made of the two lists created. The rest of the job is calling some standard library functions to filter duplicates, sort and count the elements of the [...]