How do you check for duplicates in Python?

This article discusses different ways to check whether a list contains any duplicate elements.

Suppose we have the following list of elements:

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

Now we want to check whether this list contains any duplicate elements. There are several ways to do this; here we discuss three of them and analyze their performance.

Check for duplicates in a list using Set & by comparing sizes

To check whether a list contains any duplicate elements, follow these steps:

  1. Add the contents of the list to a set.
    • A set contains only unique elements, so no duplicates will be added to it.
  2. Compare the size of the set and the list.
    • If the sizes are equal, the list contains no duplicates.
    • If the sizes differ, the list contains duplicates.

The following function implements this algorithm:

def checkIfDuplicates_1(listOfElems):
    ''' Check if given list contains any duplicates '''
    if len(listOfElems) == len(set(listOfElems)):
        return False
    else:
        return True

Now let’s use this function to check whether our list contains any duplicates:

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

result = checkIfDuplicates_1(listOfElems)

if result:
    print('Yes, list contains duplicates')
else:
    print('No duplicates found in list')    

Output

Yes, list contains duplicates

Complexity Analysis of this solution.

Python sets are hash-based, so creating a set from a list of n elements takes O(n) time on average. Comparing the sizes is an O(1) operation. So, the complexity of this solution is O(n) on average.

Even in the best scenario, i.e. when the list contains nothing but duplicated elements, this solution still takes O(n) time, because it adds every element of the list to the set before comparing sizes.
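The claim that the set is always built in full, wherever the duplicate sits, can be made visible with a small sketch. The helper countingIter and the one-element counter list are hypothetical names introduced only for this illustration; they are not part of the functions in this article.

```python
def countingIter(elems, counter):
    """Yield each element and record how many have been handed out."""
    for elem in elems:
        counter[0] += 1
        yield elem

# The duplicate sits at the very start of the list (assumed test data).
listOfElems = ['Ok', 'Ok'] + [str(i) for i in range(1000)]

consumed = [0]
uniqueElems = set(countingIter(listOfElems, consumed))

# Every element was consumed although the first two already collide.
print(consumed[0])                          # 1002
print(len(uniqueElems) != len(listOfElems)) # True
```

The count equals the full length of the list, which is why the position of the duplicate does not change the cost of this approach.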

Let’s look at a better solution.

Check for duplicates in list using Set & looking for first duplicate

Instead of adding all list elements to a set and then looking for duplicates, we can add the elements to the set one by one and, before adding each one, check whether it is already present:

def checkIfDuplicates_2(listOfElems):
    ''' Check if given list contains any duplicates '''    
    setOfElems = set()
    for elem in listOfElems:
        if elem in setOfElems:
            return True
        else:
            setOfElems.add(elem)         
    return False

Now let’s use this function to check whether our list contains any duplicates:

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

result = checkIfDuplicates_2(listOfElems)

if result:
    print('Yes, list contains duplicates')
else:
    print('No duplicates found in list')    

Output

Yes, list contains duplicates

Complexity Analysis of this solution

In the worst case we add every element of the list to the set, only to find that the list contains no duplicates. So, the worst-case complexity is O(n) on average, because each set lookup and insertion takes O(1) time on average.
In the best case, we learn about a duplicate as soon as we encounter it during iteration and return immediately, so far fewer than n elements may be examined.
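To illustrate the early exit, here is a minimal sketch of the same loop that reports how many elements it inspected before stopping. The function name elementsInspected is an assumption made for this example and does not appear elsewhere in the article.

```python
def elementsInspected(listOfElems):
    """Return how many elements were inspected when the first duplicate
    was found, or None if the list has no duplicates."""
    setOfElems = set()
    for inspected, elem in enumerate(listOfElems, start=1):
        if elem in setOfElems:
            return inspected   # stop at the first repeated element
        setOfElems.add(elem)
    return None

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']
print(elementsInspected(listOfElems))       # 4
print(elementsInspected(['a', 'b', 'c']))   # None
```

For the sample list the loop stops at the second 'Ok', the fourth element, and never looks at the remaining five.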

Let’s look at another solution.

Check if list contains duplicates using list.count()

Python’s list class provides a method that returns the frequency count of a given element in the list:

list.count(element)

It returns the occurrence count of element in the list.
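As a quick illustration of list.count() on its own, using the sample list from this article (the value 'missing' is just an assumed example of an element that is not in the list):

```python
listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

print(listOfElems.count('Ok'))       # 2, 'Ok' appears twice
print(listOfElems.count('Hello'))    # 1, 'Hello' appears once
print(listOfElems.count('missing'))  # 0, absent elements count as zero
```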

Let’s use this to check for duplicates:

def checkIfDuplicates_3(listOfElems):
    ''' Check if given list contains any duplicates '''    
    for elem in listOfElems:
        if listOfElems.count(elem) > 1:
            return True
    return False

Here we iterate over all the elements of the list and check the count of each element in the list. If the count is greater than 1, that element has duplicate entries.

Now let’s use this function to check whether our list contains any duplicates:

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

result = checkIfDuplicates_3(listOfElems)

if result:
    print('Yes, list contains duplicates')
else:
    print('No duplicates found in list')    

Output

Yes, list contains duplicates

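Complexity Analysis of this solution

This is the least efficient of the three solutions, with complexity O(n^2): each call to list.count() scans the whole list, and in the worst case it is called once for every element.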
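The quadratic cost can be seen by tallying comparisons. The sketch below follows the same logic as checkIfDuplicates_3, but replaces list.count() with an explicit inner loop standing in for the scan that count() performs; the function name and the returned comparison tally are assumptions made for this illustration.

```python
def checkIfDuplicatesCounting(listOfElems):
    """Same logic as the list.count() approach, with an explicit inner
    loop so that the number of comparisons can be tallied."""
    comparisons = 0
    for elem in listOfElems:
        occurrences = 0
        for other in listOfElems:   # the full scan list.count() would do
            comparisons += 1
            if other == elem:
                occurrences += 1
        if occurrences > 1:
            return True, comparisons
    return False, comparisons

# 100 unique elements: 100 scans of 100 elements each.
print(checkIfDuplicatesCounting(list(range(100))))   # (False, 10000)

listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']
print(checkIfDuplicatesCounting(listOfElems))        # (True, 18)
```

With no duplicates, n elements cost n * n comparisons. The sample list stops after two full scans of nine elements, because 'Ok' is the second element checked.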

The complete example is as follows:

def checkIfDuplicates_1(listOfElems):
    ''' Check if given list contains any duplicates '''
    if len(listOfElems) == len(set(listOfElems)):
        return False
    else:
        return True
 
def checkIfDuplicates_2(listOfElems):
    ''' Check if given list contains any duplicates '''    
    setOfElems = set()
    for elem in listOfElems:
        if elem in setOfElems:
            return True
        else:
            setOfElems.add(elem)         
    return False
 
def checkIfDuplicates_3(listOfElems):
    ''' Check if given list contains any duplicates '''    
    for elem in listOfElems:
        if listOfElems.count(elem) > 1:
            return True
    return False
 
def main():
 
    listOfElems = ['Hello', 'Ok', 'is', 'Ok', 'test', 'this', 'is', 'a', 'test']

    print('*** Check for duplicates in list using Set and comparing sizes ***')

    result = checkIfDuplicates_1(listOfElems)

    if result:
        print('Yes, list contains duplicates')
    else:
        print('No duplicates found in list')    
 
    print('*** Check for duplicates in list using Set and looking for first duplicate ***')
 
    result = checkIfDuplicates_2(listOfElems)
 
    if result:
        print('Yes, list contains duplicates')
    else:
        print('No duplicates found in list')        
 
    print('*** Check if list contains duplicates using list.count() ***')

    result = checkIfDuplicates_3(listOfElems)
 
    if result:
        print('Yes, list contains duplicates')
    else:
        print('No duplicates found in list') 
 
if __name__ == '__main__':
    main()

Output:

*** Check for duplicates in list using Set and comparing sizes ***
Yes, list contains duplicates
*** Check for duplicates in list using Set and looking for first duplicate ***
Yes, list contains duplicates
*** Check if list contains duplicates using list.count() ***
Yes, list contains duplicates

How do I check if an item is duplicated in Python?

To check for duplicates in a list in Python:

a_list = [1, 2, 1]  # List with duplicates
a_set = set(a_list)  # Convert to set
contains_duplicates = len(a_list) != len(a_set)  # Compare lengths
print(contains_duplicates)

How do you check if an element appears twice in a list Python?

any(string.count(x) >= 2 for x in lst) is True if at least one item in lst appears twice or more in string. To test the elements of a list against the list itself, use any(lst.count(x) >= 2 for x in lst).
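A short sketch of both uses of any() with count(). The sample values below are assumptions chosen for illustration only.

```python
# Case 1: does any item of lst occur at least twice inside a string?
lst = ['ab', 'zz']
string = 'ab-cd-ab'
print(any(string.count(x) >= 2 for x in lst))   # True, 'ab' occurs twice

# Case 2: does any element occur at least twice in the list itself?
items = ['Hello', 'Ok', 'is', 'Ok']
print(any(items.count(x) >= 2 for x in items))  # True, 'Ok' occurs twice

# No repeated element, so the result is False.
print(any(x_count >= 2 for x_count in map(['a', 'b'].count, ['a', 'b'])))  # False
```

Like the list.count() function earlier in this article, the second form rescans the list for every element, so it is best suited to short lists.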