In this article we will discuss several ways to compare two lists and get their differences, i.e. the elements which are present in one list but not in the other.
Suppose we have two lists,
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
Some elements are present in the first list but missing from the second, and some are present in the second list but missing from the first. We want to compare the two lists and get all of these differences.
For the two lists above, the differences are,
18, 19, 13, 14, 15
There are multiple ways to compare two lists and get differences. Let’s discuss them one by one,
Using set to get differences between two lists
When we create a set from a list then it contains only unique elements of the list. So let’s convert our lists to sets and then we can subtract these sets to get the differences between them i.e.
# Convert lists to sets
first_set = set(first_list)
sec_set = set(sec_list)
# Get the differences between two sets
differences = (first_set - sec_set).union(sec_set - first_set)
print('Differences between two lists: ')
print(differences)
Output:
{18, 19, 13, 14, 15}
We got the differences between the two lists, i.e. the elements which are in one list but not in the other. But what just happened here? Let’s break the above solution into smaller steps to understand it.
How did it work?
First get elements which are present in first_list but not in sec_list,
# Get elements which are present in first_list but not in sec_list
diff1 = set(first_list) - set(sec_list)
print(diff1)
Output:
{13, 14, 15}
Then get elements which are present in sec_list but not in first_list,
# Get elements which are present in sec_list but not in first_list
diff2 = set(sec_list) - set(first_list)
print(diff2)
Output:
{18, 19}
Now add both the result sets to get the complete differences between two lists,
differences = diff1.union(diff2)
print(differences)
Output:
{18, 19, 13, 14, 15}
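One practical note on the set-based approach: a set has no guaranteed order and drops duplicate values, which is why the output above is not in ascending order. A minimal sketch, assuming a sorted list is wanted for display (the call to sorted() is an addition here, not part of the original walkthrough):

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

# Elements only in first_list, and elements only in sec_list
diff1 = set(first_list) - set(sec_list)
diff2 = set(sec_list) - set(first_list)

# sorted() accepts any iterable and returns a new list in ascending order
differences = sorted(diff1 | diff2)
print(differences)  # [13, 14, 15, 18, 19]
```

The | operator used here is the operator form of union() for two sets.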
Using set.difference() to get differences between two lists
In the previous solution, instead of subtracting two sets using the - operator, we can use the difference() function of the set to get the differences.
So let’s convert our lists to sets and then we can subtract these sets using difference() function to get the differences in two lists i.e.
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
# Get elements which are in first_list but not in sec_list
diff1 = set(first_list).difference(set(sec_list))
# Get elements which are in sec_list but not in first_list
diff2 = set(sec_list).difference(set(first_list))
differences = diff1.union(diff2)
print(differences)
Output:
{18, 19, 13, 14, 15}
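As a small variation, difference() accepts any iterable as its argument, so the second list does not strictly need to be converted to a set first. A minimal sketch under that assumption, using the same two lists:

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

# difference() accepts any iterable, so the argument can stay a list
diff1 = set(first_list).difference(sec_list)
diff2 = set(sec_list).difference(first_list)

differences = diff1.union(diff2)
print(sorted(differences))  # [13, 14, 15, 18, 19]
```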
Compare & get differences between two lists without sets
Instead of converting the lists into sets and then comparing them, we can iterate over the first list and, for each element, check whether the second list contains it. This gives the elements which are present in the first list but missing from the second list, i.e.
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
# Get elements which are in first_list but not in sec_list
diff1 = []
for elem in first_list:
    if elem not in sec_list:
        diff1.append(elem)
print(diff1)
Output:
[13, 14, 15]
Then use the same logic in reverse order i.e. iterate over the second list and for each element in that list, check if the first list contains that or not. It will give elements which are present in second list but are missing from first list i.e.
# Get elements which are in sec_list but not in first_list
diff2 = []
for elem in sec_list:
    if elem not in first_list:
        diff2.append(elem)
print(diff2)
Output:
[18, 19]
Now combine these diff1 and diff2 to get the complete differences between two lists,
differences = diff1 + diff2
print(differences)
Output:
[13, 14, 15, 18, 19]
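Since the same loop runs twice with the lists swapped, it can be wrapped in a small helper. The sketch below is one way to do that; the function name elements_missing_from is hypothetical and not part of the original article. Unlike the set-based solutions, this approach keeps the original order and any duplicate values.

```python
def elements_missing_from(source, other):
    """Return the items of source that do not appear in other (hypothetical helper)."""
    missing = []
    for elem in source:
        if elem not in other:
            missing.append(elem)
    return missing


first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

# Run the same check in both directions and join the results
differences = (elements_missing_from(first_list, sec_list)
               + elements_missing_from(sec_list, first_list))
print(differences)  # [13, 14, 15, 18, 19]

# Duplicates in the source list are preserved
print(elements_missing_from([1, 1, 2], [2]))  # [1, 1]
```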
Use List Comprehension to get differences between two lists
Just like the previous solution, we can iterate over both lists and look for each element in the other list to get the differences. But for the iteration we are going to use a list comprehension.
Our lists are,
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
Get elements which are present in the first list, but missing from the second list i.e.
# Get elements which are in first_list but not in sec_list
diff1 = [elem for elem in first_list if elem not in sec_list]
Get elements which are present in the second list, but missing from the first list i.e.
# Get elements which are in sec_list but not in first_list
diff2 = [elem for elem in sec_list if elem not in first_list]
Now combine these diff1 and diff2 to get the complete differences between the two lists,
differences = diff1 + diff2
print(differences)
Output:
[13, 14, 15, 18, 19]
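Each `elem not in sec_list` check scans the whole list, which can become slow for large inputs. A possible refinement, assuming the elements are hashable, is to build sets once for the membership checks while still iterating over the lists so that order is kept. The names first_lookup and sec_lookup are illustrative only.

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

# Sets are used only for fast membership checks
first_lookup = set(first_list)
sec_lookup = set(sec_list)

# Iterate over the lists themselves so the original order is preserved
diff1 = [elem for elem in first_list if elem not in sec_lookup]
diff2 = [elem for elem in sec_list if elem not in first_lookup]

differences = diff1 + diff2
print(differences)  # [13, 14, 15, 18, 19]
```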
Using set.symmetric_difference() to get differences between two lists
In all the previous solutions, we got all the differences between two lists in two steps. But using symmetric_difference() we can achieve that in a single step.
set.symmetric_difference(seq)
symmetric_difference() is a member function of set and accepts another iterable (such as a list) as an argument. It returns a new set with the elements which are either in the calling set object or in the argument, but not in both. So, basically, it returns the differences between the set and the list. Let’s use this to get the differences between two lists,
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
differences = set(first_list).symmetric_difference(sec_list)
print(differences)
Output:
{13, 14, 15, 18, 19}
We converted our first list to a set, then called the symmetric_difference() on that set object and passed the second list as an argument. It returned the differences between them.
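As a quick sanity check, the single-step result can be compared with the two-step result from the earlier solutions. This is only a sketch; the variable names two_step and one_step are illustrative.

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

# Two-step approach: elements only in the first list, plus elements only in the second
two_step = (set(first_list) - set(sec_list)) | (set(sec_list) - set(first_list))

# Single-step approach
one_step = set(first_list).symmetric_difference(sec_list)
print(one_step == two_step)  # True

# The operation is symmetric, so swapping the lists gives the same set
swapped = set(sec_list).symmetric_difference(first_list)
print(swapped == one_step)  # True
```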
Using union() & intersection() to get differences between two lists
First of all, convert both of the lists to sets. Then get the union of both the sets,
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
# Convert lists to sets
first_set = set(first_list)
sec_set = set(sec_list)
# Get union of both the sets
union = first_set.union(sec_set)
print('Union:', union)
Output:
Union: {10, 11, 12, 13, 14, 15, 16, 18, 19}
union() returns a new set with all the distinct elements from both the sets.
Then get the intersection of both the sets,
# Get intersection of both the sets
intersection = first_set.intersection(sec_set)
print('Intersection:', intersection)
Output:
Intersection: {16, 10, 11, 12}
intersection() returns a new set with all common elements from both the sets.
Now if we subtract all the common elements from all distinct elements then we will get the differences between both of them,
# get the differences
differences = union - intersection
print(differences)
Output:
{13, 14, 15, 18, 19}
So, this is how we can get the differences between two lists.
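The same three steps can also be written with the set operators | (union), & (intersection) and - (difference). A compact sketch of that form, assuming both operands have already been converted to sets:

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

first_set = set(first_list)
sec_set = set(sec_list)

# All distinct elements minus the common elements
differences = (first_set | sec_set) - (first_set & sec_set)
print(sorted(differences))  # [13, 14, 15, 18, 19]
```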
Using set & ^ to get differences between two lists
Another quick solution is,
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
differences = set(first_list) ^ set(sec_list)
print(differences)
Output:
{13, 14, 15, 18, 19}
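The ^ operator is the operator form of symmetric_difference(), with one difference worth knowing: the operator needs a set on both sides, whereas the method accepts any iterable. The sketch below illustrates this; the flag name operator_failed is only for demonstration.

```python
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

operator_failed = False
try:
    # The right operand is a plain list, which the ^ operator does not accept
    differences = set(first_list) ^ sec_list
except TypeError:
    operator_failed = True
print(operator_failed)  # True

# Converting both lists to sets works as expected
differences = set(first_list) ^ set(sec_list)
print(sorted(differences))  # [13, 14, 15, 18, 19]
```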
Using numpy.setdiff1d() to get differences between two lists
numpy.setdiff1d(arr1, arr2, assume_unique=False)
setdiff1d() accepts two arrays as arguments and returns the unique values in arr1 that are not in arr2. So, let’s use this to get the differences between two lists,
import numpy as np
first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]
first_arr = np.array(first_list)
sec_arr = np.array(sec_list)
# Get elements which are in first_list but not in sec_list
diff1 = np.setdiff1d(first_arr, sec_arr)
# Get elements which are in sec_list but not in first_list
diff2 = np.setdiff1d(sec_arr, first_arr)
differences = np.concatenate(( diff1, diff2))
print(differences)
Output:
[13 14 15 18 19]
We converted our lists to ndarrays and passed them to setdiff1d() two times i.e.
- To get the elements which are in first_list but not in sec_list.
- To get the elements which are in sec_list but not in first_list.
Then merged both the differences to get all the differences between two lists.
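NumPy also provides numpy.setxor1d(), which returns the sorted, unique values that are in only one of the two input arrays. It is a different function from the setdiff1d() approach described above, shown here only as a possible single-call alternative:

```python
import numpy as np

first_list = [10, 11, 12, 13, 14, 16, 15]
sec_list = [10, 11, 12, 18, 19, 16]

first_arr = np.array(first_list)
sec_arr = np.array(sec_list)

# setxor1d() returns the values present in exactly one of the two arrays
differences = np.setxor1d(first_arr, sec_arr)
print(differences)  # [13 14 15 18 19]
```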
So, these were the different ways to compare two lists in Python and get their differences.
The complete example is as follows,
import numpy as np

def main():
    first_list = [10, 11, 12, 13, 14, 16, 15]
    sec_list = [10, 11, 12, 18, 19, 16]

    print('*** Using set to get differences between two lists *** ')
    # Convert lists to sets
    first_set = set(first_list)
    sec_set = set(sec_list)
    # Get the differences between two sets
    differences = (first_set - sec_set).union(sec_set - first_set)
    print('Differences between two lists: ')
    print(differences)

    print('How did it work ?')
    print('Step by Step:')
    # Get elements which are present in first_list but not in sec_list
    diff1 = set(first_list) - set(sec_list)
    # Get elements which are present in sec_list but not in first_list
    diff2 = set(sec_list) - set(first_list)
    print('Elements which are in first_list but not in sec_list: ')
    print(diff1)
    print('Elements which are in sec_list but not in first_list: ')
    print(diff2)
    differences = diff1.union(diff2)
    print('Differences between two lists: ')
    print(differences)

    print('*** Using set.difference() to get differences between two lists *** ')
    # Get elements which are in first_list but not in sec_list
    diff1 = set(first_list).difference(set(sec_list))
    # Get elements which are in sec_list but not in first_list
    diff2 = set(sec_list).difference(set(first_list))
    differences = diff1.union(diff2)
    print('Differences between two lists: ')
    print(differences)

    print('*** Compare & get differences between two lists without sets *** ')
    # Get elements which are in first_list but not in sec_list
    diff1 = []
    for elem in first_list:
        if elem not in sec_list:
            diff1.append(elem)
    print('Elements which are in first_list but not in sec_list: ')
    print(diff1)
    # Get elements which are in sec_list but not in first_list
    diff2 = []
    for elem in sec_list:
        if elem not in first_list:
            diff2.append(elem)
    print('Elements which are in sec_list but not in first_list: ')
    print(diff2)
    differences = diff1 + diff2
    print('Differences between two lists: ')
    print(differences)

    print('*** Use List Comprehension to get differences between two lists *** ')
    # Get elements which are in first_list but not in sec_list
    diff1 = [elem for elem in first_list if elem not in sec_list]
    print('Elements which are in first_list but not in sec_list: ')
    print(diff1)
    # Get elements which are in sec_list but not in first_list
    diff2 = [elem for elem in sec_list if elem not in first_list]
    print('Elements which are in sec_list but not in first_list: ')
    print(diff2)
    differences = diff1 + diff2
    print('Differences between two lists: ')
    print(differences)

    print('*** Using set.symmetric_difference() to get differences between two lists ***')
    differences = set(first_list).symmetric_difference(sec_list)
    print('Differences between two lists: ')
    print(differences)

    print('*** Using union() & intersection() to get differences between two lists ***')
    # Convert lists to sets
    first_set = set(first_list)
    sec_set = set(sec_list)
    # Get union of both the sets
    union = first_set.union(sec_set)
    print('Union:', union)
    # Get intersection of both the sets
    intersection = first_set.intersection(sec_set)
    print('Intersection:', intersection)
    # get the differences
    differences = union - intersection
    print('Differences between two lists: ')
    print(differences)

    print('*** Using set & ^ to get differences between two lists ***')
    differences = set(first_list) ^ set(sec_list)
    print('Differences between two lists: ')
    print(differences)

    print('*** Using numpy.setdiff1d() to get differences between two lists ***')
    first_arr = np.array(first_list)
    sec_arr = np.array(sec_list)
    # Get elements which are in first_list but not in sec_list
    diff1 = np.setdiff1d(first_arr, sec_arr)
    # Get elements which are in sec_list but not in first_list
    diff2 = np.setdiff1d(sec_arr, first_arr)
    differences = np.concatenate(( diff1, diff2))
    print('Differences between two lists: ')
    print(differences)

if __name__ == '__main__':
    main()
Output:
*** Using set to get differences between two lists ***
Differences between two lists:
{18, 19, 13, 14, 15}
How did it work ?
Step by Step:
Elements which are in first_list but not in sec_list:
{13, 14, 15}
Elements which are in sec_list but not in first_list:
{18, 19}
Differences between two lists:
{18, 19, 13, 14, 15}
*** Using set.difference() to get differences between two lists ***
Differences between two lists:
{18, 19, 13, 14, 15}
*** Compare & get differences between two lists without sets ***
Elements which are in first_list but not in sec_list:
[13, 14, 15]
Elements which are in sec_list but not in first_list:
[18, 19]
Differences between two lists:
[13, 14, 15, 18, 19]
*** Use List Comprehension to get differences between two lists ***
Elements which are in first_list but not in sec_list:
[13, 14, 15]
Elements which are in sec_list but not in first_list:
[18, 19]
Differences between two lists:
[13, 14, 15, 18, 19]
*** Using set.symmetric_difference() to get differences between two lists ***
Differences between two lists:
{13, 14, 15, 18, 19}
*** Using union() & intersection() to get differences between two lists ***
Union: {10, 11, 12, 13, 14, 15, 16, 18, 19}
Intersection: {16, 10, 11, 12}
Differences between two lists:
{13, 14, 15, 18, 19}
*** Using set & ^ to get differences between two lists ***
Differences between two lists:
{13, 14, 15, 18, 19}
*** Using numpy.setdiff1d() to get differences between two lists ***
Differences between two lists:
[13 14 15 18 19]