Pandas : Check if a value exists in a DataFrame using in & not in operator | isin()

In this article we will discuss different ways to check if a given value exists in a dataframe.

First of all, we need to import the pandas module i.e.

import pandas as pd

Let’s create a dataframe,

# List of Tuples
employees = [('jack', 34, 'Sydney', 155),
             ('Riti', 31, 'Delhi', 177),
             ('Aadi', 16, 'Mumbai', 81),
             ('Mohit', 31, 'Delhi', 167),
             ('Veena', 81, 'Delhi', 144),
             ('Shaunak', 35, 'Mumbai', 135),
             ('Shaun', 35, 'Colombo', 111),
             ('Riti', 32, 'Colombo', 111),
             ]

# Create a DataFrame object
empDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

print('Contents of the dataframe :')
print(empDfObj)

Contents of the dataframe :

      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111

Now, how do we check whether one or more values exist in the dataframe?
Let’s work through some examples,

Check if a single element exists in DataFrame using in & not in operators

The DataFrame class provides an attribute, DataFrame.values, which returns a NumPy representation of all the values in the dataframe.
We can use the in & not in operators on this array to check whether a given element exists. For example,

Use in operator to check if an element exists in dataframe

Check if 81 exists in the dataframe empDfObj i.e.

# Check if 81 exists in DataFrame
if 81 in empDfObj.values:
    print('Element exists in Dataframe')

Output:

Element exists in Dataframe

Use not in operator to check if an element doesn’t exist in dataframe

Check if ‘Hello’ does not exist in the dataframe empDfObj i.e.

# Check if 'Hello' doesn't exist in DataFrame
if 'Hello' not in empDfObj.values:
    print('Element does not exist in Dataframe')

Output:

Element does not exist in Dataframe
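A side note on why these checks go through .values rather than the dataframe itself: applied directly to a DataFrame, the in operator tests the column labels, not the cell values. The sketch below uses a small made-up dataframe (df and the variable names are illustrative, not part of the example above) to show the difference. DataFrame.to_numpy() is the accessor the pandas documentation recommends over .values; for this purpose both give the same array.

```python
import pandas as pd

# Small illustrative dataframe (hypothetical data)
df = pd.DataFrame({'Name': ['jack', 'Riti'], 'Age': [34, 81]})

inLabels = 81 in df              # tests column labels 'Name' and 'Age'
inValues = 81 in df.values       # tests the cell values
inNumpy = 81 in df.to_numpy()    # same test via to_numpy()
labelHit = 'Age' in df           # a column label is found by the plain in test

print(inLabels, inValues, inNumpy, labelHit)
```

So a plain `81 in df` would report that the value is missing even though it is present in the Age column.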

Check if multiple elements exist in DataFrame or not using in operator

Suppose we want to check which of 3 given elements exist in the dataframe.

To do that we have created a function that accepts the elements to be checked as a list. It iterates over that list and, for each element, checks whether it exists in the dataframe values. In the end it returns a dictionary that maps each element to a bool indicating whether it exists in the dataframe,

def checkIfValuesExists1(dfObj, listOfValues):
    ''' Check if given elements exist in the dataframe or not.
        It returns a dictionary of elements as key and their existence value as bool'''
    resultDict = {}
    # Iterate over the list of elements one by one
    for elem in listOfValues:
        # Check if the element exists in dataframe values
        if elem in dfObj.values:
            resultDict[elem] = True
        else:
            resultDict[elem] = False
    # Returns a dictionary of values & their existence flag
    return resultDict

Now let’s use this function to check if 81, ‘hello’ & 167 exist in the dataframe,

# Check if given values exist in the DataFrame or not
result = checkIfValuesExists1(empDfObj, [81, 'hello', 167])

print('Dictionary representing if the given keys exists in DataFrame or not : ')
print(result)

Output:

Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}

Our function returned a dictionary showing that 81 & 167 exist in the dataframe but ‘hello’ doesn’t.

Instead of creating a separate function for this small task, we can also use a dictionary comprehension i.e.

listOfValues = [81, 'hello', 167]

# Check if given values exist in the DataFrame or not and collect result using dict comprehension
result = {elem: elem in empDfObj.values for elem in listOfValues}

print(result)

Output:

{81: True, 'hello': False, 167: True}

It works in the same fashion and returns a similar dictionary.
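Each in test above scans the whole array again. When many values need checking, one possible alternative is to collect the dataframe values into a Python set once and test against that. This is only a sketch: it assumes every cell is hashable and that no NaN values are involved, and the names df and present are illustrative.

```python
import pandas as pd

# Small illustrative dataframe (a subset of the rows used in this article)
df = pd.DataFrame([('Aadi', 16, 'Mumbai', 81),
                   ('Mohit', 31, 'Delhi', 167)],
                  columns=['Name', 'Age', 'City', 'Marks'])

# Flatten the dataframe values once into a set of unique cell values
present = set(df.to_numpy().ravel())

listOfValues = [81, 'hello', 167]

# Set membership is checked per element without rescanning the dataframe
result = {elem: elem in present for elem in listOfValues}

print(result)
```

For a handful of values the dictionary comprehension over .values is simpler; the set only pays off when the list of values is long.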

Check if elements exist in DataFrame using isin() function

We can also check the existence of single or multiple elements in a dataframe using the DataFrame.isin() function.

DataFrame.isin(values)

Arguments:

  • values:
    • iterable, Series, DataFrame or dict to be checked for existence.

It returns a bool dataframe of the same shape as the original, with True at every position whose value matches any one of the given values.
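The dict form listed among the arguments can restrict the check to particular columns: the keys are column names and each value is the list of items to look for in that column. A minimal sketch, assuming a small made-up dataframe (df and mask are illustrative names):

```python
import pandas as pd

# Small illustrative dataframe where 81 appears in two different columns
df = pd.DataFrame({'Name': ['Aadi', 'Veena'],
                   'Age': [16, 81],
                   'Marks': [81, 144]})

# Look for 81 in the Age column only; columns not named in the dict stay False
mask = df.isin({'Age': [81]})

ageHas81 = bool(mask['Age'].any())
marksFlagged = bool(mask['Marks'].any())

print(mask)
print(ageHas81, marksFlagged)
```

Here the 81 stored under Marks is not flagged, because only the Age column was named in the dict.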

Now let’s use isin() to check the existence of elements in dataframe,

Check if a single element exists in Dataframe using isin()

Contents of the dataframe empDfObj are,

      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111

Now let’s pass [81] to isin() i.e.

boolDf = empDfObj.isin([81])

It returns a bool dataframe boolDf , whose contents are,

    Name    Age   City  Marks
0  False  False  False  False
1  False  False  False  False
2  False  False  False   True
3  False  False  False  False
4  False   True  False  False
5  False  False  False  False
6  False  False  False  False
7  False  False  False  False

The returned bool dataframe has the same size as the original dataframe, and it contains True wherever 81 exists in the dataframe.

Now if we call any() on this bool dataframe, it returns a series showing whether each column contains a True or not i.e.

empDfObj.isin([81]).any()

It returns a series object,

Name     False
Age       True
City     False
Marks     True
dtype: bool

It shows that the columns Age & Marks contain a True.
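The same bool dataframe can also tell us where the value was found, not only whether it exists. The following is a sketch on a three-row subset of the data; the variable names are illustrative. Calling any() with axis=1 reduces across columns instead, giving one flag per row.

```python
import pandas as pd

# Three rows taken from the dataframe used in this article
df = pd.DataFrame([('jack', 34, 'Sydney', 155),
                   ('Aadi', 16, 'Mumbai', 81),
                   ('Veena', 81, 'Delhi', 144)],
                  columns=['Name', 'Age', 'City', 'Marks'])

mask = df.isin([81])

# One bool per column, then keep only the labels flagged True
colFlags = mask.any()
matchingCols = colFlags[colFlags].index.tolist()

# One bool per row, then keep only the index labels flagged True
rowFlags = mask.any(axis=1)
matchingRows = rowFlags[rowFlags].index.tolist()

print(matchingCols)
print(matchingRows)
```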

Now again call any() on this series object i.e.

empDfObj.isin([81]).any().any()

It returns a bool i.e.

True

This value tells us that the series contains at least one True, i.e. 81 exists somewhere in the dataframe.

So basically,

empDfObj.isin([81]).any().any()

returns True if at least one of the values in the list exists in the dataframe. For example,

# Check if 81 exists in Dataframe
result = empDfObj.isin([81]).any().any()
if result:
    print('Element exists in Dataframe')

Output:

Element exists in Dataframe
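The chained any().any() can also be written as a single reduction. As a sketch (on a small made-up dataframe, with illustrative variable names), DataFrame.any() accepts axis=None to reduce over both axes at once, and the bool dataframe can alternatively be converted to a NumPy array first:

```python
import pandas as pd

# Small illustrative dataframe
df = pd.DataFrame([('Aadi', 16, 'Mumbai', 81),
                   ('Veena', 81, 'Delhi', 144)],
                  columns=['Name', 'Age', 'City', 'Marks'])

mask = df.isin([81])

chained = mask.any().any()          # column-wise reduction, then over the series
singleCall = mask.any(axis=None)    # reduce over rows and columns in one call
viaNumpy = mask.to_numpy().any()    # reduce the underlying bool array

# A value that is absent yields False
missing = df.isin(['hello']).any(axis=None)

print(bool(chained), bool(singleCall), bool(viaNumpy), bool(missing))
```

All three forms answer the same question; which one to use is mostly a matter of readability.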

Check if any of the given values exists in the Dataframe

Using above logic we can also check if a Dataframe contains any of the given values. For example, check if dataframe empDfObj contains either 81, ‘hello’ or 167 i.e.

# Check if any of the given values exists in Dataframe
result = empDfObj.isin([81, 'hello', 167]).any().any()

if result:
    print('Any of the Element exists in Dataframe')

Output:

Any of the Element exists in Dataframe

It shows that our dataframe contains at least one of the given values.

The complete example is as follows,
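Note that this check cannot tell "at least one value exists" apart from "every value exists". One way to distinguish the two, sketched here on a small made-up dataframe with illustrative names, is to run isin() once per value and then combine the flags with Python's built-in any() and all():

```python
import pandas as pd

# Small illustrative dataframe containing 81 and 167 but not 'hello'
df = pd.DataFrame([('Aadi', 16, 'Mumbai', 81),
                   ('Mohit', 31, 'Delhi', 167)],
                  columns=['Name', 'Age', 'City', 'Marks'])

listOfValues = [81, 'hello', 167]

# One existence flag per value
perValue = {elem: bool(df.isin([elem]).any(axis=None)) for elem in listOfValues}

anyExist = any(perValue.values())   # at least one value is present
allExist = all(perValue.values())   # every value is present

print(perValue)
print(anyExist, allExist)
```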

import pandas as pd

def checkIfValuesExists1(dfObj, listOfValues):
    ''' Check if given elements exist in the dataframe or not.
        It returns a dictionary of elements as key and their existence value as bool'''
    resultDict = {}
    # Iterate over the list of elements one by one
    for elem in listOfValues:
        # Check if the element exists in dataframe values
        if elem in dfObj.values:
            resultDict[elem] = True
        else:
            resultDict[elem] = False
    # Returns a dictionary of values & their existence flag
    return resultDict

def main():

    # List of Tuples
    employees = [('jack', 34, 'Sydney', 155),
                 ('Riti', 31, 'Delhi', 177),
                 ('Aadi', 16, 'Mumbai', 81),
                 ('Mohit', 31, 'Delhi', 167),
                 ('Veena', 81, 'Delhi', 144),
                 ('Shaunak', 35, 'Mumbai', 135),
                 ('Shaun', 35, 'Colombo', 111),
                 ('Riti', 32, 'Colombo', 111),
                 ]

    # Create a DataFrame object
    empDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

    print('Contents of the dataframe :')
    print(empDfObj)

    print('**** Check if an element exists in DataFrame using in & not in operators ****')
    
    print('** Use in operator to check if an element exists in dataframe **')

    # Check if 81 exists in DataFrame
    if 81 in empDfObj.values:
        print('Element exists in Dataframe')

    # Check if 'Hello' doesn't exist in DataFrame
    if 'Hello' not in empDfObj.values:
        print('Element does not exist in Dataframe')

    print('**** Check if multiple elements exists in DataFrame****')

    # Check if given values exist in the DataFrame or not
    result = checkIfValuesExists1(empDfObj, [81, 'hello', 167])

    print('Dictionary representing if the given keys exists in DataFrame or not : ')
    print(result)

    listOfValues = [81, 'hello', 167]
    # Check if given values exist in the DataFrame or not and collect result using dict comprehension
    result = {elem: elem in empDfObj.values for elem in listOfValues}

    print('Dictionary representing if the given keys exists in DataFrame or not : ')
    print(result)

    print('**** Check if elements exists in DataFrame using isin() ****')

    print('Check if a single element exists in DataFrame using isin()')

    # Get a bool dataframe with True at places where 81 exists
    boolDf = empDfObj.isin([81]) 

    print(boolDf)
    print(boolDf.any())
    print(boolDf.any().any())


    # Check if 81 exists in Dataframe
    result = empDfObj.isin([81]).any().any()
    if result:
        print('Element exists in Dataframe')

    print('Check if a any of the given element exists in DataFrame using isin()')

    # Check if any of the given values exists in Dataframe
    result = empDfObj.isin([81, 'hello', 167]).any().any()

    if result:
        print('Any of the Element exists in Dataframe')

if __name__ == '__main__':
    main()

Output:

Contents of the dataframe :
      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111
**** Check if an element exists in DataFrame using in & not in operators ****
** Use in operator to check if an element exists in dataframe **
Element exists in Dataframe
Element does not exist in Dataframe
**** Check if multiple elements exists in DataFrame****
Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}
Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}
**** Check if elements exists in DataFrame using isin() ****
Check if a single element exists in DataFrame using isin()
    Name    Age   City  Marks
0  False  False  False  False
1  False  False  False  False
2  False  False  False   True
3  False  False  False  False
4  False   True  False  False
5  False  False  False  False
6  False  False  False  False
7  False  False  False  False
Name     False
Age       True
City     False
Marks     True
dtype: bool
True
Element exists in Dataframe
Check if a any of the given element exists in DataFrame using isin()
Any of the Element exists in Dataframe

 
