In this article we will discuss different ways to check whether a given value exists in a dataframe.
First of all, we need to import the pandas module i.e.
import pandas as pd
Let’s create a dataframe,
# List of Tuples
empoyees = [('jack', 34, 'Sydney', 155),
            ('Riti', 31, 'Delhi', 177),
            ('Aadi', 16, 'Mumbai', 81),
            ('Mohit', 31, 'Delhi', 167),
            ('Veena', 81, 'Delhi', 144),
            ('Shaunak', 35, 'Mumbai', 135),
            ('Shaun', 35, 'Colombo', 111),
            ('Riti', 32, 'Colombo', 111),
            ]

# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Marks'])

print('Contents of the dataframe :')
print(empDfObj)
Contents of the dataframe :
      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111
Now, how do we check whether a single value or multiple values exist in the dataframe?
Let’s understand by examples,
Check if a single element exists in DataFrame using in & not in operators
The DataFrame class provides an attribute, DataFrame.values, which returns a NumPy representation of all the values in the dataframe.
We can use the in & not in operators on these values to check whether a given element exists. For example,
Use in operator to check if an element exists in dataframe
Check if 81 exists in the dataframe empDfObj i.e.
# Check if 81 exists in DataFrame
if 81 in empDfObj.values:
    print('Element exists in Dataframe')
Output:
Element exists in Dataframe
Use not in operator to check if an element doesn’t exist in dataframe
Check if ‘Hello’ does not exist in the dataframe empDfObj i.e.
# Check if 'Hello' doesn't exist in DataFrame
if 'Hello' not in empDfObj.values:
    print('Element does not exist in Dataframe')
Output:
Element does not exist in Dataframe
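The in operator can also be limited to a single column. The sketch below is only an illustration on a small three-row subset of the data; the names df, in_frame, in_age and in_age_index are made up for this example. It also shows a common pitfall: applying in directly to a Series tests the index labels, not the stored values.

```python
import pandas as pd

# Small illustrative frame (a subset of the article's data)
df = pd.DataFrame(
    [('jack', 34, 'Sydney', 155),
     ('Aadi', 16, 'Mumbai', 81),
     ('Veena', 81, 'Delhi', 144)],
    columns=['Name', 'Age', 'City', 'Marks'])

# Whole-frame check: is 81 present anywhere in the values?
in_frame = 81 in df.values

# Column-restricted check: look only at the values of the 'Age' column
in_age = 81 in df['Age'].values

# Caution: `in` on a Series checks the index labels (0, 1, 2 here),
# so this is False even though the Age column holds 81
in_age_index = 81 in df['Age']

print(in_frame, in_age, in_age_index)
```

So when only one column matters, it is safer to go through .values (or isin(), covered later) than to use in on the Series itself.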
Check if multiple elements exist in DataFrame or not using in operator
Suppose we are given 3 elements and want to find out which of them exist in the dataframe.
To do that we have created a function that accepts the elements to be checked as a list. It iterates over that list and, for each element, checks whether that element exists in the dataframe values. In the end it returns a dictionary that maps each given element to a bool indicating whether it exists in the dataframe,
def checkIfValuesExists1(dfObj, listOfValues):
    ''' Check if given elements exist in the DataFrame or not.
        It returns a dictionary of elements as key and their existence value as bool'''
    resultDict = {}
    # Iterate over the list of elements one by one
    for elem in listOfValues:
        # Check if the element exists in dataframe values
        if elem in dfObj.values:
            resultDict[elem] = True
        else:
            resultDict[elem] = False
    # Returns a dictionary of values & their existence flag
    return resultDict
Now let’s use this function to check if 81, ‘hello’ & 167 exist in the dataframe,
# Check if given values exist in the DataFrame or not
result = checkIfValuesExists1(empDfObj, [81, 'hello', 167])

print('Dictionary representing if the given keys exists in DataFrame or not : ')
print(result)
Output:
Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}
Our function returned a dictionary which shows that 81 & 167 exist in the dataframe but ‘hello’ doesn’t.
Instead of creating a separate function for this small task, we can also use a dictionary comprehension i.e.
listOfValues = [81, 'hello', 167]

# Check if given values exist in the DataFrame or not and collect result using dict comprehension
result = {elem: True if elem in empDfObj.values else False for elem in listOfValues}

print(result)
Output:
{81: True, 'hello': False, 167: True}
It works in the same fashion and returns a similar dictionary.
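Both versions above scan the whole values array once per element. A possible variation, sketched here on the assumption that every value in the dataframe is hashable, is to flatten the values into a set once and test membership against that set. The names df, values_to_check and present are hypothetical and used only for this illustration.

```python
import pandas as pd

# Small illustrative frame (a subset of the article's data)
df = pd.DataFrame(
    [('jack', 34, 'Sydney', 155),
     ('Aadi', 16, 'Mumbai', 81),
     ('Mohit', 31, 'Delhi', 167)],
    columns=['Name', 'Age', 'City', 'Marks'])

values_to_check = [81, 'hello', 167]

# Flatten the 2-D values array once and keep the distinct entries
# (assumes all values are hashable, e.g. numbers and strings)
present = set(df.values.ravel())

# Each lookup is now a set membership test
result = {value: value in present for value in values_to_check}

print(result)
```

For a handful of values the difference is negligible; the set only starts to matter when many values are checked against a large dataframe.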
Check if elements exist in DataFrame using isin() function
We can also check the existence of single or multiple elements in a dataframe using the DataFrame.isin() function.
DataFrame.isin(self, values)
Arguments:
- values: iterable, Series, DataFrame or dict to be checked for existence.
It returns a bool dataframe in which each cell is True if the corresponding value in the original dataframe matches any one of the given values.
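Since values may also be a dict, the check can be limited to particular columns: the dict keys are column names and each key maps to the values to look for in that column. The following is a minimal sketch on a three-row subset of the data; df and by_column are illustrative names, not part of the original example.

```python
import pandas as pd

# Small illustrative frame (a subset of the article's data)
df = pd.DataFrame(
    [('jack', 34, 'Sydney', 155),
     ('Aadi', 16, 'Mumbai', 81),
     ('Veena', 81, 'Delhi', 144)],
    columns=['Name', 'Age', 'City', 'Marks'])

# Look for 81 in the 'Age' column only; columns that are not
# listed in the dict come back as all False
by_column = df.isin({'Age': [81]})

print(by_column)
print(by_column['Age'].any())    # 81 found in Age
print(by_column['Marks'].any())  # Marks was not searched
```

Here the 81 stored under Marks is ignored, because only the Age column was named in the dict.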
Now let’s use isin() to check the existence of elements in dataframe,
Check if a single element exists in Dataframe using isin()
Contents of the dataframe empDfObj are,
      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111
Now let’s pass [81] to isin() i.e.
boolDf = empDfObj.isin([81])
It returns a bool dataframe boolDf , whose contents are,
    Name    Age   City  Marks
0  False  False  False  False
1  False  False  False  False
2  False  False  False   True
3  False  False  False  False
4  False   True  False  False
5  False  False  False  False
6  False  False  False  False
7  False  False  False  False
The returned bool dataframe has the same size as the original dataframe, and it contains True at the positions where 81 exists in the dataframe.
Now if we call any() on this bool dataframe, it returns a series showing whether each column contains a True or not i.e.
empDfObj.isin([81]).any()
It returns a series object,
Name     False
Age       True
City     False
Marks     True
dtype: bool
It shows that the columns Age & Marks contain a True.
Now again call any() on this series object i.e.
empDfObj.isin([81]).any().any()
It returns a bool i.e.
True
This bool value indicates that the Series contains at least one True.
So basically,
empDfObj.isin([81]).any().any()
returns True if at least one of the values in the given list exists in the dataframe. For example,
# Check if 81 exists in Dataframe
result = empDfObj.isin([81]).any().any()
if result:
    print('Element exists in Dataframe')
Output:
Element exists in Dataframe
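The same bool dataframe can also tell us where the value occurs, not only whether it occurs. The sketch below, which uses a three-row subset of the data and the illustrative names df, mask, rows_with_81 and cols_with_81, selects the rows and lists the columns that contain 81.

```python
import pandas as pd

# Small illustrative frame (a subset of the article's data)
df = pd.DataFrame(
    [('jack', 34, 'Sydney', 155),
     ('Aadi', 16, 'Mumbai', 81),
     ('Veena', 81, 'Delhi', 144)],
    columns=['Name', 'Age', 'City', 'Marks'])

# True wherever a cell equals 81
mask = df.isin([81])

# Rows in which at least one cell matched
rows_with_81 = df[mask.any(axis=1)]

# Columns in which at least one cell matched
col_hits = mask.any()
cols_with_81 = col_hits[col_hits].index.tolist()

print(rows_with_81)
print(cols_with_81)
```

any(axis=1) reduces across the columns of each row, while the default any() reduces down each column, which is why the two calls answer the row and column questions respectively.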
Check if any of the given values exists in the Dataframe
Using the above logic we can also check if a dataframe contains any of the given values. For example, check if the dataframe empDfObj contains 81, ‘hello’ or 167 i.e.
# Check if any of the given values exists in Dataframe
result = empDfObj.isin([81, 'hello', 167]).any().any()
if result:
    print('Any of the Element exists in Dataframe')
Output:
Any of the Element exists in Dataframe
It shows that our dataframe contains at least one of the given values.
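Note that passing several values to isin() answers the question "is any of them present?". If the question is whether all of them are present, each value has to be checked on its own. The following is one possible way to do that, shown on a small subset of the data; df, values_to_check, any_present and all_present are names invented for this sketch.

```python
import pandas as pd

# Small illustrative frame (a subset of the article's data)
df = pd.DataFrame(
    [('jack', 34, 'Sydney', 155),
     ('Aadi', 16, 'Mumbai', 81),
     ('Mohit', 31, 'Delhi', 167)],
    columns=['Name', 'Age', 'City', 'Marks'])

values_to_check = [81, 'hello', 167]

# True if at least one of the values occurs somewhere in the frame
any_present = bool(df.isin(values_to_check).values.any())

# True only if every value occurs somewhere in the frame
all_present = all(df.isin([value]).values.any()
                  for value in values_to_check)

print(any_present)
print(all_present)
```

Here 81 and 167 are found but 'hello' is not, so the first check succeeds and the second fails.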
Complete example is as follows,
import pandas as pd


def checkIfValuesExists1(dfObj, listOfValues):
    ''' Check if given elements exist in the DataFrame or not.
        It returns a dictionary of elements as key and their existence value as bool'''
    resultDict = {}
    # Iterate over the list of elements one by one
    for elem in listOfValues:
        # Check if the element exists in dataframe values
        if elem in dfObj.values:
            resultDict[elem] = True
        else:
            resultDict[elem] = False
    # Returns a dictionary of values & their existence flag
    return resultDict


def main():
    # List of Tuples
    empoyees = [('jack', 34, 'Sydney', 155),
                ('Riti', 31, 'Delhi', 177),
                ('Aadi', 16, 'Mumbai', 81),
                ('Mohit', 31, 'Delhi', 167),
                ('Veena', 81, 'Delhi', 144),
                ('Shaunak', 35, 'Mumbai', 135),
                ('Shaun', 35, 'Colombo', 111),
                ('Riti', 32, 'Colombo', 111),
                ]

    # Create a DataFrame object
    empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Marks'])

    print('Contents of the dataframe :')
    print(empDfObj)

    print('**** Check if an element exists in DataFrame using in & not in operators ****')

    print('** Use in operator to check if an element exists in dataframe **')

    # Check if 81 exists in DataFrame
    if 81 in empDfObj.values:
        print('Element exists in Dataframe')

    # Check if 'Hello' doesn't exist in DataFrame
    if 'Hello' not in empDfObj.values:
        print('Element does not exist in Dataframe')

    print('**** Check if multiple elements exists in DataFrame****')

    # Check if given values exist in the DataFrame or not
    result = checkIfValuesExists1(empDfObj, [81, 'hello', 167])

    print('Dictionary representing if the given keys exists in DataFrame or not : ')
    print(result)

    listOfValues = [81, 'hello', 167]
    # Check if given values exist in the DataFrame or not and collect result using dict comprehension
    result = {elem: True if elem in empDfObj.values else False for elem in listOfValues}

    print('Dictionary representing if the given keys exists in DataFrame or not : ')
    print(result)

    print('**** Check if elements exists in DataFrame using isin() ****')

    print('Check if a single element exists in DataFrame using isin()')

    # Get a bool dataframe with True at places where 81 exists
    boolDf = empDfObj.isin([81])

    print(boolDf)
    print(boolDf.any())
    print(boolDf.any().any())

    # Check if 81 exists in Dataframe
    result = empDfObj.isin([81]).any().any()
    if result:
        print('Element exists in Dataframe')

    print('Check if a any of the given element exists in DataFrame using isin()')

    # Check if any of the given values exists in Dataframe
    result = empDfObj.isin([81, 'hello', 167]).any().any()
    if result:
        print('Any of the Element exists in Dataframe')


if __name__ == '__main__':
    main()
Output:
Contents of the dataframe :
      Name  Age     City  Marks
0     jack   34   Sydney    155
1     Riti   31    Delhi    177
2     Aadi   16   Mumbai     81
3    Mohit   31    Delhi    167
4    Veena   81    Delhi    144
5  Shaunak   35   Mumbai    135
6    Shaun   35  Colombo    111
7     Riti   32  Colombo    111
**** Check if an element exists in DataFrame using in & not in operators ****
** Use in operator to check if an element exists in dataframe **
Element exists in Dataframe
Element does not exist in Dataframe
**** Check if multiple elements exists in DataFrame****
Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}
Dictionary representing if the given keys exists in DataFrame or not :
{81: True, 'hello': False, 167: True}
**** Check if elements exists in DataFrame using isin() ****
Check if a single element exists in DataFrame using isin()
    Name    Age   City  Marks
0  False  False  False  False
1  False  False  False  False
2  False  False  False   True
3  False  False  False  False
4  False   True  False  False
5  False  False  False  False
6  False  False  False  False
7  False  False  False  False
Name     False
Age       True
City     False
Marks     True
dtype: bool
True
Element exists in Dataframe
Check if a any of the given element exists in DataFrame using isin()
Any of the Element exists in Dataframe