This article will discuss checking if all values in a DataFrame column are NaN.
First of all, we will create a DataFrame from a list of tuples,
import pandas as pd
import numpy as np

# List of tuples
employees = [('Jack', np.nan, 34, 'Sydney', np.nan, 5),
             ('Riti', np.nan, 31, 'Delhi',  np.nan, 7),
             ('Aadi', np.nan, 16, 'London', np.nan, np.nan),
             ('Mark', np.nan, 41, 'Delhi',  np.nan, np.nan)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Display the DataFrame
print(df)
Output:
      A   B   C       D   E    F
0  Jack NaN  34  Sydney NaN  5.0
1  Riti NaN  31   Delhi NaN  7.0
2  Aadi NaN  16  London NaN  NaN
3  Mark NaN  41   Delhi NaN  NaN
This DataFrame has four rows and six columns, two of which (‘B’ and ‘E’) contain only NaN values. Let’s see how we can verify whether a column in a DataFrame contains only NaN values.
Check if all values are NaN in a column
Select the column as a Series object and then use isnull() and all() methods of the Series to verify if all values are NaN or not. The steps are as follows,
- Select the column by name using the subscript operator of the DataFrame, i.e. df['column_name']. It gives the column contents as a Pandas Series object.
- Call the isnull() method of the Series object. It returns a boolean Series of the same size. Each True value in this boolean Series indicates that the corresponding value in the original Series (the selected column) is NaN.
- Check if all values in the boolean Series are True or not. If yes, then it means all values in that column are NaN.
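The three steps above can be sketched one at a time, printing each intermediate result. This is only a minimal illustration: the small DataFrame `sample` and the variable names `column`, `is_nan` and `all_nan` are assumptions made for this sketch and are not part of the example used in the rest of the article.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame used only for this sketch
sample = pd.DataFrame({'B': [np.nan, np.nan, np.nan],
                       'F': [5.0, np.nan, np.nan]})

# Step 1: select the column by name; the result is a Series
column = sample['B']

# Step 2: isnull() marks each NaN entry with True
is_nan = column.isnull()
print(is_nan.tolist())

# Step 3: all() is True only when every entry in the boolean Series is True
all_nan = bool(is_nan.all())
print(all_nan)
```

Wrapping the result in bool() is optional; it simply converts the NumPy boolean returned by all() into a plain Python bool.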
For example, let’s check if all values are NaN in column ‘B’ from the above created DataFrame,
# Check if all values in column 'B' are NaN
if df['B'].isnull().all():
    print("All values in the column 'B' are NaN")
else:
    print("Not all values in the column 'B' are NaN")
Output:
All values in the column 'B' are NaN
We selected the column and then got a boolean Series using the isnull() method. Then, using the all() method, we checked whether all the values in the boolean Series are True. If all values are True, then all elements in the column are NaN.
In this example, every value in column ‘B’ is NaN; therefore, the returned boolean Series contained only True values, and Series.all() returned True.
Now let’s look at a negative example and check if all values are NaN in column ‘F’ of the DataFrame created above,
# Check if all values in column 'F' are NaN
if df['F'].isnull().all():
    print("All values in the column 'F' are NaN")
else:
    print("Not all values in the column 'F' are NaN")
Output:
Not all values in the column 'F' are NaN
In this example, not all values in column ‘F’ are NaN: the first two rows hold numbers and the last two hold NaN. The returned boolean Series therefore contained both True and False values, and Series.all() returned False. This confirms that column ‘F’ is not made up entirely of NaN values.
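To see why all() returns False for column ‘F’, it can help to print the boolean Series itself. The sketch below rebuilds only column ‘F’ with the same values as the DataFrame above; the names `f_column` and `mask` are chosen for illustration, and the any() and sum() calls are extras that are not used elsewhere in this article.

```python
import numpy as np
import pandas as pd

# Column 'F' from the example DataFrame, rebuilt on its own for this sketch
f_column = pd.Series([5, 7, np.nan, np.nan], name='F')

# One flag per row: True where the value is NaN
mask = f_column.isnull()
print(mask.tolist())

print(bool(mask.all()))   # False, because rows 0 and 1 hold real values
print(bool(mask.any()))   # True, because at least one value is NaN
print(int(mask.sum()))    # number of NaN values in the column
```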
The complete working example is as follows,
import pandas as pd
import numpy as np

# List of tuples
employees = [('Jack', np.nan, 34, 'Sydney', np.nan, 5),
             ('Riti', np.nan, 31, 'Delhi',  np.nan, 7),
             ('Aadi', np.nan, 16, 'London', np.nan, np.nan),
             ('Mark', np.nan, 41, 'Delhi',  np.nan, np.nan)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Display the DataFrame
print(df)

# Check if all values in column 'B' are NaN
if df['B'].isnull().all():
    print("All values in the column 'B' are NaN")
else:
    print("Not all values in the column 'B' are NaN")
Output:
      A   B   C       D   E    F
0  Jack NaN  34  Sydney NaN  5.0
1  Riti NaN  31   Delhi NaN  7.0
2  Aadi NaN  16  London NaN  NaN
3  Mark NaN  41   Delhi NaN  NaN
All values in the column 'B' are NaN
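If the same check is needed in several places, one option is to wrap it in a small helper. The function name `is_all_nan` and the sample data below are assumptions made for this sketch. It also illustrates a caveat worth keeping in mind: all() on an empty boolean Series returns True, so a column with no rows is reported as all NaN.

```python
import numpy as np
import pandas as pd

def is_all_nan(frame, column_name):
    """Return True if every value in the named column is NaN."""
    # Same check as in the article: boolean mask, then all()
    return bool(frame[column_name].isnull().all())

# Hypothetical sample data for this sketch
frame = pd.DataFrame({'B': [np.nan, np.nan], 'C': [34, 31]})
print(is_all_nan(frame, 'B'))
print(is_all_nan(frame, 'C'))

# Caveat: a column with no rows contains no non-NaN value,
# so all() returns True for it as well
empty = pd.DataFrame({'B': pd.Series([], dtype='float64')})
print(is_all_nan(empty, 'B'))
```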
Summary
We learned how to check if all values in a DataFrame column are NaN.