In this article, we will discuss different ways to find rows with NaN in columns of a Pandas Dataframe.
Pandas – Find Rows with NaN in a specified column
Suppose we have a dataframe like this,
A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN 2 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 3 Smriti 12.0 16 London 10.0 11 9.0 NaN 11.0 4 Shaun 23.0 18 London 11.0 12 13.0 13.0 14.0 5 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 6 Mark NaN 41 Delhi NaN 12 NaN 11.0 1.0
We want to select only those rows from this dataframe, where column ‘H’ has NaN value i.e.
A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN 3 Smriti 12.0 16 London 10.0 11 9.0 NaN 11.0
To do that we need only a single line of code i.e.
selected_rows = df[df['H'].isnull()]
It will return a dataframe containing only those rows where column ‘H’ contains the NaN values.
How did it work?
Although it is one line code but it can be little tricky to understand. So, let’s break this code into simple steps. It will help us understand what is actually happening behind the scenes.
Steps to select only those rows from a dataframe, where a given column contains the NaN values are as follows,
 Step 1: Select the dataframe column ‘H’ as a Series using the [] operator i.e. df[‘H’].
 Step 2: Then Call the isnull() function of Series object like df[‘H’].isnull(). It returns a same sized bool series containing True or False. A True value in bool Series indicates that corresponding value in df[‘H’] is NaN, whereas False indicates the non NaN value.
 Step 3: Pass this bool series to [] operator of dataframe like df[df[‘H’].isnull()]. It will return only those rows from dataframe where the corresponding value in bool series is True.
Let’s see a complete example,
import pandas as pd import numpy as np # List of Tuples empoyees = [('Jack', np.NaN, 34, 'Sydney', np.NaN, 5, np.NaN, np.NaN, np.NaN), ('Riti', np.NaN, 31, 'Delhi' , np.NaN, 7, np.NaN, np.NaN, np.NaN), ('Aadi', np.NaN, 16, 'London', np.NaN, 11, np.NaN, 3, np.NaN), ('Smriti', 12 , 16, 'London', 10, 11, 9, np.NaN, 11), ('Shaun', 23 , 18, 'London', 11, 12, 13, 13, 14), ('Aadi', np.NaN, 16, 'London', np.NaN, 11, np.NaN, 3, np.NaN), ('Mark', np.NaN, 41, 'Delhi' , np.NaN, 12, np.NaN, 11, 1)] # Create a DataFrame object df = pd.DataFrame( empoyees, columns=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']) print("Contents of the Dataframe : ") print(df) # Select rows where column 'H' has NaN value selected_rows = df[df['H'].isnull()] print('Selected rows') print(selected_rows)
Output:
Contents of the Dataframe : A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN 2 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 3 Smriti 12.0 16 London 10.0 11 9.0 NaN 11.0 4 Shaun 23.0 18 London 11.0 12 13.0 13.0 14.0 5 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 6 Mark NaN 41 Delhi NaN 12 NaN 11.0 1.0 Selected rows A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN 3 Smriti 12.0 16 London 10.0 11 9.0 NaN 11.0
Select dataframe rows with NaN in a specified column using isna()
In pandas isna() function of Series is an alias of isnull(). So, you can use this also to select the rows with NaN in a specified column i.e.
# Select rows where column 'H' has NaN value selected_rows = df[df['H'].isna()] print('Selected rows') print(selected_rows)
It is exactly similar to previous solution becuase isna() is an alias of isnull().
Related Posts:
Select Dataframe Rows with NaN in multiple columns
Suppose we have a dataframe like this,
A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN 2 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 3 Smriti 12.0 16 London 10.0 11 9.0 NaN 11.0 4 Shaun 23.0 18 London 11.0 12 13.0 13.0 14.0 5 Aadi NaN 16 London NaN 11 NaN 3.0 NaN 6 Mark NaN 41 Delhi NaN 12 NaN 11.0 1.0
We want to select only those rows from this dataframe, where columns ‘H’ and ‘I’ have NaN values i.e.
A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN
This can be done in a single line of code i.e.
# Select rows where column 'H' & 'I' have NaN value selected_rows = df[df['H'].isnull() & df['I'].isnull()]
It returns a dataframe containing only those rows where both the columns ‘H’ & ‘I’ contain the NaN values.
How did it work?
Although it is single line code and quite similar to first solution, but it is little hard to understand. So, let’s break this code into simple steps. It will help us clear some more concepts.
Steps to select only those rows from a dataframe, where a specific columns contains the NaN values are as follows,
 Step 1: Select the dataframe column ‘H’ as a Series using the [] operator i.e. df[‘H’].
 Step 2: Then Call the isnull() function of Series object like df[‘H’].isnull(). It returns a same sized bool series containing True or False. The True value in bool Series indicates that corresponding value in df[‘H’] is NaN, whereas False indicates the non NaN value.
 Step 3: Select the dataframe column ‘I’ as a Series using the [] operator i.e. df[‘I’].
 Step 4 Then Call the isnull() function of Series object like df[‘I’].isnull(). It returns a same sized bool series containing True or False. The True value in bool Series indicates that corresponding value in df[‘I’] is NaN, whereas False indicates the non NaN value.
 Step 5: Apply the AND operator on both the bool series created in step 2 and step 4 i.e. (df[‘H’].isnull() & df[‘I’].isnull() ). It returns a final bool series. True value in this final bool Series indicates that the corresponding values in column ‘H’ and ‘I’ are NaN, where as False indicates the nonNaN value in any one of the two columns for that row.
 Pass this final bool series to [] operator of dataframe like df[df[‘H’].isnull() & df[‘I’].isnull()]. It will return only those rows where the values in both the columns ‘H’ and ‘I’ are NaN.
Let’s see a complete example,
import pandas as pd import numpy as np # List of Tuples empoyees = [('Jack', np.NaN, 34, 'Sydney', np.NaN, 5, np.NaN, np.NaN, np.NaN), ('Riti', np.NaN, 31, 'Delhi' , np.NaN, 7, np.NaN, np.NaN, np.NaN), ('Aadi', np.NaN, 16, 'London', np.NaN, 11, np.NaN, 3, np.NaN), ('Smriti', 12 , 16, 'London', 10, 11, 9, np.NaN, 11), ('Shaun', 23 , 18, 'London', 11, 12, 13, 13, 14), ('Aadi', np.NaN, 16, 'London', np.NaN, 11, np.NaN, 3, np.NaN), ('Mark', np.NaN, 41, 'Delhi' , np.NaN, 12, np.NaN, 11, 1)] # Create a DataFrame object df = pd.DataFrame( empoyees, columns=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']) print("Contents of the Dataframe : ") print(df) # Select rows where column 'H' & 'I' have NaN value selected_rows = df[df['H'].isnull() & df['I'].isnull()] print('Selected rows') print(selected_rows)
Output:
A B C D E F G H I 0 Jack NaN 34 Sydney NaN 5 NaN NaN NaN 1 Riti NaN 31 Delhi NaN 7 NaN NaN NaN
Summary:
We learned about the different ways to select rows from a dataframe where a single or multiple columns contain NaN or missing values.
Pandas Tutorials Learn Data Analysis with Python

Pandas Tutorial Part #1  Introduction to Data Analysis with Python

Pandas Tutorial Part #2  Basics of Pandas Series

Pandas Tutorial Part #3  Get & Set Series values

Pandas Tutorial Part #4  Attributes & methods of Pandas Series

Pandas Tutorial Part #5  Add or Remove Pandas Series elements

Pandas Tutorial Part #6  Introduction to DataFrame

Pandas Tutorial Part #7  DataFrame.loc[]  Select Rows / Columns by Indexing

Pandas Tutorial Part #8  DataFrame.iloc[]  Select Rows / Columns by Label Names

Pandas Tutorial Part #9  Filter DataFrame Rows

Pandas Tutorial Part #10  Add/Remove DataFrame Rows & Columns

Pandas Tutorial Part #11  DataFrame attributes & methods

Pandas Tutorial Part #12  Handling Missing Data or NaN values

Pandas Tutorial Part #13  Iterate over Rows & Columns of DataFrame

Pandas Tutorial Part #14  Sorting DataFrame by Rows or Columns

Pandas Tutorial Part #15  Merging or Concatenating DataFrames

Pandas Tutorial Part #16  DataFrame GroupBy explained with examples
Are you looking to make a career in Data Science with Python?
Data Science is the future, and the future is here now. Data Scientists are now the most soughtafter professionals today. To become a good Data Scientist or to make a career switch in Data Science one must possess the right skill set. We have curated a list of Best Professional Certificate in Data Science with Python. These courses will teach you the programming tools for Data Science like Pandas, NumPy, Matplotlib, Seaborn and how to use these libraries to implement Machine learning models.
Checkout the Detailed Review of Best Professional Certificate in Data Science with Python.
Remember, Data Science requires a lot of patience, persistence, and practice. So, start learning today.