Pandas: Drop Rows with All NaN values

In this article, we will discuss how to delete the rows of a dataframe in which every value is NaN (missing).

We are going to use the pandas dropna() function, so let’s start with a brief overview of it.

Overview of dataframe.dropna() function

Pandas provides a function to delete rows or columns from a dataframe based on the NaN or missing values in it:

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

Arguments:

  • axis: Default – 0
    • 0, or ‘index’ : Drop rows which contain NaN values.
    • 1, or ‘columns’ : Drop columns which contain NaN values.
  • how: Default – ‘any’
    • ‘any’ : Drop rows / columns which contain at least one NaN value.
    • ‘all’ : Drop rows / columns in which all values are NaN.
  • thresh (int): Optional
    • Keep only the rows / columns that have at least thresh non-NaN values; the rest are deleted.
  • subset: Optional
    • Labels along the other axis to check for NaN values, e.g. the columns to consider when dropping rows.
  • inplace (bool): Default – False
    • If True, modifies the calling dataframe object in place.

Returns

  • If inplace==True, it returns None; otherwise it returns a new dataframe with the rows/columns dropped based on NaN values.
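To make the difference between the how and thresh arguments concrete, here is a minimal sketch. The dataframe named sample and its values are made up purely for illustration and are not part of the article's example.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so that each option gives a different result
sample = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [2.0, np.nan, 5.0, 6.0],
    "c": [3.0, np.nan, np.nan, np.nan],
})

# how='any': drop every row that has at least one NaN
any_dropped = sample.dropna(axis=0, how="any")

# how='all': drop only the rows in which every value is NaN
all_dropped = sample.dropna(axis=0, how="all")

# thresh=2: keep only the rows with at least 2 non-NaN values
thresh_kept = sample.dropna(axis=0, thresh=2)

print(any_dropped.index.tolist())   # [0]
print(all_dropped.index.tolist())   # [0, 2, 3]
print(thresh_kept.index.tolist())   # [0, 3]
```

Note that how and thresh are passed in separate calls here; recent pandas versions do not accept both in the same call.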

Let’s use this to perform our task of deleting rows with all NaN values.

Pandas: Delete rows of dataframe with all NaN values

Suppose we have a dataframe that contains a few rows with all NaN values:

Contents of the Dataframe :
      0     1       2     3
0  Jack  34.0  Sydney   5.0
1  Riti  31.0   Delhi   NaN
2   NaN   NaN     NaN   NaN
3  Aadi  16.0  London  11.0
4  Mark   NaN   Delhi  12.0
5   NaN   NaN     NaN   NaN

Now we want to delete all the rows of this dataframe that contain only NaN values (the rows with index 2 and 5). The new dataframe should look like this:

      0     1       2     3
0  Jack  34.0  Sydney   5.0
1  Riti  31.0   Delhi   NaN
3  Aadi  16.0  London  11.0
4  Mark   NaN   Delhi  12.0

For this we can use the pandas dropna() function. It can delete the rows or columns of a dataframe that contain all or some NaN values. As we want to delete the rows that contain only NaN values, we will pass the following arguments to it:

# Drop rows which contain all NaN values
df = df.dropna(axis=0, how='all')

  • axis=0 : Operate on rows, i.e. drop rows (not columns) based on their NaN values.
  • how=’all’ : Drop a row only if all of its values are NaN.

It returns a new dataframe with the all-NaN rows removed, which we then assign back to the same variable.
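As an alternative to reassigning the result, the inplace argument described in the overview can be used so that the existing dataframe is modified directly. The following is only a small sketch with a shortened, made-up dataframe:

```python
import numpy as np
import pandas as pd

# Shortened illustrative data: the middle row is entirely NaN
df = pd.DataFrame([("Jack", 34, "Sydney", 5),
                   (np.nan, np.nan, np.nan, np.nan),
                   ("Aadi", 16, "London", 11)])

# Modify df itself instead of assigning the returned dataframe
df.dropna(axis=0, how="all", inplace=True)

print(df)
```

Both styles remove the same rows; reassigning the result is usually easier to read when several operations are chained together.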

Check out the complete example below:

import pandas as pd
import numpy as np

# List of tuples; np.nan marks the missing values
employees = [('Jack', 34,     'Sydney', 5),
             ('Riti', 31,     'Delhi',  np.nan),
             (np.nan, np.nan, np.nan,   np.nan),
             ('Aadi', 16,     'London', 11),
             ('Mark', np.nan, 'Delhi',  12),
             (np.nan, np.nan, np.nan,   np.nan)]

# Create a DataFrame object
df = pd.DataFrame(employees)

print("Contents of the Dataframe : ")
print(df)

# Drop rows which contain all NaN values
df = df.dropna(axis=0, how='all')

print("Modified Dataframe : ")
print(df)

Output:

Contents of the Dataframe :
      0     1       2     3
0  Jack  34.0  Sydney   5.0
1  Riti  31.0   Delhi   NaN
2   NaN   NaN     NaN   NaN
3  Aadi  16.0  London  11.0
4  Mark   NaN   Delhi  12.0
5   NaN   NaN     NaN   NaN

Modified Dataframe :
      0     1       2     3
0  Jack  34.0  Sydney   5.0
1  Riti  31.0   Delhi   NaN
3  Aadi  16.0  London  11.0
4  Mark   NaN   Delhi  12.0

It deleted the rows with index 2 and 5 from the dataframe, because all of their values were NaN.
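To check which rows qualify as all-NaN before dropping them, a boolean mask can be built with isna() and all(). The sketch below reuses the article's data; the variable names all_nan_rows, filtered and renumbered are only illustrative. It also shows one optional way to renumber the remaining rows, since dropna() keeps the original index labels (0, 1, 3, 4).

```python
import numpy as np
import pandas as pd

# Same data as in the example above
employees = [("Jack", 34, "Sydney", 5),
             ("Riti", 31, "Delhi", np.nan),
             (np.nan, np.nan, np.nan, np.nan),
             ("Aadi", 16, "London", 11),
             ("Mark", np.nan, "Delhi", 12),
             (np.nan, np.nan, np.nan, np.nan)]
df = pd.DataFrame(employees)

# True for each row in which every value is NaN
all_nan_rows = df.isna().all(axis=1)
print(df.index[all_nan_rows.to_numpy()].tolist())

# Keeping the remaining rows selects the same rows as dropna(how='all')
filtered = df[~all_nan_rows]
print(filtered)

# Optional: renumber the remaining rows from 0
renumbered = filtered.reset_index(drop=True)
print(renumbered)
```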
