Replace NaN with zero in multiple columns in Pandas

This tutorial will discuss about different ways to replace NaN with zero in multiple columns in pandas.

Table Of Contents

Introduction

Suppose we have a DataFrame with several missing or NaN values, in almost all the columns. Like this,

     Name  Level_1 Score  Level_2 Score
0    Mark          123.0          789.0
1     NaN            NaN            NaN
2     NaN          132.0            NaN
3  Shreya          789.0            NaN
4    Aadi            NaN          666.0
5     Sim          890.0            NaN

We want to replace NaN values with 0, in some specific columns of this DataFrame. For example, in columns, “Level_1 Score” and “Level_2 Score” only. Like this,

     Name  Level_1 Score  Level_2 Score
0    Mark          123.0          789.0
1     NaN            0.0            0.0
2     NaN          132.0            0.0
3  Shreya          789.0            0.0
4    Aadi            0.0          666.0
5     Sim          890.0            0.0

We don’t want to touch the remaining columns. Let’s see how to this

Preparing DataSet

Let’s create a DataFrame with missing or NaN values scattered accross all columns.

import pandas as pd
import numpy as np

# List of Tuples
players = [ ('Mark',   123,     789),
            ( np.NaN,  np.NaN,  np.NaN),
            ( np.NaN,  132,     np.NaN),
            ('Shreya', 789,     np.NaN),
            ('Aadi',   np.NaN,  666),
            ('Sim',    890,     np.NaN)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(players,
                  columns=['Name', 'Level_1 Score', 'Level_2 Score'])

print(df)

Output

     Name  Level_1 Score  Level_2 Score
0    Mark          123.0          789.0
1     NaN            NaN            NaN
2     NaN          132.0            NaN
3  Shreya          789.0            NaN
4    Aadi            NaN          666.0
5     Sim          890.0            NaN

Now Let’s replace NaN values with 0 in multiple columns i.e. in columns ‘Level_1 Score’ and ‘Level_2 Score’.

Using apply() to replace NaN with zero in multiple columns

Steps are,

  • Select the specified columns from DataFrame as a sub DataFrame. The call the apply() function on it.
  • Pass a lambda function in the apply() function. It will apply this function on each column of the sub DataFrame i.e. on selected columns from the main DataFrame.
  • Inside this Lambda Function, replace all the NaN values in the given column with zero.
  • Assign these modified copy of columns, back to the original DataFrame.

Syntax is:

df[['column_name_1','column_name2']] = \
    df[['column_name_1','column_name2']].apply(lambda col: col.replace(np.NaN, 0))

Let’s use this logic to replace all the NaNs values in columns ‘Level_1 Score’ and ‘Level_2 Score’ with zero.

# replace all NaN values in Columns
# 'Level_1 Score' and 'Level_2 Score' with zero
df[['Level_1 Score', 'Level_2 Score']] = \
        df[['Level_1 Score', 'Level_2 Score']].apply(lambda x: x.replace(np.nan, 0))

print(df)

Output

     Name  Level_1 Score  Level_2 Score
0    Mark          123.0          789.0
1     NaN            0.0            0.0
2     NaN          132.0            0.0
3  Shreya          789.0            0.0
4    Aadi            0.0          666.0
5     Sim          890.0            0.0

All NaN values in multiple columns i.e. ‘Level_1 Score’ and ‘Level_2 Score’ are replaced by 0.

Summary

We learned about a way to replace NaN values with zeros in multiple columns Columns of a DataFrame. Thanks.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top