This tutorial discusses different ways to replace NaN values with a given string in a Pandas DataFrame.
Preparing DataSet
Let’s create a DataFrame with four columns and six rows. This DataFrame contains some NaN values.
import pandas as pd
import numpy as np

# List of tuples (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
employees = [('Mark',   'US',    'Tech',   5),
             ('Riti',   'India', 'Tech',   7),
             (np.nan,   np.nan,  'PMO',    np.nan),
             ('Shreya', 'India', 'Design', 2),
             (np.nan,   'US',    np.nan,   11),
             ('Sim',    np.nan,  np.nan,   4)]

# Create a DataFrame object from the list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

print(df)
Output
     Name Location    Team  Experience
0    Mark       US    Tech         5.0
1    Riti    India    Tech         7.0
2     NaN      NaN     PMO         NaN
3  Shreya    India  Design         2.0
4     NaN       US     NaN        11.0
5     Sim      NaN     NaN         4.0
Now we want to replace the NaN values in all the columns of this DataFrame with a given string. There are different ways to do this. Let’s discuss them one by one.
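Before replacing anything, it can help to confirm where the missing values are. The following is a minimal sketch, assuming the same data as the DataFrame above; the variable names nan_counts and total_nans are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created above
employees = [('Mark',   'US',    'Tech',   5),
             ('Riti',   'India', 'Tech',   7),
             (np.nan,   np.nan,  'PMO',    np.nan),
             ('Shreya', 'India', 'Design', 2),
             (np.nan,   'US',    np.nan,   11),
             ('Sim',    np.nan,  np.nan,   4)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

# Number of NaN values in each column
nan_counts = df.isna().sum()
print(nan_counts)

# Total number of NaN values in the whole DataFrame
total_nans = int(nan_counts.sum())
print(total_nans)
```

Running the same check after a replacement is a quick way to verify that no NaN values are left.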
Replace NaN with a string using fillna()
In Pandas, a DataFrame has a function fillna(value) that replaces all NaN values in the DataFrame with the given value. To replace all NaNs with a string, call the fillna() function and pass the string as the value parameter. Also pass inplace=True as a keyword argument, so that fillna() modifies the DataFrame in place instead of returning a new one.
replacementStr = "Missing"

# Replace NaN with the given string in the whole DataFrame
df.fillna(value=replacementStr, inplace=True)

print(df)
Output
      Name Location     Team Experience
0     Mark       US     Tech        5.0
1     Riti    India     Tech        7.0
2  Missing  Missing      PMO    Missing
3   Shreya    India   Design        2.0
4  Missing       US  Missing       11.0
5      Sim  Missing  Missing        4.0
It replaced all the NaN values with the string “Missing” in all the columns of the DataFrame.
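If the original DataFrame should stay untouched, fillna() can also be called without inplace=True, in which case it returns a new DataFrame. The sketch below is a variation on the example above, not a replacement for it. It additionally casts the DataFrame to object dtype first, on the assumption that putting a string into the numeric 'Experience' column should not depend on implicit dtype changes, which some recent Pandas versions may warn about for in-place fills. The name filled is only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created above
employees = [('Mark',   'US',    'Tech',   5),
             ('Riti',   'India', 'Tech',   7),
             (np.nan,   np.nan,  'PMO',    np.nan),
             ('Shreya', 'India', 'Design', 2),
             (np.nan,   'US',    np.nan,   11),
             ('Sim',    np.nan,  np.nan,   4)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

# Cast every column to object so that strings and numbers can share a column,
# then fill the NaN values. Without inplace=True a new DataFrame is returned.
filled = df.astype(object).fillna("Missing")

print(filled)

# The original DataFrame still contains its NaN values
print(df.isna().sum().sum())
```

The trade-off is that 'Experience' is no longer a numeric column in the result, so numeric operations on it would need the placeholder strings handled first.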
Replace NaN with a string using replace()
A Pandas DataFrame provides a function replace() that replaces all occurrences of a given value with a replacement value. To replace all occurrences of NaN with a string, pass both NaN and the replacement string as arguments to the replace() function. Also pass inplace=True, so that the DataFrame is modified in place. Run this on a freshly created DataFrame, because the fillna() call in the previous section has already removed the NaN values from df.
replacementStr = "Missing"

# Replace NaN with a string in the whole DataFrame
df.replace(np.nan, replacementStr, inplace=True)

print(df)
Output
      Name Location     Team Experience
0     Mark       US     Tech        5.0
1     Riti    India     Tech        7.0
2  Missing  Missing      PMO    Missing
3   Shreya    India   Design        2.0
4  Missing       US  Missing       11.0
5      Sim  Missing  Missing        4.0
It replaced all the NaN values with the string “Missing” in all the columns of the DataFrame.
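To check that both approaches lead to the same result, the two calls can be run side by side on copies of the same data. This is a minimal sketch under the same assumptions as before: the DataFrame is cast to object dtype first, neither call uses inplace=True, and the names via_fillna and via_replace are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created above
employees = [('Mark',   'US',    'Tech',   5),
             ('Riti',   'India', 'Tech',   7),
             (np.nan,   np.nan,  'PMO',    np.nan),
             ('Shreya', 'India', 'Design', 2),
             (np.nan,   'US',    np.nan,   11),
             ('Sim',    np.nan,  np.nan,   4)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

# Work on an object-dtype copy so strings can be stored in every column
obj_df = df.astype(object)

# Both calls return new DataFrames and leave obj_df unchanged
via_fillna = obj_df.fillna("Missing")
via_replace = obj_df.replace(np.nan, "Missing")

# Compare the two results cell by cell
same_values = via_fillna.values.tolist() == via_replace.values.tolist()
print(same_values)
```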
Summary
We learned two different ways to replace NaN with a given string in a complete DataFrame in Pandas.