Pandas: DataFrame.fillna()

In this article we will discuss how to use the DataFrame.fillna() method, with examples such as replacing NaN values in a complete dataframe or only in specific rows or columns.

Syntax of DataFrame.fillna()

In pandas, the DataFrame provides a method fillna() to fill the missing (NaN) values in a DataFrame.

fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)

Let us look at the different arguments passed in this method.

Arguments:

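  • value: Value to use to fill holes (e.g. 0).
    • Alternatively, a dictionary / Series / DataFrame of values specifying which value to use for each index (for a Series) or column (for a DataFrame).
  • method: {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
    • Method to use for filling holes. ‘pad’ / ‘ffill’ propagates the last valid observation forward; ‘backfill’ / ‘bfill’ uses the next valid observation to fill the gap. Recent pandas versions deprecate this argument in favor of the separate ffill() and bfill() methods.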
  • axis: {0 or ‘index’, 1 or ‘columns’}
    • Axis along which to fill missing values.
  • inplace: bool, default False
    • If True, fill in place. Note: this will modify any other views on this object.
  • limit: int, default None
    • If a method is specified, this is the maximum number of consecutive NaN values to forward/backward fill, so a gap with more than this number of consecutive NaNs is only partially filled. If no method is specified, this is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.
  • downcast: dict, default is None
    • A dict of item->dtype of what to downcast if possible, or the string ‘infer’ which will try to downcast to an appropriate equal type (e.g. float64 to int64 if possible).
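
As a rough illustration of the arguments above, the following sketch uses a small made-up frame (the name `demo` and its data are assumptions for this example, not part of the article's dataframe). It shows a dictionary passed as `value`, the effect of `limit` without a method, and forward filling through the separate `ffill()` method.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not the article's dataframe)
demo = pd.DataFrame({'A': [1.0, np.nan, np.nan, 4.0],
                     'B': [np.nan, 2.0, np.nan, np.nan]})

# value as a dict: a different fill value for each column
per_column = demo.fillna(value={'A': 0, 'B': -1})

# limit without a method: fill at most one NaN in each column
limited = demo.fillna(0, limit=1)

# Forward fill: propagate the last valid value down each column
forward = demo.ffill()

print(per_column)
print(limited)
print(forward)
```

Note that forward filling cannot fill a NaN that has no valid value before it, so the first entry of column 'B' stays NaN.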

Returns:

It returns a DataFrame with updated values if inplace=False, otherwise it returns None.

Now let us see some examples of fillna(),

Examples of fillna()

First we will create a dataframe from a dictionary,

import numpy as np
import pandas as pd

# A dictionary with lists as values
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
sample_dict = { 'S1': [10, 20, np.nan, np.nan],
                'S2': [5, np.nan, np.nan, 29],
                'S3': [15, 20, np.nan, np.nan],
                'S4': [21, 22, 23, 25],
                'Subjects': ['Hist', 'Finan', 'Maths', 'Geog']}

# Create a DataFrame from dictionary
df = pd.DataFrame(sample_dict)
# Set column 'Subjects' as Index of DataFrame
df = df.set_index('Subjects')

print(df)

Output:

            S1    S2    S3  S4
Subjects                      
Hist      10.0   5.0  15.0  21
Finan     20.0   NaN  20.0  22
Maths      NaN   NaN   NaN  23
Geog       NaN  29.0   NaN  25

Replace all NaNs in dataframe using fillna()

If we pass only the value argument to fillna(), it replaces every NaN in the dataframe with that value. For example,

# Replace all NaNs in dataframe with a value
new_df = df.fillna(11)

print(new_df)

Output:

            S1    S2    S3  S4
Subjects                      
Hist      10.0   5.0  15.0  21
Finan     20.0  11.0  20.0  22
Maths     11.0  11.0  11.0  23
Geog      11.0  29.0  11.0  25

Here we didn’t pass the inplace argument, so it returned a new dataframe with updated contents.
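
The fill value does not have to be a constant. As a minimal sketch, assuming the same sample data as above, a Series of column means can be passed as the value so that each column is filled with its own mean (the name `mean_filled` is only used for this illustration).

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataframe so this snippet runs on its own
df = pd.DataFrame({'S1': [10, 20, np.nan, np.nan],
                   'S2': [5, np.nan, np.nan, 29],
                   'S3': [15, 20, np.nan, np.nan],
                   'S4': [21, 22, 23, 25]},
                  index=pd.Index(['Hist', 'Finan', 'Maths', 'Geog'], name='Subjects'))

# df.mean() returns a Series indexed by column name, so the NaNs of
# each column are replaced by the mean of that same column
mean_filled = df.fillna(df.mean())
print(mean_filled)

# fillna() returned a new object, so the original still has its NaNs
print(df.isna().sum().sum())
```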

Pandas: Apply fillna() on a specific column

To fill NaN values only in the ‘S2’ column of the above dataframe, we select that column, call fillna() on it, and assign the result back. For example,

# Fill NaNs in column 'S2' of the DataFrame
df['S2'] = df['S2'].fillna(0)

print(df)

Output:

            S1    S2    S3  S4
Subjects                      
Hist      10.0   5.0  15.0  21
Finan     20.0   0.0  20.0  22
Maths      NaN   0.0   NaN  23
Geog       NaN  29.0   NaN  25

Here all the NaN values in the S2 column have been replaced with the value passed as the ‘value’ argument of fillna(). Note that fillna() returns a new object by default, so we assign the result back to the column to make the change permanent. Calling df['S2'].fillna(0, inplace=True) on a selected column is less reliable: recent pandas versions warn about it, and with Copy-on-Write enabled it leaves the original dataframe unchanged.
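
The same idea extends to several columns at once. The sketch below, which assumes the original sample data and uses the illustrative names `cols` and `filled`, fills NaNs only in columns S1 and S3 and leaves S2 untouched.

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataframe so this snippet runs on its own
df = pd.DataFrame({'S1': [10, 20, np.nan, np.nan],
                   'S2': [5, np.nan, np.nan, 29],
                   'S3': [15, 20, np.nan, np.nan],
                   'S4': [21, 22, 23, 25]},
                  index=pd.Index(['Hist', 'Finan', 'Maths', 'Geog'], name='Subjects'))

# Work on a copy so the original dataframe is left as it was
filled = df.copy()

# Fill NaNs only in the selected columns and assign the result back
cols = ['S1', 'S3']
filled[cols] = filled[cols].fillna(0)

print(filled)
```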

Pandas: fillna with another column

We can replace the NaN values of one column with the values of another column by passing the other column as the ‘value’ argument. The values are matched by index label, so each NaN in S3 gets the S4 value from the same row.
Here is how we can do that,

# Fill NaNs in column S3 with values in column S4
df['S3'] = df['S3'].fillna(value=df['S4'])

print(df)

Output:

            S1    S2    S3  S4
Subjects                      
Hist      10.0   5.0  15.0  21
Finan     20.0   0.0  20.0  22
Maths      NaN   0.0  23.0  23
Geog       NaN  29.0  25.0  25
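
When a Series is passed as the value, pandas matches fill values by index label rather than by position. A small sketch of this, using a made-up `fallback` Series whose labels are deliberately in a different order (both `s3` and `fallback` are names assumed for this example):

```python
import numpy as np
import pandas as pd

# Column with missing values, indexed by subject
s3 = pd.Series([15.0, 20.0, np.nan, np.nan],
               index=['Hist', 'Finan', 'Maths', 'Geog'])

# Hypothetical fallback values, listed in a different order
fallback = pd.Series([25, 23], index=['Geog', 'Maths'])

# Each NaN is filled with the fallback value that has the same label
result = s3.fillna(fallback)
print(result)
```

Labels that are missing from the fill Series would simply stay NaN.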

Pandas: Replace NaN values in a row

To replace NaN values in a row, we use .loc[‘index name’] to select the row, call fillna() on it, and assign the result back, i.e.

# Replace NaN values in row 'Maths'
df.loc['Maths'] = df.loc['Maths'].fillna(value=11)

print(df)

Output:

            S1    S2    S3    S4
Subjects                        
Hist      10.0   5.0  15.0  21.0
Finan     20.0   0.0  20.0  22.0
Maths     11.0   0.0  23.0  23.0
Geog       NaN  29.0  25.0  25.0

Here we assigned the updated row back to the dataframe, which makes the change permanent without relying on inplace=True.
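
Rows can also be filled from their own neighbouring values instead of a constant. As a sketch under the assumption that forward filling across columns is what is wanted, `ffill(axis=1)` copies the last valid value to the right within each row; the columns are converted to float first to keep a single dtype, and `row_filled` is a name used only for this example.

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataframe so this snippet runs on its own
df = pd.DataFrame({'S1': [10, 20, np.nan, np.nan],
                   'S2': [5, np.nan, np.nan, 29],
                   'S3': [15, 20, np.nan, np.nan],
                   'S4': [21, 22, 23, 25]},
                  index=pd.Index(['Hist', 'Finan', 'Maths', 'Geog'], name='Subjects'))

# Forward fill along each row: a NaN takes the value of the nearest
# valid column to its left, if there is one
row_filled = df.astype(float).ffill(axis=1)
print(row_filled)
```

A NaN with no valid value to its left, such as the leading entries of row 'Maths', is left as it is.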

Pandas: Replace NaN with random values

We can use functions from NumPy’s random module to fill the NaN values of a specific column with random values. A few functions that generate random numbers are:

  • randint(low, high=None, size=None, dtype=int)
    • Returns random integers from `low` (inclusive) to `high` (exclusive).
  • rand()
    • Returns random values from the interval [0, 1).
  • randn()
    • Returns a single float sampled from the normal distribution with mean 0 and variance 1 if no argument is provided.

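We will demonstrate one of these, randn(), on column S1, which still has a NaN in row 'Geog'.

# Replace NaN with a random value in column S1
df['S1'] = df['S1'].fillna(value=np.random.randn())

print(df)

Output (the value filled in for 'Geog' in column S1 is random, so it changes on every run and is shown here as a placeholder):

            S1    S2    S3    S4
Subjects                        
Hist      10.0   5.0  15.0  21.0
Finan     20.0   0.0  20.0  22.0
Maths     11.0   0.0  23.0  23.0
Geog  <random>  29.0  25.0  25.0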
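
Because randn() is called once, a column with several NaNs would receive the same random number in every gap. If a different value per NaN is wanted, one possible approach (a sketch, not the only way) is to draw as many numbers as there are missing entries and assign them through a boolean mask. The seed and the names `rng`, `mask` and `filled` are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs give the same numbers (illustrative choice)
rng = np.random.default_rng(seed=0)

# Column with two missing values
s1 = pd.Series([10.0, 20.0, np.nan, np.nan],
               index=['Hist', 'Finan', 'Maths', 'Geog'], name='S1')

# Boolean mask marking the missing entries
mask = s1.isna()

# Draw one standard-normal value per NaN and assign them in one step
filled = s1.copy()
filled[mask] = rng.standard_normal(int(mask.sum()))

print(filled)
```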

Conclusion:

So, this is how we can use the DataFrame.fillna() method to replace NaN values with custom values in a dataframe.
