Replace column values by condition in Pandas

This tutorial will discuss about unique ways to replace column values by condition in pandas.

Table Of Contents

Preparing DataSet

First we will create a DataFrame with four columns and six rows.

Let’s see the complete example,

import pandas as pd

# List of Tuples
employees= [('Mark',   'US', 'Tech',   5),
            ('Riti',   'UK', 'Tech' ,  7),
            ('Shanky', 'UK', 'PMO' ,   2),
            ('Shreya', 'UK', 'Design', 2),
            ('Aadi',   'US', 'Tech',   11),
            ('Sim',    'US', 'Tech',   4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

print(df)

Output

     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti       UK    Tech           7
2  Shanky       UK     PMO           2
3  Shreya       UK  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4

Now we will discuss how to replace values in a column of this DataFrame based on certain conditions.

Replace DataFrame column values by condition using [] operator

Syntax is,

df.loc[df[condition, column_name] = replacement_value

It will apply the given condition on all values of column with name column_name, and replace the values that satisfies the given condition, with the replacement_value.

For example,

Let’s replace all the values in column Experience, which are less than 3 by the value 1, and values which are greater than 3, replace them by 10.

# Replaces values less than 3, by the value 1 in column Experience
df.loc[df['Experience'] < 3, 'Experience'] = 1

# Replaces values greater than 3, by value 10 in column Experience
df.loc[df['Experience'] >= 3, 'Experience'] = 10

print(df)

It will replace the values in DataFrame based on the given conditions. Final DataFrame will be like,

     Name Location    Team  Experience
0    Mark       US    Tech          10
1    Riti       UK    Tech          10
2  Shanky       UK     PMO           1
3  Shreya       UK  Design           1
4    Aadi       US    Tech          10
5     Sim       US    Tech          10

Replace DataFrame column values by condition using where()

Select the column, in which we need to replace values by condition. Call where() function on that column, and pass condition as first parameter, a default value as second parameter.

The where() function will apply the given condition on each element of column, and values for which condition fails, will be replaced by the default value.

# Replace values in column 'Experience',
# which are not greater than 5, by the value 6
df['Experience'] = df['Experience'].where(df['Experience'] > 5, 6)

print(df)

Output:

     Name Location    Team  Experience
0    Mark       US    Tech          10
1    Riti       UK    Tech          10
2  Shanky       UK     PMO           6
3  Shreya       UK  Design           6
4    Aadi       US    Tech          10
5     Sim       US    Tech          10

It replaced all values in column ‘Experience’, which were less than value 5, by the replacement value 6.

Summary

We learned about different ways to replace column values by condition in Pandas.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top