This tutorial will discuss about unique ways to replace column values by condition in pandas.
Table Of Contents
Preparing DataSet
First we will create a DataFrame with four columns and six rows.
Let’s see the complete example,
import pandas as pd # List of Tuples employees= [('Mark', 'US', 'Tech', 5), ('Riti', 'UK', 'Tech' , 7), ('Shanky', 'UK', 'PMO' , 2), ('Shreya', 'UK', 'Design', 2), ('Aadi', 'US', 'Tech', 11), ('Sim', 'US', 'Tech', 4)] # Create a DataFrame object from list of tuples df = pd.DataFrame(employees, columns=['Name', 'Location', 'Team', 'Experience']) print(df)
Output
Name Location Team Experience 0 Mark US Tech 5 1 Riti UK Tech 7 2 Shanky UK PMO 2 3 Shreya UK Design 2 4 Aadi US Tech 11 5 Sim US Tech 4
Now we will discuss how to replace values in a column of this DataFrame based on certain conditions.
Replace DataFrame column values by condition using [] operator
Syntax is,
Frequently Asked:
df.loc[df[condition, column_name] = replacement_value
It will apply the given condition
on all values of column with name column_name
, and replace the values that satisfies the given condition, with the replacement_value
.
For example,
Let’s replace all the values in column Experience
, which are less than 3 by the value 1, and values which are greater than 3, replace them by 10.
# Replaces values less than 3, by the value 1 in column Experience df.loc[df['Experience'] < 3, 'Experience'] = 1 # Replaces values greater than 3, by value 10 in column Experience df.loc[df['Experience'] >= 3, 'Experience'] = 10 print(df)
It will replace the values in DataFrame based on the given conditions. Final DataFrame will be like,
Name Location Team Experience 0 Mark US Tech 10 1 Riti UK Tech 10 2 Shanky UK PMO 1 3 Shreya UK Design 1 4 Aadi US Tech 10 5 Sim US Tech 10
Replace DataFrame column values by condition using where()
Select the column, in which we need to replace values by condition. Call where()
function on that column, and pass condition
as first parameter, a default value
as second parameter.
The where() function will apply the given condition
on each element of column, and values for which condition
fails, will be replaced by the default value
.
# Replace values in column 'Experience', # which are not greater than 5, by the value 6 df['Experience'] = df['Experience'].where(df['Experience'] > 5, 6) print(df)
Output:
Name Location Team Experience 0 Mark US Tech 10 1 Riti UK Tech 10 2 Shanky UK PMO 6 3 Shreya UK Design 6 4 Aadi US Tech 10 5 Sim US Tech 10
It replaced all values in column ‘Experience’, which were less than value 5, by the replacement value 6.
Summary
We learned about different ways to replace column values by condition in Pandas.