In this article, we will discuss various methods to replace the column values based on conditions in a pandas DataFrame. Let’s look at the table of contents describing the list of methods.
Table of Contents
Preparing DataSet
To quickly get started, let’s create a sample dataframe to experiment. We’ll use the pandas library with some random data.
import pandas as pd import numpy as np # List of Tuples employees= [('Shubham', 'India', 'Tech', 5, 4), ('Riti', 'India', 'Design' , 7, 7), ('Shanky', 'India', 'PMO' , 2, 2), ('Shreya', 'India', 'Design' , 2, 0), ('Aadi', 'US', 'PMO', 11, 5), ('Sim', 'US', 'Tech', 4, 4)] # Create a DataFrame object from list of tuples df = pd.DataFrame(employees, columns=['Name', 'Location', 'Team', 'Experience', 'RelevantExperience'], index = ['A', 'B', 'C', 'D', 'E', 'F']) print(df)
Contents of the created dataframe are,
Name Location Team Experience RelevantExperience A Shubham India Tech 5 4 B Riti India Design 7 7 C Shanky India PMO 2 2 D Shreya India Design 2 0 E Aadi US PMO 11 5 F Sim US Tech 4 4
Method 1: Using .loc property of DataFrame
The loc
property comes in handy whenever we want to filter the DataFrame based on certain conditions. Here, we will use the loc
to filter the DataFrame and assign a new value to the filtered rows. Let’s understand with an example, say, we need to replace the “Tech” value in the “Team” column with “Tech & Data”.
# replace Tech with "Tech & Data" using loc df.loc[df['Team'] == 'Tech', 'Team'] = 'Tech & Data' print (df)
Output
Name Location Team Experience RelevantExperience A Shubham India Tech & Data 5 4 B Riti India Design 7 7 C Shanky India PMO 2 2 D Shreya India Design 2 0 E Aadi US PMO 11 5 F Sim US Tech & Data 4 4
As observed, we first filtered all the rows satisfying the condition and then replaced the value by assigning the new value to the filtered rows.
Frequently Asked:
- Find max column value & return corresponding rows in Pandas
- Replace NaN with values from another DataFrame in Pandas
- Pandas : Convert a DataFrame into a list of rows or columns in python | (list of lists)
- Pandas Tutorial #16 – DataFrame GroupBy
Method 2: Using numpy.where() method
Another method is to use the numpy.where() function to replace values based on the condition. Let’s look at the function syntax and implement it in the above example.
np.where(condition, value if condition is TRUE, value if condition is False)
# replace Tech with "Tech & Data" using np.where df['Team'] = np.where(df['Team'] == 'Tech', 'Tech & Data', df['Team']) print (df)
Output
Name Location Team Experience RelevantExperience A Shubham India Tech & Data 5 4 B Riti India Design 7 7 C Shanky India PMO 2 2 D Shreya India Design 2 0 E Aadi US PMO 11 5 F Sim US Tech & Data 4 4
Method 3: Using DataFrame.where() method
There is a where() method in pandas DataFrame as well. Let’s look at the syntax and implementation here.
pd.DataFrame['column_name'].where(~(condition), other=value if condition is True, inplace=True)
Let’s implement it on the same example discussed above.
# replace Tech with "Tech & Data" using DataFrame.where df['Team'].where(~(df.Team == 'Tech'), other='Tech & Data', inplace=True) print(df)
Output
Name Location Team Experience RelevantExperience A Shubham India Tech & Data 5 4 B Riti India Design 7 7 C Shanky India PMO 2 2 D Shreya India Design 2 0 E Aadi US PMO 11 5 F Sim US Tech & Data 4 4
The inplace=True helps us store the changes directly in the same DataFrame.
Method 4: Using mask() function from pandas
The final method is to use the masking function from pandas which are generally used for replacing the values of any row/column based on certain conditions. Let’s implement using the above example.
# replace Tech with "Tech & Data" using masking df['Team'].mask(lambda col: col == 'Tech', 'Tech & Data', inplace=True) print (df)
Output
Name Location Team Experience RelevantExperience A Shubham India Tech & Data 5 4 B Riti India Design 7 7 C Shanky India PMO 2 2 D Shreya India Design 2 0 E Aadi US PMO 11 5 F Sim US Tech & Data 4 4
As observed, we have got a similar output as the above methods.
Summary
In this article, we have discussed how to replace column values based on conditions in pandas DataFrame. Thanks.