This tutorial discusses different ways to replace a value in a pandas DataFrame column.
Introduction
Suppose we have a DataFrame,
     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4
We want to replace the value ‘US’ with the string ‘United States’ in the column ‘Location’ of the DataFrame. Like this,
     Name       Location    Team  Experience
0    Mark  United States    Tech           5
1    Riti          India    Tech           7
2  Shanky          India     PMO           2
3  Shreya          India  Design           2
4    Aadi  United States    Tech          11
5     Sim  United States    Tech           4
Let’s see different ways to do that.
Replace values in DataFrame Column Using replace() method
Select the column and call the replace() method on it, passing the value to be replaced and the replacement value as arguments. It returns a copy of the selected column with the value replaced. Then assign that modified column back to the original column of the DataFrame. The syntax is like this,
# Replace value 'to_be_replaced' with 'replacement_value' in column 'column_name'
df[column_name] = df[column_name].replace(to_be_replaced, replacement_value)
For example, to replace the value ‘US’ with ‘United States’ in Column ‘Location’ in the DataFrame, use the following syntax,
# Replace value 'US' with 'United States' in Column 'Location'
df['Location'] = df['Location'].replace('US', 'United States')
The DataFrame contents will be like,
     Name       Location    Team  Experience
0    Mark  United States    Tech           5
1    Riti          India    Tech           7
2  Shanky          India     PMO           2
3  Shreya          India  Design           2
4    Aadi  United States    Tech          11
5     Sim  United States    Tech           4
It replaces all occurrences of ‘US’ with ‘United States’ in the column ‘Location’ and leaves every other value unchanged.
Let’s see the complete example,
import pandas as pd

# List of tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'India', 'Tech', 7),
             ('Shanky', 'India', 'PMO', 2),
             ('Shreya', 'India', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

print(df)

# Replace value 'US' with 'United States' in Column 'Location'
df['Location'] = df['Location'].replace('US', 'United States')

print(df)
Output
     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4
     Name       Location    Team  Experience
0    Mark  United States    Tech           5
1    Riti          India    Tech           7
2  Shanky          India     PMO           2
3  Shreya          India  Design           2
4    Aadi  United States    Tech          11
5     Sim  United States    Tech           4
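If several values need replacing at once, replace() on a column also accepts a dictionary of old-to-new values. The following is only a minimal sketch: the small DataFrame and the replacement 'IN' for 'India' are made up for illustration and are not part of the example above.

```python
import pandas as pd

# Hypothetical sample data, mirroring the 'Location' column used above
df = pd.DataFrame({'Location': ['US', 'India', 'India', 'US']})

# Each dictionary key is an old value, each dictionary value its replacement
# ('IN' is an assumed replacement, chosen only for this sketch)
df['Location'] = df['Location'].replace({'US': 'United States',
                                         'India': 'IN'})

print(df['Location'].tolist())
```

Any value that does not appear as a key in the dictionary is left as it is.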
Replace values in DataFrame Column Using map() method
Select the DataFrame column and call the map() function on it with a mapping dictionary. This dictionary contains the mapping of old values to new values. Every value in the column that matches a key in the dictionary is replaced by the corresponding value in the dictionary. For column values that don't exist in the dictionary as a key, the replacement value will be NaN.
For example, to replace the value ‘US’ with ‘United States’ in Column ‘Location’ in the DataFrame, use the following syntax,
# Replace value 'US' with 'United States' in Column 'Location'
df['Location'] = df['Location'].map({'US': 'United States'})
Output:
     Name       Location    Team  Experience
0    Mark  United States    Tech           5
1    Riti            NaN    Tech           7
2  Shanky            NaN     PMO           2
3  Shreya            NaN  Design           2
4    Aadi  United States    Tech          11
5     Sim  United States    Tech           4
We created a dictionary mapping the old value to the new value, and passed it to the map() method of the DataFrame column (a Series). It replaced all occurrences of ‘US’ with ‘United States’, but all other values in the column became NaN, because they have no matching key in the dictionary.
Let’s see the complete example,
import pandas as pd

# List of tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'India', 'Tech', 7),
             ('Shanky', 'India', 'PMO', 2),
             ('Shreya', 'India', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

print(df)

# Replace value 'US' with 'United States' in Column 'Location'
df['Location'] = df['Location'].map({'US': 'United States'})

print(df)
Output
     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4
     Name       Location    Team  Experience
0    Mark  United States    Tech           5
1    Riti            NaN    Tech           7
2  Shanky            NaN     PMO           2
3  Shreya            NaN  Design           2
4    Aadi  United States    Tech          11
5     Sim  United States    Tech           4
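If the NaN values are not wanted, one possible workaround is to fill them back in from the original column with fillna(). This is a sketch under the assumption that unmatched values should stay unchanged; the small DataFrame is a hypothetical stand-in for the one used above.

```python
import pandas as pd

# Hypothetical sample data, mirroring the 'Location' column used above
df = pd.DataFrame({'Location': ['US', 'India', 'India', 'US']})

# map() yields NaN for every value that is not a key in the dictionary
mapped = df['Location'].map({'US': 'United States'})

# Fill those NaN entries with the original values at the same index
df['Location'] = mapped.fillna(df['Location'])

print(df['Location'].tolist())
```

Note that this also restores any value that was already NaN in the original column, since fillna() simply copies the original entry back.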
Summary
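We learned about two different ways to replace values in a Pandas DataFrame column: replace(), which leaves all other values unchanged, and map(), which turns values missing from the dictionary into NaN. Thanks.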