This tutorial discusses different ways to replace multiple values in a pandas DataFrame column.
Introduction
Suppose we have a DataFrame,
     Name  Location    Team  Experience
0    Mark        US    Tech           5
1    Riti        UK    Tech           7
2  Shanky        UK     PMO           2
3  Shreya        UK  Design           2
4    Aadi        US    Tech          11
5     Sim        US    Tech           4
We want to replace multiple values in the Location column of this DataFrame:
* Replace ‘US’ with the string ‘United States’.
* Replace ‘UK’ with the string ‘United Kingdom’.
Like this,
     Name        Location    Team  Experience
0    Mark   United States    Tech           5
1    Riti  United Kingdom    Tech           7
2  Shanky  United Kingdom     PMO           2
3  Shreya  United Kingdom  Design           2
4    Aadi   United States    Tech          11
5     Sim   United States    Tech           4
Let’s see different ways to do that.
Method 1: Using replace() method
Select the DataFrame column Location and call the replace() function on it, passing the following arguments:
- A list of the values that need to be replaced. For example, ['US', 'UK']
- A list of the replacement strings, in the same order. For example, ['United States', 'United Kingdom']
Syntax is,
# Replace 'US' with 'United States' in column 'Location'
# Replace 'UK' with 'United Kingdom' in column 'Location'
df['Location'] = df['Location'].replace(
    ['US', 'UK'],
    ['United States', 'United Kingdom'])
Output:
     Name        Location    Team  Experience
0    Mark   United States    Tech           5
1    Riti  United Kingdom    Tech           7
2  Shanky  United Kingdom     PMO           2
3  Shreya  United Kingdom  Design           2
4    Aadi   United States    Tech          11
5     Sim   United States    Tech           4
It replaced all occurrences of US with United States, and all occurrences of UK with United Kingdom, in the DataFrame column Location.
Let’s see the complete example,
import pandas as pd

# List of tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'UK', 'Tech', 7),
             ('Shanky', 'UK', 'PMO', 2),
             ('Shreya', 'UK', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from the list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

# Replace 'US' with 'United States' in column 'Location'
# Replace 'UK' with 'United Kingdom' in column 'Location'
df['Location'] = df['Location'].replace(
    ['US', 'UK'],
    ['United States', 'United Kingdom'])
print(df)
Output
     Name  Location    Team  Experience
0    Mark        US    Tech           5
1    Riti        UK    Tech           7
2  Shanky        UK     PMO           2
3  Shreya        UK  Design           2
4    Aadi        US    Tech          11
5     Sim        US    Tech           4
     Name        Location    Team  Experience
0    Mark   United States    Tech           5
1    Riti  United Kingdom    Tech           7
2  Shanky  United Kingdom     PMO           2
3  Shreya  United Kingdom  Design           2
4    Aadi   United States    Tech          11
5     Sim   United States    Tech           4
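As a variation on Method 1, replace() also accepts a single dictionary that maps each old value to its replacement, which keeps each pair together instead of splitting it across two lists. The following is a minimal sketch on a shortened copy of the sample data; the variable name replacements is chosen here only for illustration.

```python
import pandas as pd

# Shortened copy of the sample data used in this tutorial
df = pd.DataFrame(
    [('Mark', 'US', 'Tech', 5),
     ('Riti', 'UK', 'Tech', 7),
     ('Shanky', 'UK', 'PMO', 2)],
    columns=['Name', 'Location', 'Team', 'Experience'])

# Hypothetical name: maps each old value to its replacement
replacements = {'US': 'United States', 'UK': 'United Kingdom'}

# Values that are not keys in the dictionary are left unchanged
df['Location'] = df['Location'].replace(replacements)

print(df)
```

The dictionary form can be easier to maintain when the list of replacements grows, since each old value sits next to its new value.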
Method 2: Using map() method
Create a dictionary of key-value pairs, where:
- The keys are the values that need to be replaced in the column.
- The values are the corresponding replacement values.
For example,
replacementDict = {'US': 'United States',
                   'UK': 'United Kingdom'}
Select the DataFrame column Location and call the map() function on it. Pass the dictionary created above as the argument, and assign the modified column back to the DataFrame.
# Replace 'US' with 'United States' in column 'Location'
# Replace 'UK' with 'United Kingdom' in column 'Location'
df['Location'] = df['Location'].map(replacementDict)
Output:
     Name        Location    Team  Experience
0    Mark   United States    Tech           5
1    Riti  United Kingdom    Tech           7
2  Shanky  United Kingdom     PMO           2
3  Shreya  United Kingdom  Design           2
4    Aadi   United States    Tech          11
5     Sim   United States    Tech           4
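It replaces all occurrences of US with United States, and all occurrences of UK with United Kingdom, in the Location column of the DataFrame. Note that map() returns NaN for any value that is not a key in the dictionary, so every value present in the column needs an entry.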
Let’s see the complete example,
import pandas as pd

# List of tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'UK', 'Tech', 7),
             ('Shanky', 'UK', 'PMO', 2),
             ('Shreya', 'UK', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from the list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

replacementDict = {'US': 'United States',
                   'UK': 'United Kingdom'}

# Replace 'US' with 'United States' in column 'Location'
# Replace 'UK' with 'United Kingdom' in column 'Location'
df['Location'] = df['Location'].map(replacementDict)
print(df)
Output
     Name  Location    Team  Experience
0    Mark        US    Tech           5
1    Riti        UK    Tech           7
2  Shanky        UK     PMO           2
3  Shreya        UK  Design           2
4    Aadi        US    Tech          11
5     Sim        US    Tech           4
     Name        Location    Team  Experience
0    Mark   United States    Tech           5
1    Riti  United Kingdom    Tech           7
2  Shanky  United Kingdom     PMO           2
3  Shreya  United Kingdom  Design           2
4    Aadi   United States    Tech          11
5     Sim   United States    Tech           4
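One difference between the two methods is worth seeing in code: map() turns any value that is missing from the dictionary into NaN, whereas replace() leaves it untouched. The sketch below assumes a hypothetical extra location 'IN' that has no dictionary entry, and shows one possible way to keep such values by filling the gaps from the original column with fillna().

```python
import pandas as pd

# 'IN' is a hypothetical value with no entry in the dictionary
locations = pd.Series(['US', 'UK', 'IN'], name='Location')

replacementDict = {'US': 'United States', 'UK': 'United Kingdom'}

# map() yields NaN for values that are not keys in the dictionary
mapped = locations.map(replacementDict)

# One way to keep unmapped values: fall back to the original column
mapped_keep = locations.map(replacementDict).fillna(locations)

# replace() leaves unmapped values as they are
replaced = locations.replace(replacementDict)

print(mapped)
print(mapped_keep)
print(replaced)
```

If the column may contain values outside the dictionary, replace() or the fillna() fallback is probably the safer choice; plain map() fits best when the dictionary is known to cover every value.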
Summary
We learned two ways to replace multiple values in a DataFrame column: the replace() method and the map() method. Thanks.