This article will discuss different ways to remove duplicate characters from a string in Python.
Suppose we have a string,
"Wakanda-Warrior"
We want to delete the duplicate characters from this string while keeping the remaining characters in their original order. The final string should be,
"Waknd-rio"
There are different ways to do this. Let’s discuss them one by one.
Remove Duplicate Characters from String using set() and sorted()
Pass the string to the set() function. It returns a set containing only the unique characters of the given string. Then sort this set with sorted(), passing the string's index() method as the key function. Because index() returns the position of a character's first occurrence, the unique characters are sorted by where they first appear in the original string. Finally, join the sorted characters and assign the result back to the original string variable. This way, you can remove duplicate characters from the string and keep the order as in the original string.
For Example,
strValue = "Wakanda-Warrior"

# Remove duplicate characters from string and keep the order
strValue = ''.join(sorted(set(strValue), key=strValue.index))

print(strValue)
Output
Waknd-rio
It deleted all the duplicate characters from the string.
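To see why sorting with index() as the key restores the original order, the following minimal sketch prints the first position of each unique character. The names first_positions and ordered are illustrative only and are not part of the example above.

```python
strValue = "Wakanda-Warrior"

# Map each unique character to the position of its first occurrence.
first_positions = {ch: strValue.index(ch) for ch in set(strValue)}

# Sorting the characters by that position restores left-to-right order.
ordered = sorted(first_positions, key=first_positions.get)

for ch in ordered:
    print(ch, first_positions[ch])

print(''.join(ordered))
```

Each call to index() scans the string from the start, so this approach is best suited to short strings.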
Remove Duplicate Characters from String using OrderedDict
Create an OrderedDict with the characters of the string as keys, using OrderedDict.fromkeys(). Dictionary keys are unique, so each character is stored only once, and an OrderedDict keeps its keys in the order in which they were first inserted. Then join the keys back into a string and assign the result to the original string variable. This way, we can remove duplicate characters from the string and keep the order as in the original string.
For Example,
from collections import OrderedDict

strValue = "Wakanda-Warrior"

# Remove duplicate characters from string and keep the order
strValue = ''.join(OrderedDict.fromkeys(strValue))

print(strValue)
Output
Waknd-rio
It deleted all the duplicate characters from the string.
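As a small illustration of the intermediate step, the sketch below inspects the OrderedDict before it is joined. The variable name unique_chars is an assumption made for this example.

```python
from collections import OrderedDict

strValue = "Wakanda-Warrior"

# fromkeys() creates one key per distinct character; every value is None.
unique_chars = OrderedDict.fromkeys(strValue)

# Iterating over the OrderedDict yields its keys in first-insertion order.
print(list(unique_chars))

# Only the keys are needed to rebuild the string.
print(''.join(unique_chars))
```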
Remove Duplicate Characters from String using dict
In CPython 3.6, dict objects preserve insertion order as an implementation detail. From Python 3.7 onwards, this behavior is guaranteed by the language.
Create a dict object with the characters of the string as keys, using dict.fromkeys(). It keeps only the unique characters as keys, and on Python 3.7 or later (or CPython 3.6) it does not change their order. Then join the keys back into a string and assign the result to the original string variable. This way, we can remove duplicate characters from the string and keep the order as in the original string.
For Example,
strValue = "Wakanda-Warrior"

# Remove duplicate characters from string
strValue = ''.join(dict.fromkeys(strValue))

print(strValue)
Output
Waknd-rio
It deleted all the duplicate characters from the string.
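If this operation is needed in several places, it could be wrapped in a small helper. The function name remove_duplicate_chars below is hypothetical; this is only a sketch built on the dict.fromkeys() technique shown above, assuming Python 3.7 or later.

```python
def remove_duplicate_chars(text):
    """Return text with repeated characters removed, keeping first occurrences."""
    # dict.fromkeys() keeps one key per character in insertion order.
    return ''.join(dict.fromkeys(text))


print(remove_duplicate_chars("Wakanda-Warrior"))
print(remove_duplicate_chars("aabbcc"))
print(repr(remove_duplicate_chars("")))
```

The comparison is case-sensitive, so "W" and "w" are treated as different characters.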
Remove Duplicate Characters from String using set
If keeping the original order of the unique characters is not a requirement, we can use this simpler technique.
Pass the string to the set() function. It returns a set containing the unique characters of the given string. Then join these unique characters and assign the result back to the original string variable. This removes the duplicate characters, but a set is unordered, so the remaining characters will generally not be in the same order as in the original string.
For Example,
strValue = "Wakanda-Warrior"

# Remove duplicate characters from string
strValue = ''.join(set(strValue))

print(strValue)
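Output (the order of characters may differ between runs)
iWrnkdoa-
It deleted all the duplicate characters from the string, but the remaining characters are not in their original order.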
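Since the printed order can vary, one way to confirm that the set-based result still contains the right characters is to compare it with an order-preserving result. The sketch below does this; the variable names unordered and ordered are illustrative.

```python
strValue = "Wakanda-Warrior"

# Order is arbitrary here because a set is unordered.
unordered = ''.join(set(strValue))

# Order-preserving result, used only as a reference for comparison.
ordered = ''.join(dict.fromkeys(strValue))

# Both results contain exactly the same characters, each one once.
print(len(unordered) == len(ordered))
print(sorted(unordered) == sorted(ordered))
```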
Summary
We learned about different ways to delete duplicate characters from a string in Python.