This article will discuss how to replace a substring in a string using regex in python.
Table of Contents
- Syntax of regex.sub()
- Python: Replace all whitespace characters from a string using regex
- Python: Replace fixed size words in a string with XXXX
- Python: Replace all lowercase characters with uppercase and vice-versa
- Python: Replace all special characters in a string
- Python: Replace sub-string in a string with a case-insensitive approach
Python’s regex module provides a function sub() to substitute or replace the occurrences of a given pattern in a string. We are going to use this function to replace sub-strings in a string.
First, let’s have a quick overview of the sub() function,
Syntax of regex.sub()
regex.sub(pattern, replacement, original_string)
Parameters
- pattern: A regular expression pattern string.
- All sub-strings that match this pattern gets replaced.
- replacement: It can be a string or a callable function
- If it is a string, it will replace all sub-string that matched the above pattern.
- If it is a callable function, then for each matched sub-string, this function gets called, and the return value gets used as a replacement string.
- original_string: The original string.
- A copy of this string gets created with the replaced content.
Returns
- Returns a new string obtained by replacing all the occurrences of matched sub-strings (based on pattern).
Let’s use this function to replace some sub-strings in a string.
Frequently Asked:
Python: Replace all whitespace characters from a string using regex
To replace all the whitespace characters in a string with a character (suppose ‘X’) use the regex module’s sub() function. Pass these arguments in the regex.sub() function,
- Pass a regex pattern r’\s+’ as the first argument to the sub() function. It will match all the whitespace characters in a string.
- Pass a character ‘X’ as the second argument (the replacement string).
It will replace all the whitespaces in a string with character ‘X’,
import re org_string = "This is a sample string" # Replace all whitespaces in a string with character X new_string = re.sub(r"\s+", 'X', org_string) print(new_string)
Output:
ThisXisXaXsampleXstring
Python: Replace fixed size words in a string with XXXX
To replace all the four-letter words characters in a string with ‘XXXX’ using the regex module’s sub() function. Pass these arguments in the sub() function
- Pass a regex pattern r’\b\w{4}\b’ as first argument to the sub() function. It will match all the 4 letter words or sub-strings of size 4, in a string.
- Pass a string ‘XXXX’ as the second argument (the replacement string).
It will replace all the 4 letter words in a string with string ‘XXXX’,
import re org_string = "This is a sample string, where is need to be replaced." # Replace all 4 letter words with word XXXX new_string = re.sub(r"\b\w{4}\b", 'XXXX', org_string) print(new_string)
Output:
XXXX is a sample string, where is XXXX to be replaced.
Python: Replace all lowercase characters with uppercase and vice-versa
In a string, replace all lowercase letters to upper case and all upper case letters to lower case.
To do that, pass these arguments in the sub() function
- Pass a regex pattern r’[a-zA-Z]’ as first argument to the sub() function. It will match lowercase and upper case characters in the string.
- Pass a call back function as 2nd argument. This function accepts a match object and fetches the matched string from that. Then reverses the case of that string, i.e., if it is of lower case, then make it upper case. If it is of the upper case, then make it of the lower case.
It will reverse the case of each character in the string,
import re def reverse_case(match_obj): char_elem = match_obj.group(0) if char_elem.islower(): return char_elem.upper() else: return char_elem.lower() org_string = "This is a Sample String" # Replace all lower case characters with upper case and vice-versa new_string = re.sub(r"[a-zA-Z]",reverse_case, org_string) print(new_string)
Output:
tHIS IS A sAMPLE sTRING
We can achieve this in a single line too using a lambda function instead of creating separate function,
import re org_string = "This is a Sample String" # Replace all lower case characters with upper case and vice-versa new_string = re.sub(r"[a-zA-Z]", lambda x : x.group(0).upper() if x.group(0).islower() else x.group(0).lower(), org_string) print(new_string)
Output:
tHIS IS A sAMPLE sTRING
Python: Replace all special characters in a string
To replace all the special characters in a string with ‘X’ using the regex module’s sub() function. Pass these arguments in the sub() function
- Pass a regex pattern as the first argument to the sub() function. This pattern will match all the punctuations or special characters in the string.
- Pass a string ‘X’ as the second argument (the replacement string).
It will replace all the special characters in a string with string ‘X’,
import re import string org_string = "Test&[88]%%$$$#$%-+String" # Regex pattern to match all the special characters pattern = r'[' + string.punctuation + ']' # Replace all special characters in a string with character X new_string = re.sub(pattern, 'X', org_string) print(new_string)
Output:
TestXX88XXXXXXXXXXXString
Python: Replace sub-string in a string with a case-insensitive approach
To do a case insensitive replacement using sub() function, pass the flag re.IGNORECASE in the sub() function,
import re org_string = "This IS a sample string." # Replace sub-string in a string with a case-insensitive approach new_string = re.sub(r'is','**', org_string, flags=re.IGNORECASE) print(new_string)
Output:
Th** ** a sample string.
It will replace all the occurrences of ‘is’ sub-string with ‘XX’, irrespective of the string’s case. For example, in the above example, both ‘is’ and ‘IS’ gets replaced by ‘XX’.
Summary
We can replace sub-strings in a string using the regex module’s sub() function. We need to provide the right pattern to match the sub-strings and the replacement string.
Good code
The replacement function for re.sub(pattern, function, string) is what I needed