This article discusses how to replace a substring in a string using regex in Python.


Python’s re module provides the sub() function, which replaces the occurrences of a given pattern in a string. We are going to use this function to replace sub-strings in a string.

First, let’s have a quick overview of the sub() function.

Syntax of re.sub()

re.sub(pattern, replacement, original_string)

Parameters

  • pattern: A regular expression pattern string.
    • All sub-strings that match this pattern get replaced.
  • replacement: It can be a string or a callable function.
    • If it is a string, it replaces every sub-string that matches the above pattern.
    • If it is a callable function, it gets called with a match object for each matched sub-string, and its return value is used as the replacement string.
  • original_string: The original string.
    • A copy of this string gets created with the replaced content.

Returns

  • Returns a new string obtained by replacing all the occurrences of matched sub-strings (based on pattern).
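As a quick illustration of the overview above, here is a minimal sketch. The sample text and variable names are made up for this example. It also uses the optional count argument of re.sub(), which is not covered in the parameter list above and limits how many matches get replaced.

```python
import re

text = "one 1 two 2 three 3"

# Replace every run of digits with '#'
all_replaced = re.sub(r"\d+", "#", text)

# The optional count argument limits the number of replacements
first_only = re.sub(r"\d+", "#", text, count=1)

print(all_replaced)  # one # two # three #
print(first_only)    # one # two 2 three 3
print(text)          # one 1 two 2 three 3 (the original string is unchanged)
```

Strings in Python are immutable, so sub() always returns a new string and leaves the original untouched.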

Let’s use this function to replace some sub-strings in a string.

Python: Replace all whitespace characters from a string using regex

To replace all the whitespace characters in a string with a character (suppose ‘X’), use the re module’s sub() function with these arguments:

  • Pass the regex pattern r'\s+' as the first argument to the sub() function. It matches each run of one or more consecutive whitespace characters.
  • Pass a character ‘X’ as the second argument (the replacement string).

It will replace every run of whitespace in the string with a single character ‘X’.

import re

org_string = "This is   a sample  string"

# Replace each run of whitespace in the string with the character X
new_string = re.sub(r"\s+", 'X', org_string)

print(new_string)

Output:

ThisXisXaXsampleXstring
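Note that the + quantifier makes several consecutive whitespace characters count as one match. The following sketch, using a made-up sample string, contrasts r"\s+" with r"\s", which replaces each whitespace character individually.

```python
import re

text = "one  two\tthree"  # two spaces, then a tab

# \s+ treats each run of whitespace as a single match
per_run = re.sub(r"\s+", "X", text)

# \s matches one whitespace character at a time
per_char = re.sub(r"\s", "X", text)

print(per_run)   # oneXtwoXthree
print(per_char)  # oneXXtwoXthree
```

Which pattern to use depends on whether the number of whitespace characters should be preserved in the result.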

Python: Replace fixed size words in a string with XXXX

To replace all the four-letter words in a string with ‘XXXX’, use the re module’s sub() function with these arguments:

  • Pass the regex pattern r'\b\w{4}\b' as the first argument to the sub() function. It matches every whole word of exactly four word characters (letters, digits or underscores). The \b word boundaries keep it from matching four characters inside a longer word.
  • Pass a string ‘XXXX’ as the second argument (the replacement string).

It will replace all the four-letter words in the string with the string ‘XXXX’.

import re

org_string = "This is a sample string, where is need to be replaced."

# Replace all 4 letter words with word XXXX
new_string = re.sub(r"\b\w{4}\b", 'XXXX', org_string)

print(new_string)

Output:

XXXX is a sample string, where is XXXX to be replaced.
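To see why the \b word boundaries matter, the sketch below runs the same replacement with and without them on a made-up sentence. Without the boundaries, the first four characters of longer words are replaced as well.

```python
import re

text = "That sample has many words"

# With word boundaries: only whole four-letter words match
with_boundaries = re.sub(r"\b\w{4}\b", "XXXX", text)

# Without word boundaries: any four consecutive word characters match
without_boundaries = re.sub(r"\w{4}", "XXXX", text)

print(with_boundaries)     # XXXX sample has XXXX words
print(without_boundaries)  # XXXX XXXXle has XXXX XXXXs
```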

Python: Replace all lowercase characters with uppercase and vice-versa

To swap the case of every letter in a string, so that lowercase letters become uppercase and uppercase letters become lowercase, pass these arguments to the sub() function:

  • Pass the regex pattern r'[a-zA-Z]' as the first argument to the sub() function. It matches each lowercase or uppercase ASCII letter in the string.
  • Pass a callback function as the second argument. This function accepts a match object, fetches the matched character from it, and returns that character with its case reversed: a lowercase letter becomes uppercase, and an uppercase letter becomes lowercase.

It will reverse the case of each letter in the string.

import re

def reverse_case(match_obj):
    char_elem = match_obj.group(0)
    if char_elem.islower():
        return char_elem.upper()
    else:
        return char_elem.lower()

org_string = "This is   a Sample  String"

# Replace all lower case characters with upper case and vice-versa
new_string = re.sub(r"[a-zA-Z]", reverse_case, org_string)

print(new_string)

Output:

tHIS IS   A sAMPLE  sTRING

We can also achieve this in a single statement by using a lambda function instead of defining a separate function.

import re

org_string = "This is   a Sample  String"

# Replace all lower case characters with upper case and vice-versa
new_string = re.sub(r"[a-zA-Z]",
                    lambda x :  x.group(0).upper()
                                if x.group(0).islower()
                                else x.group(0).lower(),
                    org_string)

print(new_string)

Output:

tHIS IS   A sAMPLE  sTRING
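A callable replacement is not limited to changing case. Since it receives the match object, it can compute any replacement string from the matched text. As a small sketch, the hypothetical helper double_number() below (not part of the re module) doubles every number found in a made-up sentence.

```python
import re

def double_number(match_obj):
    # group(0) holds the matched digits as a string, e.g. "12"
    return str(int(match_obj.group(0)) * 2)

text = "3 apples and 12 oranges"

# Call double_number() for every run of digits and use its return value
doubled = re.sub(r"\d+", double_number, text)

print(doubled)  # 6 apples and 24 oranges
```

The callback must return a string, which is why the result of the multiplication is converted back with str().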

Python: Replace all special characters in a string

To replace all the special characters in a string with ‘X’, use the re module’s sub() function with these arguments:

  • Pass a regex pattern as the first argument to the sub() function. The pattern is a character class built from string.punctuation, so it matches every ASCII punctuation or special character in the string.
  • Pass a string ‘X’ as the second argument (the replacement string).

It will replace each special character in the string with the string ‘X’.

import re
import string

org_string = "Test&[88]%%$$$#$%-+String"

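# Regex pattern to match all the special characters.
# re.escape() keeps characters such as ], \, ^ and - from being
# interpreted specially inside the character class.
pattern = r'[' + re.escape(string.punctuation) + ']'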

# Replace all special characters in a string with character X
new_string = re.sub(pattern, 'X', org_string)

print(new_string)

Output:

TestXX88XXXXXXXXXXXString
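An alternative is to describe the characters to keep instead of listing the characters to replace. The sketch below assumes that "special" means anything other than an ASCII letter, a digit or whitespace, which is a narrower definition than the one used above because it also replaces non-ASCII letters.

```python
import re

org_string = "Test&[88]%%$$$#$%-+String"

# [^A-Za-z0-9\s] matches any single character that is NOT
# an ASCII letter, a digit or a whitespace character
new_string = re.sub(r"[^A-Za-z0-9\s]", "X", org_string)

print(new_string)
```

For this sample string the result is the same as with the string.punctuation based pattern.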

Python: Replace sub-string in a string with a case-insensitive approach

To do a case-insensitive replacement using the sub() function, pass the flag re.IGNORECASE through the flags argument.

import re

org_string = "This IS a sample string."

# Replace sub-string in a string with a case-insensitive approach
new_string = re.sub(r'is', '**', org_string, flags=re.IGNORECASE)

print(new_string)

Output:

Th** ** a sample string.

It will replace all the occurrences of the ‘is’ sub-string with ‘**’, irrespective of case. In the above example, both the ‘is’ inside ‘This’ and the standalone ‘IS’ get replaced by ‘**’.
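If only the standalone word should be replaced, and not the same letters inside a longer word, the pattern can be wrapped in \b word boundaries. A minimal sketch on the same sample string:

```python
import re

org_string = "This IS a sample string."

# \b on both sides restricts the match to the whole word "is",
# while re.IGNORECASE still lets it match "IS"
whole_word = re.sub(r"\bis\b", "**", org_string, flags=re.IGNORECASE)

print(whole_word)  # This ** a sample string.
```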

Summary

We can replace sub-strings in a string using the re module’s sub() function. We need to provide the right pattern to match the sub-strings, along with the replacement string or a callback function that returns it.
