Convert a Unicode String to a String in Python

In this Python tutorial, you will learn how to convert a Unicode string to a string.

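A Unicode string is a sequence of Unicode code points, so it can represent characters from virtually any writing system. To mark a string literal as a Unicode string, place the prefix u in front of the opening quote. In Python 3, every str is already a Unicode string, so the prefix is optional and is kept mainly for compatibility with Python 2 code.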

Example:

u"Hello Varun"
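As a quick check of what the u prefix does, the following minimal sketch (assuming Python 3; the variable names are only illustrative) compares a prefixed literal with a plain one:

```python
# In Python 3 the u prefix is accepted but has no effect:
# both literals produce the same built-in str type.
plain = "Hello Varun"
prefixed = u"Hello Varun"

print(type(prefixed))      # <class 'str'>
print(plain == prefixed)   # True
```

Both values are ordinary str objects, which is why the prefix can be dropped in Python 3 code.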

Convert a Unicode string to a string using str()

Here, we will use str() to convert a Unicode string to a string.

Syntax:

str(inp_str)

Parameter:

inp_str is the Unicode string to convert. It is the only argument needed here.

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to a string using str().

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert to string
print("Converted String: ", str(inp_str))

Output:

Converted String:  Welcome to thisPointer
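The sketch below, assuming Python 3, illustrates two points about str(): calling it on a Unicode string gives back an equal str, and calling it on a bytes object without an encoding does not decode the bytes. The variable names are illustrative only.

```python
# str() on a Unicode string returns an equal str.
text = u"Welcome to thisPointer"
converted = str(text)
print(converted == text)    # True
print(type(converted))      # <class 'str'>

# str() on bytes without an encoding returns the printable
# representation, not the decoded text.
raw = b"Welcome"
print(str(raw))             # b'Welcome'

# Passing an encoding makes str() decode the bytes.
print(str(raw, "utf-8"))    # Welcome
```

If the input might be bytes rather than str, passing the encoding explicitly (or calling decode()) avoids ending up with a string that literally starts with b'.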

Convert a Unicode string to UTF-8

Here, we will take a Unicode string and encode it to UTF-8 using the encode() method. UTF-8 encodes each character of the Unicode string as 1 to 4 bytes; the number of bytes depends on the character's code point. The result is a bytes object, not a str.

Syntax:

inp_str.encode('UTF-8')

Where inp_str is the Unicode string.

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-8.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-8 encoding
inp_str = inp_str.encode('UTF-8')
print("Converted String: ", inp_str)

Output:

Converted String:  b'Welcome to thisPointer'

In the output above, every character is an ASCII character, so UTF-8 encodes each one as a single byte. To convert the bytes back to a Unicode string, use the decode() method.

Syntax:

inp_str.decode('UTF-8')

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-8 and then decode it back to a Unicode string.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-8 encoding
inp_str = inp_str.encode('UTF-8')
print("Converted String: ", inp_str)

# Convert back
inp_str = inp_str.decode('UTF-8')
print("Actual String: ", inp_str)

Output:

Converted String:  b'Welcome to thisPointer'
Actual String:  Welcome to thisPointer
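The sample string contains only ASCII characters, so it does not show the variable width of UTF-8. As a minimal sketch, the snippet below encodes four illustrative characters (chosen for this example, not taken from the examples above) and counts the bytes each one needs:

```python
# One character from each UTF-8 length class, written as escapes
# so the source file encoding does not matter.
samples = [
    "A",           # U+0041, ASCII
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, emoji outside the Basic Multilingual Plane
]

# Map each character to the number of bytes in its UTF-8 encoding.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in utf8_lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

The four characters need 1, 2, 3, and 4 bytes respectively, which is the full range UTF-8 uses.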

Convert a Unicode string to UTF-16

Here, we will take a Unicode string and encode it to UTF-16 using the encode() method. UTF-16 encodes most characters as 2 bytes; characters outside the Basic Multilingual Plane take 4 bytes.

Syntax:

inp_str.encode('UTF-16')

Where inp_str is the Unicode string.

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-16.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-16 encoding
inp_str = inp_str.encode('UTF-16')
print("Converted String: ", inp_str)

Output:

Converted String:  b'\xff\xfeW\x00e\x00l\x00c\x00o\x00m\x00e\x00 \x00t\x00o\x00 \x00t\x00h\x00i\x00s\x00P\x00o\x00i\x00n\x00t\x00e\x00r\x00'

In the output above, each character is encoded as 2 bytes, and the leading \xff\xfe is the byte order mark (BOM) that the 'UTF-16' codec prepends. To convert the bytes back to a Unicode string, use the decode() method.

Syntax:

inp_str.decode('UTF-16')

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-16 and then decode it back to a Unicode string.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-16 encoding
inp_str = inp_str.encode('UTF-16')
print("Converted String: ", inp_str)

# Convert back
inp_str = inp_str.decode('UTF-16')
print("Actual String: ", inp_str)

Output:

Converted String:  b'\xff\xfeW\x00e\x00l\x00c\x00o\x00m\x00e\x00 \x00t\x00o\x00 \x00t\x00h\x00i\x00s\x00P\x00o\x00i\x00n\x00t\x00e\x00r\x00'
Actual String:  Welcome to thisPointer
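The first two bytes in the UTF-16 output are a byte order mark rather than part of the text. The following sketch, with illustrative variable names, uses the standard 'utf-16-le' codec (little-endian, no BOM) to show the difference, and also encodes one character that needs 4 bytes in UTF-16:

```python
# 'utf-16' prepends a 2-byte BOM; 'utf-16-le' fixes the byte order
# to little-endian and writes no BOM.
with_bom = "A".encode("utf-16")
without_bom = "A".encode("utf-16-le")

print(len(with_bom))      # 4: 2-byte BOM + 2 bytes for "A"
print(without_bom)        # b'A\x00'

# A character outside the Basic Multilingual Plane is stored as a
# surrogate pair, so it takes 4 bytes instead of 2.
emoji_bytes = "\U0001F600".encode("utf-16-le")
print(len(emoji_bytes))   # 4
```

When the bytes are exchanged with another program, it helps to agree on whether a BOM is expected, since 'utf-16-le' and 'utf-16-be' will not add or strip one.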

Convert a Unicode string to UTF-32

Here, we will take a Unicode string and encode it to UTF-32 using the encode() method. UTF-32 encodes every character in the Unicode string as exactly 4 bytes.

Syntax:

inp_str.encode('UTF-32')

Where inp_str is the Unicode string.

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-32.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-32 encoding
inp_str = inp_str.encode('UTF-32')
print("Converted String: ", inp_str)

Output:

Converted String:  b'\xff\xfe\x00\x00W\x00\x00\x00e\x00\x00\x00l\x00\x00\x00c\x00\x00\x00o\x00\x00\x00m\x00\x00\x00e\x00\x00\x00 \x00\x00\x00t\x00\x00\x00o\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00\x00P\x00\x00\x00o\x00\x00\x00i\x00\x00\x00n\x00\x00\x00t\x00\x00\x00e\x00\x00\x00r\x00\x00\x00'

In the output above, each character is encoded as 4 bytes, preceded by a 4-byte byte order mark (\xff\xfe\x00\x00). To convert the bytes back to a Unicode string, use the decode() method.

Syntax:

inp_str.decode('UTF-32')

Example:

In this example, we will convert the Unicode string u"Welcome to thisPointer" to UTF-32 and then decode it back to a Unicode string.

# Consider the unicode string
inp_str = u"Welcome to thisPointer"

# Convert unicode string to UTF-32 encoding
inp_str = inp_str.encode('UTF-32')
print("Converted String: ", inp_str)

# Convert back
inp_str = inp_str.decode('UTF-32')
print("Actual String: ", inp_str)

Output:

Converted String:  b'\xff\xfe\x00\x00W\x00\x00\x00e\x00\x00\x00l\x00\x00\x00c\x00\x00\x00o\x00\x00\x00m\x00\x00\x00e\x00\x00\x00 \x00\x00\x00t\x00\x00\x00o\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00\x00P\x00\x00\x00o\x00\x00\x00i\x00\x00\x00n\x00\x00\x00t\x00\x00\x00e\x00\x00\x00r\x00\x00\x00'
Actual String:  Welcome to thisPointer
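To compare the three encodings side by side, this minimal sketch encodes the same 22-character sample string with each codec and records the size of the result. The dictionary name is illustrative only.

```python
text = "Welcome to thisPointer"   # 22 ASCII characters

# Encoded size in bytes for each codec. The 'utf-16' and 'utf-32'
# codecs include a BOM of 2 and 4 bytes respectively.
sizes = {codec: len(text.encode(codec)) for codec in ("utf-8", "utf-16", "utf-32")}

print(sizes)   # {'utf-8': 22, 'utf-16': 46, 'utf-32': 92}
```

For ASCII-heavy text like this, UTF-8 is the most compact of the three; the ratio can differ for text in other scripts.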

Summary

In this Python string article, we have seen how to convert a Unicode string to a string using str(). We also saw how to encode strings to UTF-8, UTF-16, and UTF-32 with encode(), and how to decode the resulting bytes back to Unicode strings with decode(). Happy Learning.
