Extract Numbers from String in Python

In this article, we will learn to extract the numbers from a given string in Python.

What is a String in Python

A string is a sequence of Unicode characters enclosed in single, double or triple quotes. The enclosed characters can be digits, letters or special symbols. A string is ordinary, human-readable text. Strings are immutable in Python, which means that once a string object is created, it cannot be changed.
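As a quick illustration of immutability, the sketch below (the variable names are made up for this example) tries to change one character in place. Python raises a TypeError, so the usual approach is to build a new string instead.

```python
text = 'Python'

try:
    # Strings do not support item assignment
    text[0] = 'J'
except TypeError as error:
    print(error)

# Build a new string instead; the original object is left untouched
new_text = 'J' + text[1:]
print(text)      # Python
print(new_text)  # Jython
```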

Here we have a string that is made up of numbers and letters:

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50 in 350 matched.'

# type() returns the data type of string_var
print(type(string_var))

OUTPUT :

<class 'str'>

You can see we have a string with some numbers in it. Our job is to extract those numbers using Python.

Extract numbers from string using isdigit() in List Comprehension

In this method we combine three tools to extract numbers from a given string: a list comprehension, the isdigit() method and the split() method.

A list comprehension is a compact syntax for building a new list, optionally keeping only the values that satisfy a condition. In this method,

  • The split() method converts the string to a list of substrings.
  • The list comprehension iterates over this list of substrings.
  • During iteration, the isdigit() method checks whether a substring consists only of digits.

This way we can extract all whole numbers from a string into a list. Let’s see the complete example,

EXAMPLE :

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'

# Split on whitespace and keep only the substrings made up entirely of digits
numbers = [int(new_string) for new_string in string_var.split() if new_string.isdigit()]

print(numbers)

# type() returns the data type of numbers
print(type(numbers))

OUTPUT :

[10773, 350]
<class 'list'>

Here you can see that with the combination of three different tools we have successfully extracted numbers from a string. But this method has a flaw: it doesn’t print the average, 50.58, which is a float. The isdigit() method returns False for '50.58' because the decimal point is not a digit.
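The limitation can be checked directly. The short sketch below uses a hypothetical list of tokens (not taken from the example above) to show which kinds of numeric text isdigit() accepts. Decimals and negative numbers are both rejected.

```python
tokens = ['10773', '50.58', '-7', '350']

# isdigit() is True only when every character in the token is a digit
for token in tokens:
    print(token, token.isdigit())

digit_only = [token for token in tokens if token.isdigit()]
print(digit_only)  # ['10773', '350']
```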

Extract numbers from string using re.findall() method

Now we will use the findall() function of Python's re module. The name re stands for regular expression, and the module is part of the Python standard library.

Regular expressions use the backslash character ('\') to indicate special forms. The re.findall() function scans the given string from left to right and returns a list of all non-overlapping substrings that match the specified pattern. Let's see an example.

EXAMPLE :

import re

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'

# Find every number-like substring and convert each match to float
x = [float(num) for num in re.findall(r'-?\d+\.?\d*', string_var)]

print(x)

OUTPUT :

[10773.0, 50.58, 350.0]

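In the above example, the pattern -?\d+\.?\d* matches an optional minus sign, one or more digits, an optional decimal point and any digits that follow it. re.findall() returned all the numbers in string_var as strings, and the list comprehension converted each of them to float and stored them in the list x.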
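If you would rather keep whole numbers as int instead of converting everything to float, one possible variation is sketched below. The helper name extract_numbers, the constant NUMBER_PATTERN and the sample sentence are assumptions made for this illustration. The pattern is also slightly stricter than the one above, because it only accepts a decimal point when digits follow it.

```python
import re

# Optional minus sign, digits, and an optional fractional part
NUMBER_PATTERN = re.compile(r'-?\d+(?:\.\d+)?')

def extract_numbers(text):
    """Return ints for whole numbers and floats for decimals."""
    numbers = []
    for match in NUMBER_PATTERN.findall(text):
        # A decimal point in the match means it should become a float
        numbers.append(float(match) if '.' in match else int(match))
    return numbers

sample = 'Temperature fell to -4 degrees, then rose by 2.5 over 12 hours.'
print(extract_numbers(sample))  # [-4, 2.5, 12]
```

Note that a pattern with an optional minus sign also treats the hyphen in text such as 2019-2020 as a sign, so it may need adjusting for your data.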

Extract numbers from string using split() and append() methods

Another way to extract numbers from a given string is to combine the split() and append() methods. In this method we use split() to break the string into words, then append() each word that can be converted to a number to a list.

  • split() : A built-in string method that splits a string into a list of substrings.
  • append() : A built-in list method that adds an item to the end of a list.

Let's see an example of this method.

EXAMPLE :

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'
x = []

# Iterate over the words in the string
for i in string_var.split():
    try:
        # Convert the word to float and add it to the list
        x.append(float(i))
    except ValueError:
        # Skip words that are not numeric
        pass

print(x)

OUTPUT :

[10773.0, 50.58, 350.0]

In the above example, you can see how we used both the split() and append() methods to extract numbers from string_var. Here we catch the ValueError. Without the try and except block, the code would fail on the first non-numeric word with an error like this:

    x.append(float(i)) 
ValueError: could not convert string to float: 'MSD'

Basically, we iterated over all the words in the string, converted each word to float and added it to the list. If a word was not numeric, float() raised a ValueError, which we caught and skipped.
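Two details of this approach are worth keeping in mind. A number followed by a comma or semicolon (such as '50.58,') cannot be converted by float(), while words like 'inf' and 'nan' are accepted as floats. The sketch below is one way to handle both cases. The function name extract_floats and the sample sentence are assumptions made for this illustration.

```python
import math

def extract_floats(text):
    """Collect finite floats from whitespace-separated words."""
    numbers = []
    for word in text.split():
        try:
            # Drop trailing punctuation so that '50.58,' can be converted
            value = float(word.rstrip('.,;:!?'))
        except ValueError:
            # The word is not numeric, so skip it
            continue
        # float() also accepts 'inf' and 'nan'; keep only finite values
        if math.isfinite(value):
            numbers.append(value)
    return numbers

sample = 'He scored 50.58, then 350; the limit is inf and the rest is nan.'
print(extract_floats(sample))  # [50.58, 350.0]
```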

Extract numbers from string using nums_from_string library

The next method uses the get_nums() function of the nums_from_string library. This library does not come bundled with Python, so we have to install it first. Just type pip install nums_from_string in your terminal. Once it is installed, this is the easiest way to extract numbers from a string.

Look at the code below.

EXAMPLE :

import nums_from_string

string_var = 'MSD scored 10773  runs in ODI cricket at the avg of 50.58 in 350 matched.'
print(nums_from_string.get_nums(string_var))

OUTPUT :

[10773, 50.58, 350]

In the above example, you can see that nums_from_string extracts the numbers from the string without us specifying a data type such as float or int.

Summary

We have seen four different methods for extracting numbers from a string in Python. The easiest is get_nums(), a function of the nums_from_string library. Its only drawback is that it doesn’t come bundled with Python, so you have to install it. The isdigit() method is less useful when the string contains decimals, because it doesn’t extract float values. In the third method, with split() and append(), you have to handle errors, otherwise the code raises a ValueError. We used Python 3.10.1 to write the example code. To check your version, type python --version in your terminal.
