How to convert floats to ints in Pandas?

In this article, we will look at different ways to convert floats to ints in Pandas. We will also cover a special case that needs care during conversion: missing values.

To quickly get started, let’s create a sample dataframe for experimentation. We’ll use the pandas library with some random data.

import pandas as pd
import numpy as np

np.random.seed(101)

# Create a DataFrame from a 5x5 array of random floats between 0 and 10
df = pd.DataFrame(np.random.rand(5,5)*10)

print(df)

Contents of the created dataframe are,

          0         1         2         3         4
0  5.163986  5.706676  0.284742  1.715217  6.852770
1  8.338969  3.069662  8.936131  7.215439  1.899390
2  5.542276  3.521320  1.818924  7.856018  9.654832
3  2.323537  0.835614  6.035484  7.289928  2.762388
4  6.853063  5.178675  0.484845  1.378692  1.869674

Now, let’s look at different ways in which we could convert the floats into ints using pandas.

Convert floats to integers in Pandas using the astype() function

We will use the astype() function to convert all values in the DataFrame to the int type. Let’s first check the existing dtypes of the DataFrame.

# checking dtypes
print (df.info())

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
 3   3       5 non-null      float64
 4   4       5 non-null      float64
dtypes: float64(5)
memory usage: 328.0 bytes
None

Let’s convert all the columns to int using the astype function as shown below.

# converting all values in the DataFrame to int
# (stored in a new variable so that df keeps the original floats)
modDf = df.astype(int)

print (modDf)
print (modDf.dtypes)

Output

   0  1  2  3  4
0  5  5  0  1  6
1  8  3  8  7  1
2  5  3  1  7  9
3  2  0  6  7  2
4  6  5  0  1  1
0    int64
1    int64
2    int64
3    int64
4    int64
dtype: object

As observed, all the columns are converted from float64 to int64 dtype. Note that astype() does not round: it simply drops the fractional part, so 5.706676 becomes 5. To round to the nearest integer instead, call the round() function before astype().

# converting to int after rounding off
modDf = df.round().astype(int)

print (modDf)

Output

   0  1  2  3   4
0  5  6  0  2   7
1  8  3  9  7   2
2  6  4  2  8  10
3  2  1  6  7   3
4  7  5  0  1   2
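The difference between plain astype(int) and round().astype(int) is easiest to see on negative numbers and exact halves. The sketch below uses a small hand-picked Series (not the article's random data) and assumes the default rounding that pandas inherits from NumPy, which rounds exact halves to the nearest even integer.

```python
import pandas as pd

# Hand-picked values chosen for illustration only
s = pd.Series([-1.7, -0.5, 0.5, 1.5, 2.5, 2.7])

# astype(int) drops the fractional part, i.e. it truncates toward zero
truncated = s.astype(int)

# round() picks the nearest integer first (exact halves go to the even neighbour)
rounded = s.round().astype(int)

print(pd.DataFrame({"value": s, "astype": truncated, "round+astype": rounded}))
```

So -1.7 becomes -1 with astype(int) alone but -2 when rounded first, and 2.5 rounds to 2 rather than 3.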

Note that we could do the same for specific columns by subsetting the columns first and then using the astype() function.
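As a minimal sketch of the column-wise conversion, the snippet below rebuilds the same sample DataFrame and converts only columns 0 and 1. The variable names subset_as_int and mixed are made up for this illustration; the second form relies on astype() accepting a mapping of column label to dtype.

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.rand(5, 5) * 10)

# Subset first, then convert: the result contains only columns 0 and 1
subset_as_int = df[[0, 1]].astype("int64")

# Alternatively, pass {column label: dtype} to convert some columns
# while keeping the others in the frame unchanged
mixed = df.astype({0: "int64", 1: "int64"})

print(subset_as_int.dtypes)
print(mixed.dtypes)
```

The first form is handy when only the converted columns are needed; the second keeps the full DataFrame with the remaining columns still as float64.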

The complete example is as follows,

import pandas as pd
import numpy as np

np.random.seed(101)

# Create a DataFrame from a 5x5 array of random floats between 0 and 10
df = pd.DataFrame(np.random.rand(5,5)*10)

print(df)

# checking dtypes
print (df.info())

# converting all values in the DataFrame to int
modDf = df.astype(int)

print (modDf)
print (modDf.dtypes)

# converting to int post rounding off
modDf = df.round().astype(int)

print (modDf)

Output:

          0         1         2         3         4
0  5.163986  5.706676  0.284742  1.715217  6.852770
1  8.338969  3.069662  8.936131  7.215439  1.899390
2  5.542276  3.521320  1.818924  7.856018  9.654832
3  2.323537  0.835614  6.035484  7.289928  2.762388
4  6.853063  5.178675  0.484845  1.378692  1.869674

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
 3   3       5 non-null      float64
 4   4       5 non-null      float64
dtypes: float64(5)
memory usage: 328.0 bytes
None

   0  1  2  3  4
0  5  5  0  1  6
1  8  3  8  7  1
2  5  3  1  7  9
3  2  0  6  7  2
4  6  5  0  1  1

0    int64
1    int64
2    int64
3    int64
4    int64
dtype: object

   0  1  2  3   4
0  5  6  0  2   7
1  8  3  9  7   2
2  6  4  2  8  10
3  2  1  6  7   3
4  7  5  0  1   2

Convert floats to integers in Pandas using apply() function

Another simple way to convert floats to ints is using the apply method. Let’s quickly implement it on the above raw DataFrame.

# converting to int using apply
print (df.apply(np.int64))

Output

   0  1  2  3  4
0  5  5  0  1  6
1  8  3  8  7  1
2  5  3  1  7  9
3  2  0  6  7  2
4  6  5  0  1  1

We can use the round function here as well to round off before converting to int.
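A possible way to combine rounding with apply() is sketched below. It rebuilds the sample DataFrame and uses a lambda that casts each column with astype(np.int64); the lambda is a choice made for this illustration rather than the only way to write it.

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.rand(5, 5) * 10)

# Round every value to the nearest integer, then cast column by column
rounded = df.round().apply(lambda col: col.astype(np.int64))

print(rounded)
print(rounded.dtypes)
```

The result should match what df.round().astype(int) produces, so apply() is mainly useful here when the per-column step needs extra logic.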

Converting floats to ints in case of missing values in Pandas

We need to be careful with missing values whenever we change column dtypes. Let’s understand this by introducing a few missing values into our current DataFrame.

# changing a few values to np.nan
# (the np.NaN alias was removed in NumPy 2.0, so np.nan is used here)
df.iloc[0,0]=np.nan
df.iloc[0,3]=np.nan
df.iloc[2,3]=np.nan

print (df)

Output

          0         1         2         3         4
0       NaN  5.706676  0.284742       NaN  6.852770
1  8.338969  3.069662  8.936131  7.215439  1.899390
2  5.542276  3.521320  1.818924       NaN  9.654832
3  2.323537  0.835614  6.035484  7.289928  2.762388
4  6.853063  5.178675  0.484845  1.378692  1.869674

Let’s try to use the astype function directly on this DataFrame as below.

print (df.astype(int))

Output

ValueError: Cannot convert non-finite values (NA or inf) to integer

It directly results in a ValueError that it can’t convert non-finite values into integers. Let’s also try the apply method that we discussed above.

print (df.apply(np.int64))

Output

                     0  1  2                    3  4
0 -9223372036854775808  5  0 -9223372036854775808  6
1                    8  3  8                    7  1
2                    5  3  1 -9223372036854775808  9
3                    2  0  6                    7  2
4                    6  5  0                    1  1

The apply() method has converted the NaN values into the constant integer -9223372036854775808, which is the smallest value an int64 can hold. This is even more dangerous than an error because, in a large DataFrame, it can easily go unnoticed. Therefore, it is recommended to check for missing values first, fill them with suitable values, and then convert them to ints.

print (df.fillna(0).astype(int))

Output

   0  1  2  3  4
0  0  5  0  0  6
1  8  3  8  7  1
2  5  3  1  0  9
3  2  0  6  7  2
4  6  5  0  1  1

As observed, the missing values are filled with 0 values and then converted to int.
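The "check first, then fill, then convert" advice can be sketched as follows. The snippet rebuilds the DataFrame with the same three missing values; the name missing_per_column is made up for this illustration, and 0 is used as the fill value only because the article uses it.

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.rand(5, 5) * 10)
df.iloc[0, 0] = np.nan
df.iloc[0, 3] = np.nan
df.iloc[2, 3] = np.nan

# Count the missing values per column before touching the dtypes
missing_per_column = df.isna().sum()
print(missing_per_column)

# Fill the gaps only if there are any, then convert
if missing_per_column.any():
    converted = df.fillna(0).astype(int)
else:
    converted = df.astype(int)

print(converted)
```

Whether 0 is a suitable fill value depends on the data; a column mean or a sentinel chosen on purpose may be a better fit in other cases.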

The complete example is as follows,

import pandas as pd
import numpy as np

np.random.seed(101)

# Create a DataFrame from a 5x5 array of random floats between 0 and 10
df = pd.DataFrame(np.random.rand(5,5)*10)

print(df)

# changing a few values to np.nan
df.iloc[0,0]=np.nan
df.iloc[0,3]=np.nan
df.iloc[2,3]=np.nan

print (df)

print (df.fillna(0).astype(int))

Output:

          0         1         2         3         4
0  5.163986  5.706676  0.284742  1.715217  6.852770
1  8.338969  3.069662  8.936131  7.215439  1.899390
2  5.542276  3.521320  1.818924  7.856018  9.654832
3  2.323537  0.835614  6.035484  7.289928  2.762388
4  6.853063  5.178675  0.484845  1.378692  1.869674

          0         1         2         3         4
0       NaN  5.706676  0.284742       NaN  6.852770
1  8.338969  3.069662  8.936131  7.215439  1.899390
2  5.542276  3.521320  1.818924       NaN  9.654832
3  2.323537  0.835614  6.035484  7.289928  2.762388
4  6.853063  5.178675  0.484845  1.378692  1.869674

   0  1  2  3  4
0  0  5  0  0  6
1  8  3  8  7  1
2  5  3  1  0  9
3  2  0  6  7  2
4  6  5  0  1  1
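Filling with 0 changes the data, which is not always acceptable. One alternative that the article does not cover is the nullable integer dtype "Int64" (capital I) in pandas, which keeps missing entries as <NA>. The sketch below uses a small made-up Series and rounds first, on the assumption that casting non-integral floats straight to "Int64" is rejected by pandas.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration, with one missing entry
s = pd.Series([1.7, np.nan, 3.2])

# Round first so every non-missing value is a whole number,
# then cast to the nullable integer dtype
nullable = s.round().astype("Int64")

print(nullable)
print(nullable.dtype)
```

The missing entry survives the conversion as <NA> instead of being replaced by a fill value or by a misleading constant.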

Summary

Great, you made it! In this article, we discussed multiple ways to convert floats to ints in pandas using astype() and apply(), and saw why missing values should be handled before converting.
