How to get & check data types of Dataframe columns in Python Pandas

In this article we will discuss different ways to fetch the data type of single or multiple columns. Also see how to compare data types of columns and fetch column names based on data types.

Use Dataframe.dtypes to get Data types of columns in Dataframe

In Python’s pandas module Dataframe class provides an attribute to get the data type information of each columns i.e.

Dataframe.dtypes

It returns a series object containing data type information of each column. Let’s use this to find & check data types of columns.

Suppose we have a Dataframe i.e.

# List of Tuples
empoyees = [('jack', 34, 'Sydney', 155),
            ('Riti', 31, 'Delhi', 177.5),
            ('Aadi', 16, 'Mumbai', 81),
            ('Mohit', 31, 'Delhi', 167),
            ('Veena', 12, 'Delhi', 144),
            ('Shaunak', 35, 'Mumbai', 135),
            ('Shaun', 35, 'Colombo', 111)
            ]

# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Marks'])

print(empDfObj)

Contents of the dataframe are,

      Name  Age     City  Marks
0     jack   34   Sydney  155.0
1     Riti   31    Delhi  177.5
2     Aadi   16   Mumbai   81.0
3    Mohit   31    Delhi  167.0
4    Veena   12    Delhi  144.0
5  Shaunak   35   Mumbai  135.0
6    Shaun   35  Colombo  111.0

Let’s fetch the Data type of each column in Dataframe as a Series object,

# Get a Series object containing the data type objects of each column of Dataframe.
# Index of series is column name.
dataTypeSeries = empDfObj.dtypes

print('Data type of each column of Dataframe :')
print(dataTypeSeries)

Output

Data type of each column of Dataframe :
Name      object
Age        int64
City      object
Marks    float64
dtype: object

Index of returned Series object is column name and value column of Series contains the data type of respective column.

Get Data types of Dataframe columns as dictionary

We can convert the Series object returned by Dataframe.dtypes to a dictionary too,

# Get a Dictionary containing the pairs of column names & data type objects.
dataTypeDict = dict(empDfObj.dtypes)

print('Data type of each column of Dataframe :')
print(dataTypeDict)

Output:

Data type of each column of Dataframe :
{'Name': dtype('O'), 'Age': dtype('int64'), 'City': dtype('O'), 'Marks': dtype('float64')}

Get the Data type of a single column in Dataframe

We can also fetch the data type of a single column from series object returned by Dataframe.dtypes i.e.

# get data type of column 'Age'
dataTypeObj = empDfObj.dtypes['Age']

print('Data type of each column Age in the Dataframe :')
print(dataTypeObj)

Output

Data type of each column Age in the Dataframe :
int64

Check if data type of a column is int64 or object etc.

Using Dataframe.dtypes we can fetch the data type of a single column and can check its data type too i.e.

Check if Data type of a column is int64 in Dataframe

# Check the type of column 'Age' is int64
if dataTypeObj == np.int64:
    print("Data type of column 'Age' is int64")

Output

Data type of column 'Age' is int64

Check if Data type of a column is object i.e. string in Dataframe

# Check the type of column 'Name' is object i.e string
if empDfObj.dtypes['Name'] == np.object:
    print("Data type of column 'Name' is object")

Output

Data type of column 'Name' is object

Get list of pandas dataframe column names based on data type

Suppose we want a list of column names whose data type is np.object i.e string. Let’s see how to do that,

# Get  columns whose data type is object i.e. string
filteredColumns = empDfObj.dtypes[empDfObj.dtypes == np.object]

# list of columns whose data type is object i.e. string
listOfColumnNames = list(filteredColumns.index)

print(listOfColumnNames)

Output

['Name', 'City']

We basically filtered the series returned by Dataframe.dtypes by value and then fetched index names i.e. columns names from this filtered series.

Get data types of a dataframe using Dataframe.info()

Dataframe.info() prints a detailed summary of the dataframe. It includes information like

  • Name of columns
  • Data type of columns
  • Rows in dataframe
  • non null entries in each column

Let’s see an example,

# Print complete details about the data frame, it will also print column count, names and data types.
empDfObj.info()

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 4 columns):
Name     7 non-null object
Age      7 non-null int64
City     7 non-null object
Marks    7 non-null float64
dtypes: float64(1), int64(1), object(2)
memory usage: 208.0+ bytes

It also gives us detail about data types of columns in our dataframe.

Complete example is as follows,

import pandas as pd
import numpy as np


def main():
    # List of Tuples
    empoyees = [('jack', 34, 'Sydney', 155),
                ('Riti', 31, 'Delhi', 177.5),
                ('Aadi', 16, 'Mumbai', 81),
                ('Mohit', 31, 'Delhi', 167),
                ('Veena', 12, 'Delhi', 144),
                ('Shaunak', 35, 'Mumbai', 135),
                ('Shaun', 35, 'Colombo', 111)
                ]

    # Create a DataFrame object
    empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Marks'])

    print("Contents of the Dataframe : ")
    print(empDfObj)

    print('*** Get the Data type of each column in Dataframe ***')

    # Get a Series object containing the data type objects of each column of Dataframe.
    # Index of series is column name.
    dataTypeSeries = empDfObj.dtypes

    print('Data type of each column of Dataframe :')
    print(dataTypeSeries)

    # Get a Dictionary containing the pairs of column names & data type objects.
    dataTypeDict = dict(empDfObj.dtypes)

    print('Data type of each column of Dataframe :')
    print(dataTypeDict)

    print('*** Get the Data type of a single column in Dataframe ***')

    # get data type of column 'Age'
    dataTypeObj = empDfObj.dtypes['Age']

    print('Data type of each column Age in the Dataframe :')
    print(dataTypeObj)

    print('*** Check if Data type of a column is int64 or object etc in Dataframe ***')

    # Check the type of column 'Age' is int64
    if dataTypeObj == np.int64:
        print("Data type of column 'Age' is int64")

    # Check the type of column 'Name' is object i.e string
    if empDfObj.dtypes['Name'] == np.object:
        print("Data type of column 'Name' is object")

    print('** Get list of pandas dataframe columns based on data type **')

    # Get  columns whose data type is object i.e. string
    filteredColumns = empDfObj.dtypes[empDfObj.dtypes == np.object]

    # list of columns whose data type is object i.e. string
    listOfColumnNames = list(filteredColumns.index)

    print(listOfColumnNames)

    print('*** Get the Data type of each column in Dataframe using info() ***')

    # Print complete details about the data frame, it will also print column count, names and data types.
    empDfObj.info()

if __name__ == '__main__':
    main()

Output:

Contents of the Dataframe : 
      Name  Age     City  Marks
0     jack   34   Sydney  155.0
1     Riti   31    Delhi  177.5
2     Aadi   16   Mumbai   81.0
3    Mohit   31    Delhi  167.0
4    Veena   12    Delhi  144.0
5  Shaunak   35   Mumbai  135.0
6    Shaun   35  Colombo  111.0
*** Get the Data type of each column in Dataframe ***
Data type of each column of Dataframe :
Name      object
Age        int64
City      object
Marks    float64
dtype: object
Data type of each column of Dataframe :
{'Name': dtype('O'), 'Age': dtype('int64'), 'City': dtype('O'), 'Marks': dtype('float64')}
*** Get the Data type of a single column in Dataframe ***
Data type of each column Age in the Dataframe :
int64
*** Check if Data type of a column is int64 or object etc in Dataframe ***
Data type of column 'Age' is int64
Data type of column 'Name' is object
** Get list of pandas dataframe columns based on data type **
['Name', 'City']
*** Get the Data type of each column in Dataframe using info() ***
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 4 columns):
Name     7 non-null object
Age      7 non-null int64
City     7 non-null object
Marks    7 non-null float64
dtypes: float64(1), int64(1), object(2)
memory usage: 208.0+ bytes

 

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top