Create an empty DataFrame with just column names

In this article, we will discuss how to create an empty pandas DataFrame with just column names.

Before getting started, let’s import the pandas and NumPy libraries, which we will be using in this tutorial.

# importing libraries
import pandas as pd
import numpy as np

Using pandas DataFrame

pandas.DataFrame is the class constructor used to create a DataFrame. To create an empty DataFrame, all we need to do is pass the required column names through the columns argument. Let’s look at the example below.

import pandas as pd

# create an empty DataFrame with 4 columns
df = pd.DataFrame(columns=['col_' + str(i) for i in range(4)])

print(df)

Output

Empty DataFrame
Columns: [col_0, col_1, col_2, col_3]
Index: []
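
As a quick sanity check, the sketch below confirms that the frame has the expected columns and no rows. It only uses the standard `empty`, `shape`, and `columns` attributes; the column names are the same illustrative ones used above.

```python
import pandas as pd

# same construction as above: column names only, no rows
df = pd.DataFrame(columns=['col_' + str(i) for i in range(4)])

print(df.empty)          # True, because the frame has zero rows
print(df.shape)          # (0, 4): zero rows, four columns
print(list(df.columns))  # the column labels we passed in
```
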

We have an empty DataFrame with all the columns specified. Let’s look at the dtypes of all the columns.

# check dtypes
print(df.dtypes)

Output

col_0    object
col_1    object
col_2    object
col_3    object
dtype: object

By default, pandas creates these columns with the object dtype. If we want the columns to start with other dtypes, we can define each one as an empty pandas.Series with the desired dtype, as below.

import pandas as pd

# create empty DataFrame with specific column types
df = pd.DataFrame({'col_0': pd.Series(dtype='str'),
                   'col_1': pd.Series(dtype='int'),
                   'col_2': pd.Series(dtype='str'),
                   'col_3': pd.Series(dtype='float')})

print(df)
print(df.dtypes)

Output

Empty DataFrame
Columns: [col_0, col_1, col_2, col_3]
Index: []
col_0     object
col_1      int64
col_2     object
col_3    float64
dtype: object

We now have an empty DataFrame whose columns have the dtypes we specified.
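
An alternative that some readers may find shorter is to create the columns first and then cast them with `DataFrame.astype`, which accepts a mapping of column name to dtype. This is a minimal sketch, not part of the approach above; the `schema` dictionary is a hypothetical name introduced here for illustration.

```python
import pandas as pd

# hypothetical schema: column name -> desired dtype (illustrative only)
schema = {
    'col_0': 'object',
    'col_1': 'int64',
    'col_2': 'object',
    'col_3': 'float64',
}

# create the empty columns, then cast each one to its target dtype
df = pd.DataFrame(columns=list(schema)).astype(schema)

print(df.dtypes)
```

Keeping the names and dtypes in one dictionary can make the schema easier to reuse, for example when several empty frames must share the same layout.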

Using NumPy

We can also create an empty DataFrame with the numpy.empty function. Called with a length of 0 and a structured dtype, it returns a zero-length array whose named fields become the columns when the array is passed to pandas.DataFrame. This lets us define custom dtypes here as well. Let’s look at the implementation below.

import pandas as pd
import numpy as np

# define dtypes
dtypes = np.dtype(
    [
        ("col_0", object),
        ("col_1", int),
        ("col_2", object),
        ("col_3", float)
    ]
)

# create an empty DataFrame
df = pd.DataFrame(np.empty(0, dtype=dtypes))

print(df)
print(df.dtypes)

Output

Empty DataFrame
Columns: [col_0, col_1, col_2, col_3]
Index: []
col_0     object
col_1      int64
col_2     object
col_3    float64
dtype: object

This yields the same output as the pandas.Series approach above.
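
If this pattern is needed in several places, the NumPy approach can be wrapped in a small function. The sketch below assumes a hypothetical helper named `empty_frame` that takes a dictionary of column names and dtypes; neither the helper nor the dictionary layout comes from pandas or NumPy. It uses explicit `np.int64` and `np.float64` so the resulting dtypes do not depend on the platform's default integer size.

```python
import numpy as np
import pandas as pd


def empty_frame(schema):
    """Hypothetical helper: build a zero-row DataFrame from {name: dtype}."""
    # a structured dtype is a list of (field name, field type) pairs
    dtypes = np.dtype(list(schema.items()))
    # a zero-length structured array carries the field names and types
    return pd.DataFrame(np.empty(0, dtype=dtypes))


df = empty_frame({
    "col_0": object,
    "col_1": np.int64,
    "col_2": object,
    "col_3": np.float64,
})

print(df.dtypes)
```
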

Summary

In this article, we have discussed how to create an empty DataFrame with just column names.
