How to add an empty column to a DataFrame in Pandas?

In this article, we will discuss multiple ways to add an empty column to a pandas DataFrame.

Preparing Dataset for solution

To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some sample employee data.

import pandas as pd
import numpy as np

# List of Tuples
employees= [('Shubham', 'Data Scientist', 'Tech',   5),
            ('Riti', 'Data Engineer', 'Tech' ,   7),
            ('Shanky', 'Program Manager', 'PMO' ,   2),
            ('Shreya', 'Graphic Designer', 'Design' ,   2),
            ('Aadi', 'Backend Developer', 'Tech', 11),
            ('Sim', 'Data Engineer', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Designation', 'Team', 'Experience'],
                  index=[0, 1, 2, 3, 4, 5])
print(df)

The contents of the created DataFrame are:

      Name        Designation    Team  Experience
0  Shubham     Data Scientist    Tech           5
1     Riti      Data Engineer    Tech           7
2   Shanky    Program Manager     PMO           2
3   Shreya   Graphic Designer  Design           2
4     Aadi  Backend Developer    Tech          11
5      Sim      Data Engineer    Tech           4

Using Assignment operator

The simplest way to add an empty column is to use the assignment operator. In the code below, we create a new column and assign None to it, so every row in that column is empty.

# Add a new empty column to the DataFrame
df['new_col'] = None

print(df)

Output

      Name        Designation    Team  Experience new_col
0  Shubham     Data Scientist    Tech           5    None
1     Riti      Data Engineer    Tech           7    None
2   Shanky    Program Manager     PMO           2    None
3   Shreya   Graphic Designer  Design           2    None
4     Aadi  Backend Developer    Tech          11    None
5      Sim      Data Engineer    Tech           4    None

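Now the DataFrame contains a new column (new_col) in which every value is None. Alternatively, we can assign numpy.nan or an empty Series such as pandas.Series(dtype='float64') instead of None. Note that the older alias numpy.NaN was removed in NumPy 2.0, so the lowercase numpy.nan is the spelling to use.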
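The three alternatives mentioned above all produce a column of missing values, but the column's dtype can differ. The following is a minimal sketch on a small two-row DataFrame; the column names col_none, col_nan and col_series are made up for illustration. Assigning None typically yields an object column, while numpy.nan and an empty float Series yield float64, though the exact dtype for None may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (a subset of the article's sample data)
df = pd.DataFrame({'Name': ['Shubham', 'Riti'], 'Experience': [5, 7]})

# Three ways to create an empty column with the assignment operator
df['col_none'] = None                          # every value is None
df['col_nan'] = np.nan                         # every value is NaN (float64)
df['col_series'] = pd.Series(dtype='float64')  # empty Series aligns to NaN

# Inspect the resulting dtypes and the number of missing values per column
print(df.dtypes)
print(df.isna().sum())
```

If the column will later hold numbers, starting from numpy.nan avoids an object column; if it will hold strings or mixed values, None may be the more natural choice.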

Using assign() function

The DataFrame.assign() function can also create a new column in an existing DataFrame. It returns a new DataFrame rather than modifying the original, so we assign the result back to df. Let’s create the empty column again, this time with assign().

# Add new empty column to DataFrame
# using assign() function
df = df.assign(new_col=pd.Series(dtype='int'))

print(df)

Output

      Name        Designation    Team  Experience  new_col
0  Shubham     Data Scientist    Tech           5      NaN
1     Riti      Data Engineer    Tech           7      NaN
2   Shanky    Program Manager     PMO           2      NaN
3   Shreya   Graphic Designer  Design           2      NaN
4     Aadi  Backend Developer    Tech          11      NaN
5      Sim      Data Engineer    Tech           4      NaN

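The assign() function takes column names as keyword arguments, each set to the values for that column. Although the empty Series was created with dtype='int', the resulting column is float64, because NaN cannot be stored in a NumPy integer column. Note that we could also have passed None or numpy.nan here instead of an empty Series to create an empty column.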
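As a small sketch of the point above, assign() also accepts scalar values such as numpy.nan or None, and several columns can be added in one call. The column name notes and the variable result are hypothetical names used only for this illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (a subset of the article's sample data)
df = pd.DataFrame({'Name': ['Shubham', 'Riti'], 'Experience': [5, 7]})

# Add two empty columns in one call; assign() returns a new DataFrame
result = df.assign(new_col=np.nan, notes=None)

print(result)
print(df.columns.tolist())  # the original DataFrame is left unchanged
```

Because the original DataFrame is not modified, assign() fits well into method chains where each step produces a new object.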

Using insert() function

The pandas.DataFrame.insert() function adds a new column to an existing DataFrame in place. Its advantage is that we can choose the position of the new column. Starting again from the original DataFrame (insert() raises a ValueError if a column with the same name already exists), let’s add the new empty column between the columns Designation and Team.

# Add empty column at given location
# using insert() function
df.insert(2, "new_col", np.nan)

print(df)

Output

      Name        Designation  new_col    Team  Experience
0  Shubham     Data Scientist      NaN    Tech           5
1     Riti      Data Engineer      NaN    Tech           7
2   Shanky    Program Manager      NaN     PMO           2
3   Shreya   Graphic Designer      NaN  Design           2
4     Aadi  Backend Developer      NaN    Tech          11
5      Sim      Data Engineer      NaN    Tech           4

The function takes three arguments: the position (column index) at which the new column is inserted, the new column’s name, and its value or values.
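Instead of hard-coding the position, it can be looked up from an existing column name with columns.get_loc(). The sketch below, on a reduced version of the sample data, inserts the empty column just before Team and then shows that a second insert with the same name is rejected. The variables position and duplicate_error are illustrative names, not part of the original example.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (a subset of the article's sample data)
df = pd.DataFrame({'Name': ['Shubham', 'Riti'],
                   'Designation': ['Data Scientist', 'Data Engineer'],
                   'Team': ['Tech', 'Tech']})

# Look up the integer position of the 'Team' column
position = df.columns.get_loc('Team')

# Insert the empty column at that position; insert() modifies df in place
df.insert(position, 'new_col', np.nan)
print(df.columns.tolist())

# Inserting a column whose name already exists raises ValueError by default
duplicate_error = None
try:
    df.insert(0, 'new_col', np.nan)
except ValueError as err:
    duplicate_error = str(err)
print(duplicate_error)
```

Looking up the position by name keeps the code working if columns are later added or reordered ahead of Team.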

Using reindex() function

The last method uses the reindex() function. By passing it the existing column names plus additional ones, we can create empty columns in an existing DataFrame. Starting again from the original DataFrame, let’s look at the code below.

# Add empty column using reindex() function
df = df.reindex(columns=df.columns.tolist() + ['new_col'])

print(df)

Output

      Name        Designation    Team  Experience  new_col
0  Shubham     Data Scientist    Tech           5      NaN
1     Riti      Data Engineer    Tech           7      NaN
2   Shanky    Program Manager     PMO           2      NaN
3   Shreya   Graphic Designer  Design           2      NaN
4     Aadi  Backend Developer    Tech          11      NaN
5      Sim      Data Engineer    Tech           4      NaN

Here, we have reindexed the existing DataFrame with an additional column named “new_col”. The list can contain several new column names, in which case each of them is added as an empty column.
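As a brief sketch of adding several empty columns at once with reindex(), the example below appends two new names to the existing column list. The names new_col_1 and new_col_2 and the variables extra and result are placeholders chosen for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (a subset of the article's sample data)
df = pd.DataFrame({'Name': ['Shubham', 'Riti'], 'Experience': [5, 7]})

# Names of the empty columns to add (hypothetical names)
extra = ['new_col_1', 'new_col_2']

# reindex() returns a new DataFrame; columns not present in df are filled with NaN
result = df.reindex(columns=df.columns.tolist() + extra)

print(result)
```

Existing columns keep their values, and only the names that were not already present are created as empty columns.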

Summary

In this article, we have discussed multiple ways to add an empty column to a pandas DataFrame.
