How to Add a New Column from a List in Pandas?

In this article, we will discuss different ways to add a new column from a list to a Pandas DataFrame.

Preparing DataSet

First, we will create a DataFrame from a list of tuples:

import pandas as pd
import numpy as np

# List of Tuples
employees= [('Mark', 'US', 'Tech',   5),
            ('Riti', 'India', 'Tech' ,   7),
            ('Shanky', 'India', 'PMO' ,   2),
            ('Shreya', 'India', 'Design' ,   2),
            ('Aadi', 'US', 'Tech', 11),
            ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

Output:

     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4

Now we will look at different ways to add a new column from a list to this DataFrame.

Method 1: Using the [] Operator

Suppose we have a list of values whose length is equal to the number of rows in the DataFrame:

[31000, 33000, 56500, 43100, 41000, 31333]

To add this list as a new column, we can assign it directly to df['column_name'], where 'column_name' is the name of the new column. The example below creates a column named 'Bonus1':

listOfValues = [31000, 33000, 56500, 43100, 41000, 31333]

df['Bonus1'] = listOfValues

print(df)

Output:

     Name Location    Team  Experience  Bonus1
0    Mark       US    Tech           5   31000
1    Riti    India    Tech           7   33000
2  Shanky    India     PMO           2   56500
3  Shreya    India  Design           2   43100
4    Aadi       US    Tech          11   41000
5     Sim       US    Tech           4   31333

This added the list values as a new column ‘Bonus1’ in the DataFrame. The length of the list must be the same as the number of rows in the DataFrame, otherwise the assignment raises an error. For example,

listOfValues = [31000, 33000, 56500, 43100]

df['Bonus2'] = listOfValues

print(df)

This raises an error:

ValueError: Length of values (4) does not match length of index (6)

To handle this, we need to make the list the same length as the number of rows in the DataFrame. We can pad the list with NaN for the missing values. For example,

listOfValues = [31000, 33000, 56500, 43100]

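# Pad the list with NaN so its length matches the number of rows
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
if len(listOfValues) < df.shape[0]:
    listOfValues += [np.nan] * (df.shape[0] - len(listOfValues))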

df['Bonus2'] = listOfValues

print(df)

Output:

     Name Location    Team  Experience  Bonus1   Bonus2
0    Mark       US    Tech           5   31000  31000.0
1    Riti    India    Tech           7   33000  33000.0
2  Shanky    India     PMO           2   56500  56500.0
3  Shreya    India  Design           2   43100  43100.0
4    Aadi       US    Tech          11   41000      NaN
5     Sim       US    Tech           4   31333      NaN

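The list was added as a column, with NaN filling the rows for which the list had no value. Because NaN is a floating-point value, the ‘Bonus2’ column is stored as float, which is why its values are displayed as 31000.0 rather than 31000.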
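As an alternative to padding the list by hand, the list can be wrapped in a pd.Series before assignment. This is a minimal sketch, not part of the original example: it uses a smaller stand-in DataFrame and assumes the DataFrame has the default 0, 1, 2, ... index, since pandas aligns the Series with the DataFrame by index label and fills unmatched rows with NaN.

```python
import pandas as pd

# Small stand-in DataFrame with the default integer index (0..5)
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim']})

short_list = [31000, 33000, 56500, 43100]

# The Series gets index 0..3; on assignment pandas aligns it with the
# DataFrame index, so rows 4 and 5 have no match and become NaN.
df['Bonus2'] = pd.Series(short_list)

print(df)
```

Note that this relies on index alignment. If the DataFrame has a custom index, the labels will not match and the column may end up entirely NaN. A list that is longer than the DataFrame is also truncated silently instead of raising an error, so the explicit padding approach is safer when the lengths need to be checked.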

Method 2: Using the insert() Function

If we want to add a list as a column at a specific position, we can use the insert() function of the DataFrame. It takes three arguments:

  • loc: The position of the new column. We will pass 3 here.
  • column: The name of the new column. We will pass ‘Bonus3’ here.
  • value: The values of the new column. We will pass a list here.

It will add the list as a new column at index position 3, i.e. as the fourth column in the DataFrame. The DataFrame is modified in place. Let’s see the complete example,

listOfValues = [31000, 33000, 56500, 43100, 41000, 31333]

df.insert(loc=3, column='Bonus3', value=listOfValues)

print(df)

Output:

     Name Location    Team  Bonus3  Experience  Bonus1   Bonus2
0    Mark       US    Tech   31000           5   31000  31000.0
1    Riti    India    Tech   33000           7   33000  33000.0
2  Shanky    India     PMO   56500           2   56500  56500.0
3  Shreya    India  Design   43100           2   43100  43100.0
4    Aadi       US    Tech   41000          11   41000      NaN
5     Sim       US    Tech   31333           4   31333      NaN
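One thing to keep in mind when re-running examples like the one above: insert() refuses to add a column whose name already exists in the DataFrame (unless allow_duplicates=True is passed). The sketch below illustrates this with a small stand-in DataFrame; the column name 'Bonus' and the values are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Mark', 'Riti'], 'Team': ['Tech', 'Tech']})

# First insert succeeds and modifies df in place
df.insert(loc=1, column='Bonus', value=[31000, 33000])

# Second insert with the same column name raises a ValueError
raised = False
try:
    df.insert(loc=1, column='Bonus', value=[1, 2])
except ValueError as err:
    raised = True
    print("insert() failed:", err)

print(df)
```

In contrast, the [] operator from Method 1 overwrites an existing column with the same name instead of raising an error.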

As with the [] operator, the length of the list must be the same as the number of rows in the DataFrame, otherwise insert() raises an error. For example,

listOfValues = [31000, 33000, 56500, 43100]

df.insert(loc=3, column='Bonus4', value=listOfValues)

print(df)

This raises an error:

ValueError: Length of values (4) does not match length of index (6)

To fix this, we need to make the list the same length as the number of rows in the DataFrame. We can pad the list with NaN for the missing values. For example,

listOfValues = [31000, 33000, 56500, 43100]

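# Pad the list with NaN so its length matches the number of rows
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
if len(listOfValues) < df.shape[0]:
    listOfValues += [np.nan] * (df.shape[0] - len(listOfValues))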

df.insert(loc=3, column='Bonus4', value=listOfValues)

print(df)

Output:

     Name Location    Team   Bonus4  Bonus3  Experience  Bonus1   Bonus2
0    Mark       US    Tech  31000.0   31000           5   31000  31000.0
1    Riti    India    Tech  33000.0   33000           7   33000  33000.0
2  Shanky    India     PMO  56500.0   56500           2   56500  56500.0
3  Shreya    India  Design  43100.0   43100           2   43100  43100.0
4    Aadi       US    Tech      NaN   41000          11   41000      NaN
5     Sim       US    Tech      NaN   31333           4   31333      NaN
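The padding step is the same in both methods, so it can be pulled out into a small helper. The sketch below is one possible way to do that; pad_with_nan is a hypothetical helper name, not a pandas function, and the DataFrame is a smaller stand-in. It also uses df.columns.get_loc() to find the insert position from a column name instead of hard-coding the number.

```python
import numpy as np
import pandas as pd

def pad_with_nan(values, length):
    """Hypothetical helper: return a new list padded with NaN up to `length`."""
    if len(values) > length:
        raise ValueError("list is longer than the number of rows")
    # Build a new list so the caller's list is not modified
    return list(values) + [np.nan] * (length - len(values))

df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky'],
                   'Experience': [5, 7, 2]})
bonus = [31000, 33000]

# Look up the position of 'Experience' and insert the new column before it
position = df.columns.get_loc('Experience')
df.insert(loc=position, column='Bonus', value=pad_with_nan(bonus, len(df)))

print(df)
```

Unlike the += approach shown earlier, this helper leaves the original list unchanged, which avoids surprises if the list is used again later.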

Summary

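We learned two ways to add a list as a new column in a Pandas DataFrame: assigning the list with the [] operator, which appends the column at the end, and calling insert(), which places the column at a specific position. In both cases the list must have as many values as the DataFrame has rows, and a shorter list can be padded with NaN.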
