In this article, we will discuss different ways to add a list as a new column to a Pandas DataFrame.
Table Of Contents
- Preparing DataSet
- Method 1: Using [] Operator
- Method 2: Using insert() function
- Summary
Preparing DataSet
First, we will create a DataFrame from a list of tuples i.e.
import pandas as pd
import numpy as np

# List of tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'India', 'Tech', 7),
             ('Shanky', 'India', 'PMO', 2),
             ('Shreya', 'India', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from the list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)
Output:
     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4
Now we will look at different ways to add a list as a new column to this DataFrame.
Method 1: Using [] Operator
Suppose we have a list of values, and the length of the list is equal to the number of rows in the DataFrame i.e.
[31000, 33000, 56500, 43100, 41000, 31333]
Now, to add this list as a new column in the DataFrame, we can directly assign the list to df['New Column'], where 'New Column' is the name of the column to create. Let’s see an example,
listOfValues = [31000, 33000, 56500, 43100, 41000, 31333]

# Assign the list as a new column named 'Bonus1'
df['Bonus1'] = listOfValues
print(df)
Output:
     Name Location    Team  Experience  Bonus1
0    Mark       US    Tech           5   31000
1    Riti    India    Tech           7   33000
2  Shanky    India     PMO           2   56500
3  Shreya    India  Design           2   43100
4    Aadi       US    Tech          11   41000
5     Sim       US    Tech           4   31333
It added the list values as a new column ‘Bonus1’ in the DataFrame. But please make sure that the length of the list is the same as the number of rows in the DataFrame, otherwise pandas will raise a ValueError. For example,
# Only 4 values for a DataFrame with 6 rows
listOfValues = [31000, 33000, 56500, 43100]

df['Bonus2'] = listOfValues
print(df)
It will give an error,
ValueError: Length of values (4) does not match length of index (6)
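The failed assignment leaves the DataFrame unchanged. As a small sketch (the shortened one-column DataFrame and the names short_list and error_message are only for illustration), we can catch the error and confirm that no column was added,

```python
import pandas as pd

# Illustrative DataFrame with 6 rows, reduced to a single column
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim']})

# Only 4 values for 6 rows
short_list = [31000, 33000, 56500, 43100]

error_message = None
try:
    df['Bonus2'] = short_list
except ValueError as err:
    # pandas checks the length before it modifies the DataFrame
    error_message = str(err)
    print('Could not add column:', error_message)

# The column was not created
print('Bonus2' in df.columns)
```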
To handle this kind of issue, we need to pad the list until its length equals the number of rows in the DataFrame. For the missing values, we can add NaN to it. Let’s see an example,
listOfValues = [31000, 33000, 56500, 43100]

# Pad the list with NaN until it has one value per row
if len(listOfValues) < df.shape[0]:
    listOfValues += [np.nan] * (df.shape[0] - len(listOfValues))

df['Bonus2'] = listOfValues
print(df)
Output:
     Name Location    Team  Experience  Bonus1   Bonus2
0    Mark       US    Tech           5   31000  31000.0
1    Riti    India    Tech           7   33000  33000.0
2  Shanky    India     PMO           2   56500  56500.0
3  Shreya    India  Design           2   43100  43100.0
4    Aadi       US    Tech          11   41000      NaN
5     Sim       US    Tech           4   31333      NaN
It added the list as a column, and filled the missing values with NaN. Because NaN is a floating point value, the whole column is stored as float.
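As an alternative to padding the list by hand, the list can be wrapped in a pandas Series before assignment. This is a minimal sketch, and it assumes the DataFrame has the default index 0, 1, 2, ... (the shortened DataFrame and the name short_list are only for illustration). On assignment, pandas aligns the Series with the DataFrame by index label, so rows without a matching label get NaN,

```python
import pandas as pd

# Illustrative DataFrame with 6 rows and the default index 0..5
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim']})

short_list = [31000, 33000, 56500, 43100]

# The Series gets index labels 0..3. Assignment aligns on these labels,
# so rows 4 and 5 have no matching value and are filled with NaN.
df['Bonus2'] = pd.Series(short_list)
print(df)
```

Note that this alignment is by label, not by position. If the DataFrame has a custom index, or the list is longer than the DataFrame, values whose labels do not exist in the DataFrame are silently dropped, so the explicit padding shown earlier is the safer choice in those cases.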
Method 2: Using insert() function
If we want to add a list as a column at a specific position, then we can use the insert() function of the DataFrame. We will pass three arguments to it i.e.
- The location of new column. We will pass 3 here.
- The new column name. We will pass ‘Bonus3’ here.
- New column values. We will pass a list here.
It will add the list as a new column at index position 3 i.e. as the fourth column in the DataFrame. Let’s see the complete example,
listOfValues = [31000, 33000, 56500, 43100, 41000, 31333]

# Insert the list as column 'Bonus3' at index position 3
df.insert(loc=3, column='Bonus3', value=listOfValues)
print(df)
Output:
     Name Location    Team  Bonus3  Experience  Bonus1   Bonus2
0    Mark       US    Tech   31000           5   31000  31000.0
1    Riti    India    Tech   33000           7   33000  33000.0
2  Shanky    India     PMO   56500           2   56500  56500.0
3  Shreya    India  Design   43100           2   43100  43100.0
4    Aadi       US    Tech   41000          11   41000      NaN
5     Sim       US    Tech   31333           4   31333      NaN
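Two details of insert() are easy to miss: it modifies the DataFrame in place and returns None, and it raises a ValueError if a column with the same name already exists, which happens when the same code is run twice. The following sketch illustrates both points on a small DataFrame made up for this example (the names values and result are also only illustrative),

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'Name': ['Mark', 'Riti'], 'Experience': [5, 7]})
values = [31000, 33000]

# insert() changes df itself and returns None
result = df.insert(loc=1, column='Bonus3', value=values)
print(result)

# Inserting the same column name again raises a ValueError
second_insert_failed = False
try:
    df.insert(loc=1, column='Bonus3', value=values)
except ValueError as err:
    second_insert_failed = True
    print('Second insert failed:', err)

print(list(df.columns))
```

So the result of insert() should not be assigned back to df, and a check like 'Bonus3' not in df.columns before calling it avoids the error on a re-run.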
Please make sure that the length of the list is the same as the number of rows in the DataFrame, otherwise you will get an error. For example,
# Only 4 values for a DataFrame with 6 rows
listOfValues = [31000, 33000, 56500, 43100]

df.insert(loc=3, column='Bonus4', value=listOfValues)
print(df)
It will give an error,
ValueError: Length of values (4) does not match length of index (6)
To fix this, we need to pad the list until its length equals the number of rows in the DataFrame. For the missing values, we can add NaN to it. Let’s see an example,
listOfValues = [31000, 33000, 56500, 43100]

# Pad the list with NaN until it has one value per row
if len(listOfValues) < df.shape[0]:
    listOfValues += [np.nan] * (df.shape[0] - len(listOfValues))

df.insert(loc=3, column='Bonus4', value=listOfValues)
print(df)
Output:
     Name Location    Team   Bonus4  Bonus3  Experience  Bonus1   Bonus2
0    Mark       US    Tech  31000.0   31000           5   31000  31000.0
1    Riti    India    Tech  33000.0   33000           7   33000  33000.0
2  Shanky    India     PMO  56500.0   56500           2   56500  56500.0
3  Shreya    India  Design  43100.0   43100           2   43100  43100.0
4    Aadi       US    Tech      NaN   41000          11   41000      NaN
5     Sim       US    Tech      NaN   31333           4   31333      NaN
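The padding and the insert can also be combined in one place. The function below is a hypothetical helper, not part of pandas, and its name insert_padded and its parameters are assumptions made for this sketch. It looks up the position of an existing column with df.columns.get_loc() instead of hard-coding the number 3, pads a copy of the list with NaN, and rejects lists that are longer than the DataFrame,

```python
import numpy as np
import pandas as pd

def insert_padded(df, before, column, values):
    """Hypothetical helper: insert `values` as a new column named `column`
    just before the existing column `before`, padding a short list with NaN."""
    n_rows = len(df)
    if len(values) > n_rows:
        raise ValueError(f'Got {len(values)} values for {n_rows} rows')
    # Build a new list so that the caller's list is not modified
    padded = list(values) + [np.nan] * (n_rows - len(values))
    # get_loc() returns the integer position of a unique column label
    df.insert(loc=df.columns.get_loc(before), column=column, value=padded)

# Illustrative DataFrame with 6 rows
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim'],
                   'Experience': [5, 7, 2, 2, 11, 4]})

bonus = [31000, 33000, 56500, 43100]
insert_padded(df, before='Experience', column='Bonus4', values=bonus)
print(df)
```

Unlike the += used in the examples above, this version leaves the original list untouched, which matters if the list is reused later in the program.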
Summary
We learned about different ways to add a list as a new column in the DataFrame in Pandas. Thanks.