In this article, we will discuss different ways to add a column with random values to a Pandas DataFrame.
Preparing DataSet
First, we will create a DataFrame from a list of tuples, i.e.
import pandas as pd
import numpy as np

# List of Tuples
employees = [('Mark',   'US',    'Tech',   5),
             ('Riti',   'India', 'Tech',   7),
             ('Shanky', 'India', 'PMO',    2),
             ('Shreya', 'India', 'Design', 2),
             ('Aadi',   'US',    'Tech',   11),
             ('Sim',    'US',    'Tech',   4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])

print(df)
Output:
     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4
Now we will learn about different ways to add a column in this DataFrame with random values.
Method 1: Using np.random.randint()
The randint() function from NumPy's random module returns random integers from low (inclusive) to high (exclusive), i.e.
np.random.randint(30000, 40000, df.shape[0])
It will return N random integers from 30000 to 39999 (the upper bound 40000 is excluded), where N is the number of rows in the DataFrame. So, to add a column ‘Bonus1’ with random values, we need to assign these returned integers to the new column, i.e.
# Add a new column 'Bonus1' with random numbers from 30000 to 39999
df['Bonus1'] = np.random.randint(30000, 40000, df.shape[0])

print(df)
Output:
     Name Location    Team  Experience  Bonus1
0    Mark       US    Tech           5   33398
1    Riti    India    Tech           7   30979
2  Shanky    India     PMO           2   32430
3  Shreya    India  Design           2   32870
4    Aadi       US    Tech          11   39744
5     Sim       US    Tech           4   36133
It added a new column ‘Bonus1’ with random integers.
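Since np.random.randint() produces different numbers on every run, it can help to seed the generator when the same values are needed again. The following is a minimal sketch, not part of the original example: the three-row stand-in DataFrame and the seed value 42 are assumptions chosen only for illustration. It also checks that every value stays below the excluded upper bound.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the employee DataFrame used in this article
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky']})

# Seed NumPy's global generator so repeated runs give the same values
# (the seed 42 is an arbitrary choice for illustration)
np.random.seed(42)
df['Bonus1'] = np.random.randint(30000, 40000, df.shape[0])

# Re-seeding and drawing again reproduces the same numbers
np.random.seed(42)
repeat = np.random.randint(30000, 40000, df.shape[0])

# Every value lies in the half-open interval [30000, 40000)
in_range = df['Bonus1'].between(30000, 39999).all()

print(df)
print(in_range)
```

Without the seed call, the column would simply differ from run to run, which is also why the outputs shown in this article will not match yours exactly.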
Method 2: Using np.random.choice()
The numpy module provides a function random.choice(). It accepts a 1-D array and an integer N as arguments, and returns N random samples from the given 1-D array. By default it samples with replacement, so the same value can be picked more than once. We can assign the result to a new column in the DataFrame. Let’s see an example,
# Add a new column 'Bonus2' with random numbers from three given numbers
df['Bonus2'] = np.random.choice([31111, 32222, 33333], df.shape[0])

print(df)
Output:
     Name Location    Team  Experience  Bonus1  Bonus2
0    Mark       US    Tech           5   36488   33333
1    Riti    India    Tech           7   32159   31111
2  Shanky    India     PMO           2   38785   31111
3  Shreya    India  Design           2   32592   33333
4    Aadi       US    Tech          11   34777   32222
5     Sim       US    Tech           4   32469   33333
It added a new column ‘Bonus2’ with random integers.
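np.random.choice() also takes two optional parameters that change how values are picked: p assigns a probability to each candidate, and replace=False prevents a value from being chosen twice. The sketch below is an illustration under assumptions: the stand-in DataFrame, the column names 'Bonus2_weighted' and 'Bonus2_unique', and the probabilities are hypothetical and do not come from the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the employee DataFrame used in this article
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim']})
options = [31111, 32222, 33333]

# Default behaviour: sampling with replacement, so values may repeat
df['Bonus2'] = np.random.choice(options, df.shape[0])

# p gives one probability per option and must sum to 1;
# here 31111 is the most likely pick (illustrative weights)
df['Bonus2_weighted'] = np.random.choice(options, df.shape[0], p=[0.6, 0.3, 0.1])

# replace=False samples without replacement, so the pool needs
# at least as many items as there are rows
pool = np.arange(30000, 40000)
df['Bonus2_unique'] = np.random.choice(pool, df.shape[0], replace=False)

print(df)
```

Note that replace=False with the three-item list would raise an error here, because six rows cannot be filled from three distinct values.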
Method 3: Using random.sample()
Instead of relying on the numpy module for random values, we can also use the random module from Python's standard library. It has a function sample(), which accepts a range and a number N as arguments, and returns N unique random values from the given range (it samples without replacement). We can assign these random values to a new column. Let’s see an example,
import random

# Add a new column 'Bonus3' with random numbers from 30000 to 39999
df['Bonus3'] = random.sample(range(30000, 40000), df.shape[0])

print(df)
Output:
     Name Location    Team  Experience  Bonus1  Bonus2  Bonus3
0    Mark       US    Tech           5   35663   32222   33204
1    Riti    India    Tech           7   32674   31111   32618
2  Shanky    India     PMO           2   39335   33333   38976
3  Shreya    India  Design           2   31983   32222   32764
4    Aadi       US    Tech          11   38834   33333   32189
5     Sim       US    Tech           4   31147   32222   31582
It added a new column ‘Bonus3’ with random integers.
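One practical consequence of random.sample() is that it draws without replacement: the values are always distinct, and it fails if the range holds fewer values than the DataFrame has rows. The sketch below illustrates this and shows random.choices() as a standard-library alternative that allows repeats. The stand-in DataFrame, the 'Bonus3_repeat' column and the sample_failed flag are hypothetical names introduced only for this illustration.

```python
import random
import pandas as pd

# Hypothetical stand-in for the employee DataFrame used in this article
df = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim']})

# random.sample() draws without replacement, so all values are distinct
df['Bonus3'] = random.sample(range(30000, 40000), df.shape[0])

# Asking for more values than the range contains raises ValueError
try:
    random.sample(range(3), df.shape[0])
    sample_failed = False
except ValueError:
    sample_failed = True

# random.choices() draws with replacement, so a small pool is fine
df['Bonus3_repeat'] = random.choices([31111, 32222, 33333], k=df.shape[0])

print(df)
print(sample_failed)
```

If duplicates are acceptable, random.choices() is the safer pick when the pool might be smaller than the number of rows.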
Summary
We learned about three different ways to add a column with random values to a DataFrame. Thanks.