Add Column with random values in Pandas DataFrame

In this article, we will discuss different ways to add a column with random values in a DataFrame in Pandas.

Table Of Contents

Preparing DataSet
Method 1: Using np.random.randint()
Method 2: Using np.random.choice()
Method 3: Using random.sample()
Summary

Preparing DataSet

First, we will create a DataFrame from a list of tuples:

import pandas as pd
import numpy as np

# List of Tuples
employees = [('Mark', 'US', 'Tech', 5),
             ('Riti', 'India', 'Tech', 7),
             ('Shanky', 'India', 'PMO', 2),
             ('Shreya', 'India', 'Design', 2),
             ('Aadi', 'US', 'Tech', 11),
             ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

Output:

     Name Location    Team  Experience
0    Mark       US    Tech           5
1    Riti    India    Tech           7
2  Shanky    India     PMO           2
3  Shreya    India  Design           2
4    Aadi       US    Tech          11
5     Sim       US    Tech           4

Now we will look at different ways to add a column with random values to this DataFrame.

Method 1: Using np.random.randint()

The randint() function from NumPy's random module returns random integers from low (inclusive) to high (exclusive). For example:

np.random.randint(30000, 40000, df.shape[0])

It returns N random integers from 30000 up to, but not including, 40000, where N is the number of rows in the DataFrame. So, to add a column ‘Bonus1’ with random values, we assign the returned integers to the new column:

# Add a new column 'Bonus1' with random integers from 30000 to 39999
df['Bonus1'] = np.random.randint(30000, 40000, df.shape[0])

print(df)

Output:

     Name Location    Team  Experience  Bonus1
0    Mark       US    Tech           5   33398
1    Riti    India    Tech           7   30979
2  Shanky    India     PMO           2   32430
3  Shreya    India  Design           2   32870
4    Aadi       US    Tech          11   39744
5     Sim       US    Tech           4   36133

It added a new column ‘Bonus1’ with random integers.
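Because the values change on every run, it can help to fix a seed while testing. The following is a minimal sketch, not part of the original example: it assumes a small stand-in DataFrame named demo and an arbitrary seed of 42, and it also checks that the upper bound passed to randint() is excluded.

```python
import numpy as np
import pandas as pd

# Small stand-in DataFrame (hypothetical data, used only for this sketch)
demo = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky']})

# Seeding the generator makes the "random" values repeatable between runs
np.random.seed(42)  # 42 is an arbitrary choice
first = np.random.randint(30000, 40000, demo.shape[0])

# Re-seeding with the same value reproduces the same sequence
np.random.seed(42)
second = np.random.randint(30000, 40000, demo.shape[0])

demo['Bonus1'] = first

# Same seed, same values
print((first == second).all())

# The upper bound is exclusive, so every value lies in 30000..39999
print(demo['Bonus1'].between(30000, 39999).all())
```

Without the seed() calls, the two arrays would almost certainly differ.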

Method 2: Using np.random.choice()

NumPy's random module provides a function choice(). It accepts a 1-D array and an integer N as arguments, and returns N random samples from the given 1-D array. By default it samples with replacement, so the same value can be picked more than once. We can assign the result to a new column in the DataFrame. Let’s see an example,

# Add a new column 'Bonus2' with random values picked from three given numbers
df['Bonus2'] = np.random.choice([31111, 32222, 33333], df.shape[0])

print(df)

Output:

     Name Location    Team  Experience  Bonus1  Bonus2
0    Mark       US    Tech           5   36488   33333
1    Riti    India    Tech           7   32159   31111
2  Shanky    India     PMO           2   38785   31111
3  Shreya    India  Design           2   32592   33333
4    Aadi       US    Tech          11   34777   32222
5     Sim       US    Tech           4   32469   33333

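It added a new column ‘Bonus2’ whose values are picked at random from the three given numbers. The ‘Bonus1’ values differ from the previous output because the random values are regenerated each time the script runs.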
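If some values should appear more often than others, choice() also accepts a p argument with one probability per value. Here is a minimal sketch, assuming a stand-in DataFrame named demo and made-up probabilities of 0.5, 0.3 and 0.2 (they must sum to 1):

```python
import numpy as np
import pandas as pd

# Small stand-in DataFrame (hypothetical data, used only for this sketch)
demo = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky', 'Shreya']})

amounts = [31111, 32222, 33333]

# Default behaviour: sampling with replacement, each amount equally likely
demo['Bonus2'] = np.random.choice(amounts, demo.shape[0])

# Weighted sampling: p holds one probability per amount and must sum to 1
# (the weights below are illustrative assumptions)
demo['Bonus2_weighted'] = np.random.choice(amounts, demo.shape[0],
                                           p=[0.5, 0.3, 0.2])

print(demo)
```

With four rows and only three possible amounts, at least one amount has to repeat, which is only possible because sampling is done with replacement.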

Method 3: Using random.sample()

Instead of using the numpy module for random values, we can also use the random module from Python's standard library. It has a function sample(), which accepts a sequence (such as a range) and a number N as arguments, and returns N random values picked from that sequence without replacement, so no value from the range is repeated. We can assign these random values to a new column. Let’s see an example,

import random

# Add a new column 'Bonus3' with random numbers from 30000 to 39999
df['Bonus3'] =  random.sample(range(30000, 40000), df.shape[0])

print(df)

Output:

     Name Location    Team  Experience  Bonus1  Bonus2  Bonus3
0    Mark       US    Tech           5   35663   32222   33204
1    Riti    India    Tech           7   32674   31111   32618
2  Shanky    India     PMO           2   39335   33333   38976
3  Shreya    India  Design           2   31983   32222   32764
4    Aadi       US    Tech          11   38834   33333   32189
5     Sim       US    Tech           4   31147   32222   31582

It added a new column ‘Bonus3’ with random integers.
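Two properties of sample() are worth checking: the values it returns from a range never repeat, and it raises a ValueError when asked for more values than the range contains. A minimal sketch, assuming a stand-in DataFrame named demo (the flag too_many_ok is a hypothetical helper for this illustration):

```python
import random
import pandas as pd

# Small stand-in DataFrame (hypothetical data, used only for this sketch)
demo = pd.DataFrame({'Name': ['Mark', 'Riti', 'Shanky']})

# sample() picks without replacement, so the values are all distinct
demo['Bonus3'] = random.sample(range(30000, 40000), demo.shape[0])
print(demo['Bonus3'].is_unique)

# Asking for more values than the range holds raises ValueError
try:
    random.sample(range(3), 5)
    too_many_ok = True
except ValueError:
    too_many_ok = False

print(too_many_ok)
```

So the range passed to sample() must contain at least as many values as the DataFrame has rows.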

Summary

We learned about three different ways to add a column with random values to a DataFrame. Thanks.
