Create Pandas Dataframe with Random Numbers

In this article, we will discuss how to create a DataFrame of random integers or floats.

Create Dataframe with Random Integers using randint()

The numpy module provides several random number routines, and one of them is randint(). It returns a numpy array of random integers drawn from a given range. We can also specify the shape of the random numpy array, i.e. it can be 1D, 2D, 3D etc. We can create a numpy array of random numbers with it and then use that array to create a DataFrame of random numbers. Let’s first learn more about numpy.random.randint().

Syntax of numpy.random.randint():

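numpy.random.randint(low, high=None, size=None, dtype=int)

where,
1. low is the lowest integer that can be drawn (inclusive).
2. high is one above the largest integer that can be drawn, i.e. the upper bound is exclusive. If high is omitted, values are drawn from 0 up to, but not including, low.
3. size specifies the shape of the returned numpy array. By default it is None, in which case a single integer is returned.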
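As a quick illustration of these parameters, here is a minimal sketch (the variable names are only for this example) that checks the bounds and the effect of the size argument:

```python
import numpy as np

# Draw many values so the bounds are easy to check:
# 10 can appear, but 25 never does.
samples = np.random.randint(10, 25, size=1000)
print(samples.min() >= 10, samples.max() <= 24)   # True True

# With size omitted, a single integer is returned instead of an array.
single = np.random.randint(10, 25)

# A tuple for size gives a multi-dimensional array.
grid = np.random.randint(10, 25, size=(2, 3))
print(grid.shape)   # (2, 3)
```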

Create DataFrame with one column of random numbers

Generate a one-dimensional numpy array of random numbers using randint(). Then create a single-column DataFrame and use this numpy array to populate the values in the column. Let’s understand this with an example,

Example: In this example, we create a numpy array of 5 random integers from 10 to 24 (the upper bound 25 passed to randint() is excluded), then populate the DataFrame column with those values.

import pandas as pd
import numpy as np

# Create 5 random integers from 10 to 24 (25 is excluded)
random_data = np.random.randint(10, 25, size=5)

# Create DataFrame with a single column of random values
df = pd.DataFrame(random_data, columns=['RANDOM VALUES'])

# Display the Dataframe
print(df)

Output:

   RANDOM VALUES
0             20
1             13
2             24
3             17
4             19

Here we created a Dataframe with only one column named ‘RANDOM VALUES’.

Create DataFrame with multiple columns of Random Numbers

We can generate a 2D numpy array of random numbers using numpy.random.randint() and then pass it to pandas.DataFrame() to create a multi-column DataFrame of random values.

Let’s see an example where we will first create a 2D NumPy Array of random values. This 2D array has five rows and three columns,

import numpy as np

# Create 2D Numpy array of 5 rows and 3 columns,
# filled with random integers from 10 to 24 (25 is excluded)
random_data = np.random.randint(10,25,size=(5,3))

Then use this NumPy Array of random values to create a Dataframe of five rows and three columns,

import pandas as pd

# Create a Dataframe with random values
# using 2D numpy Array
df = pd.DataFrame(random_data, columns=['Column_1','Column_2','Column_3'])

Check out the complete example,

import pandas as pd
import numpy as np

# Create 2D Numpy array of 5 rows and 3 columns,
# filled with random integers from 10 to 24 (25 is excluded)
random_data = np.random.randint(10,25,size=(5,3))

# Create a Dataframe with random values
# using 2D numpy Array
df = pd.DataFrame(random_data, columns=['Column_1','Column_2','Column_3'])

# Display the Dataframe
print(df)

Output:

   Column_1  Column_2  Column_3
0        16        15        20
1        19        20        24
2        20        20        13
3        11        16        18
4        16        17        20

Here we create a Dataframe filled with random integers.
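To confirm that the shape of the 2D array carries over to the DataFrame and that every value stays inside the requested range, a small check like the following can help. This is only a sketch; the in_range variable is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# 5 rows and 3 columns of random integers from 10 to 24
random_data = np.random.randint(10, 25, size=(5, 3))
df = pd.DataFrame(random_data, columns=['Column_1', 'Column_2', 'Column_3'])

# The array shape (rows, columns) becomes the DataFrame shape
print(df.shape)   # (5, 3)

# True only if every value in every column is >= 10 and < 25
in_range = ((df >= 10) & (df < 25)).all().all()
print(in_range)   # True
```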

Create DataFrame of random numbers and convert values to string

Just like in the previous solutions, we can create a DataFrame of random integers using randint() and then convert the data type of all values in all columns to string using astype(str), i.e.

import pandas as pd
import numpy as np

# Create 2D Numpy array of 5 rows and 3 columns,
# filled with random integers from 10 to 24 (25 is excluded)
random_data = np.random.randint(10,25,size=(5,3))

# Create a Dataframe with random values
# using 2D numpy Array
df = pd.DataFrame(random_data, columns=['Column_1','Column_2','Column_3'])
df = df.astype(str)

# Display the Dataframe
print(df)

print('Data types of all columns: ')
print(df.dtypes)

Output:

  Column_1 Column_2 Column_3
0       12       11       20
1       21       10       11
2       24       15       12
3       20       17       20
4       13       24       19

Data types of all columns: 

Column_1    object
Column_2    object
Column_3    object
dtype: object
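One way to see the effect of the conversion is to look at a single cell and to combine two columns. The sketch below assumes the same column names as above; the names first and joined are only for illustration. After astype(str), the + operator concatenates text instead of adding numbers.

```python
import numpy as np
import pandas as pd

random_data = np.random.randint(10, 25, size=(5, 3))
df = pd.DataFrame(random_data, columns=['Column_1', 'Column_2', 'Column_3'])
df = df.astype(str)

# Each cell now holds a Python string rather than an integer
first = df.iloc[0, 0]
print(type(first).__name__)   # str

# '+' joins the text of two columns, e.g. '12' and '11' give '12-11'
joined = df['Column_1'] + '-' + df['Column_2']
print(joined)
```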

Create Pandas Dataframe with Random float values

Use np.random.rand() to create a 2D numpy array filled with random floats in the range [0, 1), i.e. 0 is included and 1 is excluded. Now suppose you want random values from 10 to 20 instead. In that case, multiply all values in the numpy array by 10 (the width of the range) and add 10 (the lower bound), i.e.

import numpy as np

# Create 2D Numpy array of 5 rows and 3 columns,
# filled with random values from 0 to 1
random_data = np.random.rand(5,3)

# Create Numpy Array with random floats from 10 to 20
random_data = 10 + random_data*10

print(random_data)

Use this 2D Numpy array to generate a Dataframe of random float values,

import pandas as pd
import numpy as np

# Create 2D Numpy array of 5 rows and 3 columns,
# filled with random values from 0 to 1
random_data = np.random.rand(5,3)

# Create Numpy Array with random floats from 10 to 20
random_data = 10 + random_data*10

# Create a Dataframe with random values
# using 2D numpy Array
df = pd.DataFrame(random_data, columns=['Column_1','Column_2','Column_3'])

# Display the Dataframe
print(df)

Output:

    Column_1   Column_2   Column_3
0  14.240746  18.295825  19.396178
1  12.223251  11.730770  12.090752
2  18.435215  17.188767  13.710343
3  17.358443  16.031840  15.464308
4  12.985251  13.042926  16.485127
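The same scaling works for any range: multiply by the width of the range and add the lower bound. The following is a minimal sketch of that idea; random_float_frame() is a hypothetical helper written for this example, not a pandas or numpy function.

```python
import numpy as np
import pandas as pd

def random_float_frame(low, high, rows, cols):
    """Hypothetical helper: DataFrame of random floats in [low, high)."""
    # rand() gives values in [0, 1); scale by the width, then shift by low
    data = low + (high - low) * np.random.rand(rows, cols)
    columns = [f'Column_{i + 1}' for i in range(cols)]
    return pd.DataFrame(data, columns=columns)

# 5 rows and 3 columns of random floats from 10 to 20
df = random_float_frame(10, 20, 5, 3)
print(df)
```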

Summary

In this article, we learned how to create a DataFrame with random integers or floats using the numpy module’s random routines.
