Convert JSON to a Pandas Dataframe

This article will discuss how to convert JSON to pandas Dataframe.

JSON stands for JavaScript Object Notation that stores the data in key-value pair format, inside the list/dictionary data structure. A DataFrame is a data structure that keeps the data in rows and columns.

We can convert JSON string or a JSON file to Pandas Dataframe using two techniques,

Table of Contents

Convert JSON to Dataframe using read_json() method

In Python, the Pandas module provides a method read_json() to convert JSON to a Dataframe. It can read the JSON contents from a file or use a JSON string directly and transform them into the dataframe.

Advertisements

Syntax is as follows

pandas.read_json('string/path/file_name', orient)

Parameters:

  • First Parameter : ‘string/path/file_name’
    • It accepts the JSON input as the first parameter. It can be in three formats,
    • 1. A JSON string: It can convert JSON string into pandas dataframe.
    • 2. A path to the JSON file: We can specify the JSON file name along with the path.
    • 3. The JSON file’s name: If the JSON file is in the current directory, we can specify its name only.
  • Second parameter: orient
    • Specify the orientation of the JSON string i.e. the JSON string format.

Now let’s see how we can take JSON input as a string and convert it into a Pandas Dataframe.

Convert record-oriented JSON string into Pandas Dataframe

Here, we will use a JSON string that stores the data as a dictionary of values surrounded by list. For such type of data we will use the ‘records’ orientation.

Syntax:

pandas.read_json('string/path/file_name.json', orient='records')

Example:

In this example, we created the json string with three key-value pairs in 4 rows. Let’s convert this json string to pandas dataframe

import pandas as pd

# Create json data with student details
json_data = '''
[
    { "id": "1", "name": "sravan","age":22 },
    { "id": "2", "name": "harsha","age":22 },
    { "id": "3", "name": "deepika","age":21 },
    { "id": "4", "name": "jyothika","age":23 }
] '''

# Convert JSON string with records orient to a Dataframe
df = pd.read_json(json_data, orient ='records')

# Display the Dataframe
print(df)

Output:

   id      name  age
0   1    sravan   22
1   2    harsha   22
2   3   deepika   21
3   4  jyothika   23

Instead of JSON string as a first parameter, we can also provide the json file path. In that case it will read the JSON string from file and convert it into a Pandas Dataframe.

Convert index oriented JSON string into Pandas Dataframe

Here, we will consider the json string which contains the data in a dictionary of values surrounded by dictionary with key as index. For such type of data we will use the ‘index’ orientation.

Syntax:

pandas.read_json('string/path/file_name.json', orient='index')

Example:

In this example, we created the json string with three key-value pairs in 4 rows with each row specified by index.
Let’s convert this json string to pandas dataframe. we are specifying index as “student-1” to “student-4”

import pandas as pd

# Create JSON data with student details
json_data = '''
{
    "student-1": { "id": "1", "name": "sravan","age":22 },
    "student-2":{ "id": "2", "name": "harsha","age":22 },
    "student-3": { "id": "3", "name": "deepika","age":21 },
    "student-4": { "id": "4", "name": "jyothika","age":23 }
} '''

# Convert JSON file with index orient
df = pd.read_json(  json_data,
                    orient ='index')

# Display the Dataframe
print(df)

Output:

           id      name  age
student-1   1    sravan   22
student-2   2    harsha   22
student-3   3   deepika   21
student-4   4  jyothika   23

Here , the indices will be the rows in the dataframe.

Convert columns oriented json string into Pandas Dataframe

Here, we will consider the json string which contains the data in a dictionary of values surrounded by dictionary with key as index. That index will be the columns in the dataframe. For such type of data we will use the ‘ columns ‘ orientation.

Syntax:

pandas.read_json('string/path/file_name.json', orient='columns')

Example:

In this example, we created the json string with three key-value pairs in 4 rows with each row specified by index.
Let’s convert this json string to pandas dataframe. we are specifying index as “student-1” to “student-4”

import pandas as pd

# Create JSON data with student details
json_data = '''
{
    "student-1": { "id": "1", "name": "sravan","age":22 },
    "student-2":{ "id": "2", "name": "harsha","age":22 },
    "student-3": { "id": "3", "name": "deepika","age":21 },
    "student-4": { "id": "4", "name": "jyothika","age":23 }
}
'''

# Convert JSON file with columns orient
df = pd.read_json(  json_data,
                    orient ='columns')

# Display the Dataframe
print(df)

Output:

     student-1 student-2 student-3 student-4
id           1         2         3         4
name    sravan    harsha   deepika  jyothika
age         22        22        21        23

Here , the column indices will be the columns in the dataframe

Convert values oriented JSON string into Pandas Dataframe

Here, we will consider the json string which contains the data in a list of values surrounded by a list. For such type of data we will use the ‘values ‘ orientation.

Syntax:

pandas.read_json('string/path/file_name.json', orient='values')

Example:

In this example, we created the json string with three key-value pairs in 4. Let’s convert this json string to pandas dataframe.

import pandas as pd

# Create JSON data with student details
json_data = '''
[
    [ "1", "sravan",22 ],
    [ "2", "harsha",22 ],
    [ "3", "deepika",21 ],
    [ "4", "jyothika",23 ]
]
'''

# convert JSON file with values orientation to Dataframe
df = pd.read_json(  json_data,
                    orient ='values')

# Display the Dataframe
print(df)

Output:

   0         1   2
0  1    sravan  22
1  2    harsha  22
2  3   deepika  21
3  4  jyothika  23

Here , the columns and rows by default starts with 0.

Convert JSON to Dataframe using json_normalize()

This method first normalize the json data and then converts into the pandas dataframe. We have to import this method from the pandas module.

Steps are as follows,

Step 1 – Load the json data

We can do this by using json.loads() function. Syntax is as follows,

json.loads(json_string)

Step 2 – Pass the loaded data into json_normalize() method

json_normalize(json.loads(json_string))

Example:

In this example, we will create 4 students details of json string and convert into pandas dataframe.

import pandas as pd
import json

# Create json string
# with student details
json_string = '''
[
    { "id": "1", "name": "sravan","age":22 },
    { "id": "2", "name": "harsha","age":22 },
    { "id": "3", "name": "deepika","age":21 },
    { "id": "4", "name": "jyothika","age":23 }
]
'''

# Load json data and convert to Dataframe 
df = pd.json_normalize(json.loads(json_string)) 

# Display the Dataframe
print(df)

Output:

  id      name  age
0  1    sravan   22
1  2    harsha   22
2  3   deepika   21
3  4  jyothika   23

Here the row positions will starts with 0 by default.

Summary

In this article we used read_json() and json_normalize() methods to convert json to Pandas DataFrame.

Pandas Tutorials -Learn Data Analysis with Python

   

Are you looking to make a career in Data Science with Python?

Data Science is the future, and the future is here now. Data Scientists are now the most sought-after professionals today. To become a good Data Scientist or to make a career switch in Data Science one must possess the right skill set. We have curated a list of Best Professional Certificate in Data Science with Python. These courses will teach you the programming tools for Data Science like Pandas, NumPy, Matplotlib, Seaborn and how to use these libraries to implement Machine learning models.

Checkout the Detailed Review of Best Professional Certificate in Data Science with Python.

Remember, Data Science requires a lot of patience, persistence, and practice. So, start learning today.

Join a LinkedIn Community of Python Developers

Leave a Comment

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top