This article will discuss how to convert JSON to pandas Dataframe.
JSON stands for JavaScript Object Notation that stores the data in key-value pair format, inside the list/dictionary data structure. A DataFrame is a data structure that keeps the data in rows and columns.
We can convert JSON string or a JSON file to Pandas Dataframe using two techniques,
Table of Contents
- Convert JSON to Dataframe using read_json() method
- Convert JSON to Dataframe using json_normalize() method
Convert JSON to Dataframe using read_json() method
In Python, the Pandas module provides a method read_json() to convert JSON to a Dataframe. It can read the JSON contents from a file or use a JSON string directly and transform them into the dataframe.
Syntax is as follows
pandas.read_json('string/path/file_name', orient)
Parameters:
Frequently Asked:
- Add a column with current datetime in Pandas DataFrame
- How to compare two columns in a pandas DataFrame?
- Pandas : Convert a DataFrame into a list of rows or columns in python | (list of lists)
- Drop first row of pandas dataframe (3 Ways)
- First Parameter : ‘string/path/file_name’
- It accepts the JSON input as the first parameter. It can be in three formats,
- 1. A JSON string: It can convert JSON string into pandas dataframe.
- 2. A path to the JSON file: We can specify the JSON file name along with the path.
- 3. The JSON file’s name: If the JSON file is in the current directory, we can specify its name only.
- Second parameter: orient
- Specify the orientation of the JSON string i.e. the JSON string format.
Now let’s see how we can take JSON input as a string and convert it into a Pandas Dataframe.
Convert record-oriented JSON string into Pandas Dataframe
Here, we will use a JSON string that stores the data as a dictionary of values surrounded by list. For such type of data we will use the ‘records’ orientation.
Syntax:
pandas.read_json('string/path/file_name.json', orient='records')
Example:
In this example, we created the json string with three key-value pairs in 4 rows. Let’s convert this json string to pandas dataframe
import pandas as pd # Create json data with student details json_data = ''' [ { "id": "1", "name": "sravan","age":22 }, { "id": "2", "name": "harsha","age":22 }, { "id": "3", "name": "deepika","age":21 }, { "id": "4", "name": "jyothika","age":23 } ] ''' # Convert JSON string with records orient to a Dataframe df = pd.read_json(json_data, orient ='records') # Display the Dataframe print(df)
Output:
id name age 0 1 sravan 22 1 2 harsha 22 2 3 deepika 21 3 4 jyothika 23
Instead of JSON string as a first parameter, we can also provide the json file path. In that case it will read the JSON string from file and convert it into a Pandas Dataframe.
Convert index oriented JSON string into Pandas Dataframe
Here, we will consider the json string which contains the data in a dictionary of values surrounded by dictionary with key as index. For such type of data we will use the ‘index’ orientation.
Syntax:
pandas.read_json('string/path/file_name.json', orient='index')
Example:
In this example, we created the json string with three key-value pairs in 4 rows with each row specified by index.
Let’s convert this json string to pandas dataframe. we are specifying index as “student-1” to “student-4”
import pandas as pd # Create JSON data with student details json_data = ''' { "student-1": { "id": "1", "name": "sravan","age":22 }, "student-2":{ "id": "2", "name": "harsha","age":22 }, "student-3": { "id": "3", "name": "deepika","age":21 }, "student-4": { "id": "4", "name": "jyothika","age":23 } } ''' # Convert JSON file with index orient df = pd.read_json( json_data, orient ='index') # Display the Dataframe print(df)
Output:
id name age student-1 1 sravan 22 student-2 2 harsha 22 student-3 3 deepika 21 student-4 4 jyothika 23
Here , the indices will be the rows in the dataframe.
Convert columns oriented json string into Pandas Dataframe
Here, we will consider the json string which contains the data in a dictionary of values surrounded by dictionary with key as index. That index will be the columns in the dataframe. For such type of data we will use the ‘ columns ‘ orientation.
Syntax:
pandas.read_json('string/path/file_name.json', orient='columns')
Example:
In this example, we created the json string with three key-value pairs in 4 rows with each row specified by index.
Let’s convert this json string to pandas dataframe. we are specifying index as “student-1” to “student-4”
import pandas as pd # Create JSON data with student details json_data = ''' { "student-1": { "id": "1", "name": "sravan","age":22 }, "student-2":{ "id": "2", "name": "harsha","age":22 }, "student-3": { "id": "3", "name": "deepika","age":21 }, "student-4": { "id": "4", "name": "jyothika","age":23 } } ''' # Convert JSON file with columns orient df = pd.read_json( json_data, orient ='columns') # Display the Dataframe print(df)
Output:
student-1 student-2 student-3 student-4 id 1 2 3 4 name sravan harsha deepika jyothika age 22 22 21 23
Here , the column indices will be the columns in the dataframe
Convert values oriented JSON string into Pandas Dataframe
Here, we will consider the json string which contains the data in a list of values surrounded by a list. For such type of data we will use the ‘values ‘ orientation.
Syntax:
pandas.read_json('string/path/file_name.json', orient='values')
Example:
In this example, we created the json string with three key-value pairs in 4. Let’s convert this json string to pandas dataframe.
import pandas as pd # Create JSON data with student details json_data = ''' [ [ "1", "sravan",22 ], [ "2", "harsha",22 ], [ "3", "deepika",21 ], [ "4", "jyothika",23 ] ] ''' # convert JSON file with values orientation to Dataframe df = pd.read_json( json_data, orient ='values') # Display the Dataframe print(df)
Output:
0 1 2 0 1 sravan 22 1 2 harsha 22 2 3 deepika 21 3 4 jyothika 23
Here , the columns and rows by default starts with 0.
Convert JSON to Dataframe using json_normalize()
This method first normalize the json data and then converts into the pandas dataframe. We have to import this method from the pandas module.
Steps are as follows,
Step 1 – Load the json data
We can do this by using json.loads() function. Syntax is as follows,
json.loads(json_string)
Step 2 – Pass the loaded data into json_normalize() method
json_normalize(json.loads(json_string))
Example:
In this example, we will create 4 students details of json string and convert into pandas dataframe.
import pandas as pd import json # Create json string # with student details json_string = ''' [ { "id": "1", "name": "sravan","age":22 }, { "id": "2", "name": "harsha","age":22 }, { "id": "3", "name": "deepika","age":21 }, { "id": "4", "name": "jyothika","age":23 } ] ''' # Load json data and convert to Dataframe df = pd.json_normalize(json.loads(json_string)) # Display the Dataframe print(df)
Output:
id name age 0 1 sravan 22 1 2 harsha 22 2 3 deepika 21 3 4 jyothika 23
Here the row positions will starts with 0 by default.
Summary
In this article we used read_json() and json_normalize() methods to convert json to Pandas DataFrame.