In this article we will discuss how to convert a list of dictionaries to a DataFrame in pandas. We will cover the following examples:
- Create DataFrame from list of dictionaries with default indexes
- Create DataFrame from list of dictionaries with custom indexes
- Create DataFrame from list of dictionaries with changed order of columns
- Create DataFrame from list of dictionaries with different columns
Suppose we have a list of Python dictionaries, i.e.

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]
Each dictionary in the list has the same keys but different values. We want to convert this list of dictionaries to a pandas DataFrame such that:
- Every key becomes a column name, i.e. there is a separate column for each key.
- Each column contains the values associated with that key in all the dictionaries.
The final DataFrame should look like this:
         Name  Age  Marks
    a   Shaun   35     91
    b  Ritika   31     87
    c  Smriti   33     78
    d   Jacob   23     93
We can achieve this using the DataFrame constructor, i.e.

    pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
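Apart from a dictionary of elements, the constructor also accepts a list of dictionaries, so we can create a DataFrame from the list directly. From pandas 0.25 onwards the columns appear in the order in which the keys occur in the dictionaries; earlier versions sorted the column names alphabetically.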
Let’s see how to do that,
Create DataFrame from list of dictionaries with default indexes
We can pass the list of dictionaries directly to the DataFrame constructor. It returns a DataFrame, i.e.

    import pandas as pd

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]

    # Create DataFrame from list of dictionaries
    df = pd.DataFrame(list_of_dict)

    print(df)
Output:
         Name  Age  Marks
    0   Shaun   35     91
    1  Ritika   31     87
    2  Smriti   33     78
    3   Jacob   23     93
Because all the dictionaries in the list have the same keys, those keys became the column names, and the values of each key across the dictionaries became that column's values. As we didn't provide an index argument, the DataFrame has the default index, i.e. 0 to N-1.
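To confirm what the constructor produced, the column labels, the index, and the shape can be inspected directly. This is a small sketch that reuses the same list_of_dict; the column order shown assumes pandas 0.25 or later.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
    {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
    {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
    {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
]

df = pd.DataFrame(list_of_dict)

# Column labels come from the dictionary keys
# (in key order, assuming pandas 0.25 or later)
print(df.columns.tolist())   # ['Name', 'Age', 'Marks']

# With no index argument, the row labels run from 0 to N-1
print(df.index.tolist())     # [0, 1, 2, 3]

# Shape is (number of dictionaries, number of distinct keys)
print(df.shape)              # (4, 3)
```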
But what if we want to have specific indexes too?
Create DataFrame from list of dictionaries with custom indexes
We can pass a list of index labels along with the list of dictionaries to the DataFrame constructor:

    import pandas as pd

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]

    # Create DataFrame from list of dictionaries and
    # pass another list as index
    df = pd.DataFrame(list_of_dict, index=['a', 'b', 'c', 'd'])

    print(df)
Output:
         Name  Age  Marks
    a   Shaun   35     91
    b  Ritika   31     87
    c  Smriti   33     78
    d   Jacob   23     93
As before, the keys became the column names and the values of each key filled the corresponding column. In addition, the items of the index list became the row labels of the DataFrame.
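One practical benefit of custom index labels is that rows can then be looked up by label. The sketch below illustrates this with DataFrame.loc; the variable names row_b and marks_c are only illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
    {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
    {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
    {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
]

df = pd.DataFrame(list_of_dict, index=['a', 'b', 'c', 'd'])

# Select a whole row by its index label; the result is a Series
row_b = df.loc['b']
print(row_b['Name'])   # Ritika

# Select a single cell by row label and column name
marks_c = df.loc['c', 'Marks']
print(marks_c)         # 78
```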
Create DataFrame from list of dictionaries with changed order of columns
In all the previous examples, the order of columns in the generated DataFrame was the same as the order of keys in the dictionaries. What if we want a different column order while creating the DataFrame from the list of dictionaries?
Let’s see how to do that,

    import pandas as pd

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]

    # Create DataFrame from list of dictionaries and
    # pass separate lists as index and columns
    df = pd.DataFrame(list_of_dict,
                      index=['a', 'b', 'c', 'd'],
                      columns=['Age', 'Marks', 'Name'])

    print(df)
Output:
       Age  Marks    Name
    a   35     91   Shaun
    b   31     87  Ritika
    c   33     78  Smriti
    d   23     93   Jacob
Because we passed a separate list as the columns argument to the DataFrame constructor, the columns follow the order of that list. But what if the list contains an extra column name, or leaves one out?
Create DataFrame from list of dictionaries with different columns
Example 1: Extra column
If the column list we pass to the DataFrame constructor contains a name for which there is no key in any of the dictionaries, then that column in the DataFrame will contain only NaN values, i.e.

    import pandas as pd

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]

    # Create DataFrame from list of dictionaries and
    # pass an additional column name
    df = pd.DataFrame(list_of_dict,
                      index=['a', 'b', 'c', 'd'],
                      columns=['Age', 'Marks', 'Name', 'Address'])

    print(df)
Output:
       Age  Marks    Name  Address
    a   35     91   Shaun      NaN
    b   31     87  Ritika      NaN
    c   33     78  Smriti      NaN
    d   23     93   Jacob      NaN
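Rather than reading the printed table, the NaN-only column can be checked programmatically. The following is a minimal sketch using isna(); the variable names all_missing and missing_count are illustrative only.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
    {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
    {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
    {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
]

df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Age', 'Marks', 'Name', 'Address'])

# True if every value in the 'Address' column is missing
all_missing = bool(df['Address'].isna().all())
print(all_missing)     # True

# Number of missing values in the 'Address' column
missing_count = int(df['Address'].isna().sum())
print(missing_count)   # 4
```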
Example 2: Missing column
If we leave a key out of the column names list, then that column will be missing from the DataFrame:

    import pandas as pd

    list_of_dict = [
        {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
        {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
        {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
        {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
    ]

    # Create DataFrame from list of dictionaries and
    # pass only a subset of the keys as columns
    df = pd.DataFrame(list_of_dict,
                      index=['a', 'b', 'c', 'd'],
                      columns=['Name', 'Marks'])

    print(df)
Output:
         Name  Marks
    a   Shaun     91
    b  Ritika     87
    c  Smriti     78
    d   Jacob     93
Here we passed the list of dictionaries as the first argument, but in the columns argument we provided the names of all keys except one ('Age'). Therefore the DataFrame has no column for that key.
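A quick way to confirm which keys made it into the DataFrame is to inspect its columns. This sketch also shows that the original list is left untouched, so the omitted values remain available there.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun',  'Age': 35, 'Marks': 91},
    {'Name': 'Ritika', 'Age': 31, 'Marks': 87},
    {'Name': 'Smriti', 'Age': 33, 'Marks': 78},
    {'Name': 'Jacob',  'Age': 23, 'Marks': 93},
]

df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Name', 'Marks'])

# Only the requested keys became columns
print(df.columns.tolist())       # ['Name', 'Marks']

# The omitted key is not a column of the DataFrame
print('Age' in df.columns)       # False

# The source dictionaries still hold the omitted values
print(list_of_dict[0]['Age'])    # 35
```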
So, this is how we can convert a list of dictionaries to a pandas DataFrame in Python.