Pandas: Create Dataframe from list of dictionaries

In this article, we will discuss how to convert a list of dictionaries to a Dataframe in pandas. We will cover the following examples:

  • Create Dataframe from list of dictionaries with default indexes
  • Create Dataframe from list of dictionaries with custom indexes
  • Create Dataframe from list of dictionaries with changed order of columns
  • Create Dataframe from list of dictionaries with different columns

Suppose we have a list of python dictionaries i.e.

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

Each dictionary in the list has the same keys but different values. We want to convert this list of dictionaries to a pandas Dataframe in such a way that,

  • All keys should be the column names i.e. for every key, there should be a separate column.
  • Each column should contain the values associated with that key in all dictionaries

The final Dataframe should look like this (shown here with custom index labels a to d),

     Name  Age  Marks
a   Shaun   35     91
b  Ritika   31     87
c  Smriti   33     78
d   Jacob   23     93

We can achieve this using the Dataframe constructor i.e.

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

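Apart from a dictionary of elements, the constructor also accepts a list of dictionaries. From pandas 0.25 onwards (when running on Python 3.6 or later), the columns follow the insertion order of the dictionary keys instead of being sorted. So we can directly create a Dataframe from the list of dictionaries.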

Let’s see how to do that,

Create Dataframe from list of dictionaries with default indexes

We can directly pass the list of dictionaries to the Dataframe constructor. It will return a Dataframe i.e.

import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

# Create DataFrame from list of dictionaries
df = pd.DataFrame(list_of_dict)

print(df)

Output:

     Name  Age  Marks
0   Shaun   35     91
1  Ritika   31     87
2  Smriti   33     78
3   Jacob   23     93

Since all the dictionaries in the list have the same keys, those keys became the column names. For each key, the values of that key across all the dictionaries became the column values. As we didn’t provide an index argument, the Dataframe has the default index, i.e. 0 to N-1.
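As a quick check of the behaviour described above, the following sketch inspects the column labels, the default index and the shape of the resulting Dataframe. It assumes pandas 0.25 or later, where the columns follow the key order of the dictionaries.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

df = pd.DataFrame(list_of_dict)

# Column labels are taken from the dictionary keys
print(list(df.columns))

# With no index argument, the row labels run from 0 to N-1
print(list(df.index))

# One row per dictionary, one column per key
print(df.shape)
```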

But what if we want to have specific indexes too?

Create Dataframe from list of dictionaries with custom indexes

We can pass a list of index labels along with the list of dictionaries to the Dataframe constructor,

import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

# Create Dataframe from list of dictionaries and
# pass another list as index
df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'])

print(df)

Output:

     Name  Age  Marks
a   Shaun   35     91
b  Ritika   31     87
c  Smriti   33     78
d   Jacob   23     93

As before, the dictionary keys became the column names, and for each key the values from all the dictionaries became the column values. In addition, the items from the index list were used as the row labels of the Dataframe, in the order given.
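One practical benefit of custom index labels is label-based lookup. The sketch below uses `DataFrame.loc` to fetch a row and a single cell by label; the variable names `row_b` and `name_c` are only illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'])

# Select the whole row labelled 'b' (returned as a Series)
row_b = df.loc['b']
print(row_b)

# Select a single cell by row label and column name
name_c = df.loc['c', 'Name']
print(name_c)
```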

Create Dataframe from list of dictionaries with changed order of columns

In all the previous examples, the order of columns in the generated Dataframe was the same as the order of keys in the dictionary. What if we want to have a different order of columns while creating Dataframe from list of dictionaries?

Let’s see how to do that,

import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]
# Create Dataframe from list of dictionaries and
# pass separate lists as index and columns
df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Age', 'Marks', 'Name'])

print(df)

Output:

   Age  Marks    Name
a   35     91   Shaun
b   31     87  Ritika
c   33     78  Smriti
d   23     93   Jacob

We passed a separate list as the columns argument to the Dataframe constructor, so the order of columns follows that list. But what if the list contains an extra column name, or leaves out one of the keys?

Create Dataframe from list of dictionaries with different columns

Example 1: Extra column

If the columns list passed to the Dataframe constructor contains a name that is not a key in any of the dictionaries, then that column will still be created in the Dataframe, but it will contain only NaN values i.e.

import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

# Create Dataframe from list of dictionaries and
# pass an additional column
df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Age', 'Marks', 'Name', 'Address'])

print(df)

Output:

   Age  Marks    Name  Address
a   35     91   Shaun      NaN
b   31     87  Ritika      NaN
c   33     78  Smriti      NaN
d   23     93   Jacob      NaN
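The NaN filling can be verified programmatically with `isna()`. The sketch below also includes a hypothetical variation, not part of the example above, in which one dictionary lacks a key; in that case only the affected row is filled with NaN. The names `partial` and `df_partial` are illustrative.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Age', 'Marks', 'Name', 'Address'])

# True when every entry of the extra column is missing
all_missing = bool(df['Address'].isna().all())
print(all_missing)

# Hypothetical variation: the second dictionary has no 'Marks' key
partial = [
    {'Name': 'Shaun',  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika', 'Age': 31},
]
df_partial = pd.DataFrame(partial)
print(df_partial)

# Only the row built from the incomplete dictionary is missing 'Marks'
missing_mask = df_partial['Marks'].isna().tolist()
print(missing_mask)
```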

Example 2: Missing column

If the columns list leaves out one of the keys, then that column will be missing from the Dataframe,

import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]

# Create Dataframe from list of dictionaries and
# pass only a subset of the keys as columns
df = pd.DataFrame(list_of_dict,
                  index=['a', 'b', 'c', 'd'],
                  columns=['Name', 'Marks'])

print(df)

Output:

     Name  Marks
a   Shaun     91
b  Ritika     87
c  Smriti     78
d   Jacob     93

Here we passed the list of dictionaries as the first argument, but in the columns argument we provided the names of all keys except 'Age'. Therefore the Dataframe has no column for that key.
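For comparison, a similar result can be obtained by building the full Dataframe first and then selecting a subset of columns. The following sketch, with illustrative names `df_ctor` and `df_select`, checks that both routes give equal Dataframes for this data.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Shaun' ,  'Age': 35,  'Marks': 91},
    {'Name': 'Ritika',  'Age': 31,  'Marks': 87},
    {'Name': 'Smriti',  'Age': 33,  'Marks': 78},
    {'Name': 'Jacob' ,  'Age': 23,  'Marks': 93},
]
idx = ['a', 'b', 'c', 'd']

# Route 1: restrict the columns in the constructor
df_ctor = pd.DataFrame(list_of_dict, index=idx,
                       columns=['Name', 'Marks'])

# Route 2: build all columns, then select the ones we want
df_select = pd.DataFrame(list_of_dict, index=idx)[['Name', 'Marks']]

# Compare the two results and confirm 'Age' was dropped
same = df_ctor.equals(df_select)
print(same)
print('Age' in df_ctor.columns)
```

Passing the columns list to the constructor avoids materialising columns that are not needed, while selecting afterwards keeps the full Dataframe available for other uses.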

So, this is how we can convert a list of dictionaries to a Pandas Dataframe in python.
