In this article, we will discuss how to convert one or more lists to a pandas DataFrame.
Table Of Contents
- Convert list of lists to DataFrame in Pandas
- Convert Lists of tuples to DataFrame in Pandas
- Convert List of lists to DataFrame & set column names and indexes
- Convert List of tuples to DataFrame and skip certain columns
- Convert multiple lists to DataFrame in Pandas
Python’s pandas library provides the DataFrame constructor, which creates a DataFrame from the objects passed to it:
pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Here, the data parameter can be a NumPy ndarray, a list, a dict, or another DataFrame. The columns and index parameters supply the column labels and the row (index) labels. Let’s use this constructor to convert lists to a DataFrame.
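As a rough illustration of the data parameter described above, the sketch below builds the same small table from a list of lists, a dict of lists, and a NumPy array. The variable names and sample values are made up for this illustration and are not part of the article's examples.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two rows, two columns
rows = [[1, 2], [3, 4]]

# From a list of lists: each inner list becomes one row
df_from_list = pd.DataFrame(rows, columns=['x', 'y'])

# From a dict of lists: each key becomes a column label
df_from_dict = pd.DataFrame({'x': [1, 3], 'y': [2, 4]})

# From a NumPy ndarray: each row of the array becomes one row
df_from_array = pd.DataFrame(np.array(rows), columns=['x', 'y'])

print(df_from_list)
print(df_from_dict)
print(df_from_array)
```

Note the difference in orientation: a list of lists is read row by row, while a dict of lists is read column by column.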
Convert list of lists to DataFrame in Pandas
Suppose we have a list of lists:
# List of lists
students = [['jack', 34, 'Sydeny'],
            ['Riti', 30, 'Delhi'],
            ['Aadi', 16, 'New York']]
Pass this list to the DataFrame constructor to create a DataFrame object:
import pandas as pd

# Create a DataFrame object from the list of lists
dfObj = pd.DataFrame(students)

# Display the DataFrame
print(dfObj)
The contents of the created DataFrame are as follows:
      0   1         2
0  jack  34    Sydeny
1  Riti  30     Delhi
2  Aadi  16  New York
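The header row and the left-most column in this output are labels that pandas generates when none are supplied. A minimal sketch for inspecting them, reusing the students list from above (the variable name df is chosen here for illustration):

```python
import pandas as pd

students = [['jack', 34, 'Sydeny'],
            ['Riti', 30, 'Delhi'],
            ['Aadi', 16, 'New York']]

df = pd.DataFrame(students)

# Default column labels are the integers 0, 1, 2
print(list(df.columns))

# Default row labels are also the integers 0, 1, 2
print(list(df.index))

# Shape is (number of rows, number of columns)
print(df.shape)
```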
Convert Lists of tuples to DataFrame in Pandas
Just as with a list of lists, we can pass a list of tuples to the DataFrame constructor to create a DataFrame.
Suppose we have a list of tuples:
# List of tuples
students = [('jack', 34, 'Sydeny'),
            ('Riti', 30, 'Delhi'),
            ('Aadi', 16, 'New York')]
Pass this list of tuples to the DataFrame constructor to create a DataFrame object:
import pandas as pd

# Create a DataFrame object from the list of tuples
dfObj = pd.DataFrame(students)

# Display the DataFrame
print(dfObj)
The contents of the created DataFrame are as follows:
      0   1         2
0  jack  34    Sydeny
1  Riti  30     Delhi
2  Aadi  16  New York
Both the column labels and the index labels are the defaults here, but we can also provide our own.
Convert List of lists to DataFrame & set column names and indexes
import pandas as pd

# List of lists
students = [['jack', 34, 'Sydeny'],
            ['Riti', 30, 'Delhi'],
            ['Aadi', 16, 'New York']]

# Convert the list of lists to a DataFrame and
# set the column names and index labels
dfObj = pd.DataFrame(students,
                     columns=['Name', 'Age', 'City'],
                     index=['a', 'b', 'c'])

# Display the DataFrame
print(dfObj)
The contents of the created DataFrame are as follows:
   Name  Age      City
a  jack   34    Sydeny
b  Riti   30     Delhi
c  Aadi   16  New York
Convert List of tuples to DataFrame and skip certain columns
Each tuple in our list has 3 entries. What if we want to use only the 1st and 3rd entries? Let’s create a DataFrame that skips the 2nd entry of each tuple:
import pandas as pd

# List of tuples
students = [('jack', 34, 'Sydeny'),
            ('Riti', 30, 'Delhi'),
            ('Aadi', 16, 'New York')]

# Create a DataFrame from the list of tuples, but
# skip the 'Age' column, i.e. keep only 2 columns
dfObj = pd.DataFrame.from_records(students,
                                  exclude=['Age'],
                                  columns=['Name', 'Age', 'City'],
                                  index=['a', 'b', 'c'])

# Display the DataFrame
print(dfObj)
The contents of the created DataFrame are as follows:
   Name      City
a  jack    Sydeny
b  Riti     Delhi
c  Aadi  New York
This DataFrame has only two columns because we excluded the middle entry of each tuple in the list.
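A different route to the same two-column result, shown here only as an alternative sketch and not as the method used above, is to build the full DataFrame first and then remove or select columns. The variable names full, without_age and name_city are chosen for this illustration.

```python
import pandas as pd

students = [('jack', 34, 'Sydeny'),
            ('Riti', 30, 'Delhi'),
            ('Aadi', 16, 'New York')]

# Build the full three-column DataFrame first
full = pd.DataFrame(students,
                    columns=['Name', 'Age', 'City'],
                    index=['a', 'b', 'c'])

# Option 1: drop the unwanted column
without_age = full.drop(columns=['Age'])

# Option 2: keep only the wanted columns
name_city = full[['Name', 'City']]

print(without_age)
print(name_city)
```

Both options leave the original DataFrame unchanged, which can be convenient when the skipped column is still needed elsewhere.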
Convert multiple lists to DataFrame in Pandas
Suppose we have 3 different lists and we want to convert them to a DataFrame, with each list as a column. To do that, zip the lists to create a list of tuples, then create a DataFrame from the zipped list:
import pandas as pd

listOfNames = ['Jack', 'Riti', 'Aadi']
listOfAge = [34, 30, 16]
listOfCity = ['Sydney', 'Delhi', 'New york']

# Create a zipped list of tuples from the above lists
zippedList = list(zip(listOfNames, listOfAge, listOfCity))

# Create a DataFrame from the zipped list
dfObj = pd.DataFrame(zippedList,
                     columns=['Name', 'Age', 'City'],
                     index=['a', 'b', 'c'])

# Display the DataFrame
print(dfObj)
The contents of the created DataFrame are as follows:
   Name  Age      City
a  Jack   34    Sydney
b  Riti   30     Delhi
c  Aadi   16  New york
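As an alternative to zipping, the same lists can be passed as the values of a dict, where each key becomes a column label. This is a sketch of that variant for comparison, not the approach used above; the names dfFromDict and dfFromZip are chosen for this illustration.

```python
import pandas as pd

listOfNames = ['Jack', 'Riti', 'Aadi']
listOfAge = [34, 30, 16]
listOfCity = ['Sydney', 'Delhi', 'New york']

# Dict variant: each key is a column label, each list is a column
dfFromDict = pd.DataFrame({'Name': listOfNames,
                           'Age': listOfAge,
                           'City': listOfCity},
                          index=['a', 'b', 'c'])

# Zip variant, as in the example above, for comparison
dfFromZip = pd.DataFrame(list(zip(listOfNames, listOfAge, listOfCity)),
                         columns=['Name', 'Age', 'City'],
                         index=['a', 'b', 'c'])

print(dfFromDict)

# zip() stops at the shortest input, so a short list silently drops rows
shortZip = list(zip(listOfNames, [34, 30]))
print(len(shortZip))
```

One practical difference: because zip() stops at the shortest list, lists of unequal length lose rows without any warning, whereas the dict form raises a ValueError when the lists differ in length.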
The complete example is as follows:
import pandas as pd

# List of lists
students = [['jack', 34, 'Sydeny'],
            ['Riti', 30, 'Delhi'],
            ['Aadi', 16, 'New York']]

print("****Create a Dataframe from list of lists *****")
# Create a DataFrame object from the list of lists
dfObj = pd.DataFrame(students)
print("Dataframe : ", dfObj, sep='\n')

# List of tuples
students = [('jack', 34, 'Sydeny'),
            ('Riti', 30, 'Delhi'),
            ('Aadi', 16, 'New York')]

print("****Create a Dataframe from list of tuple *****")
# Create a DataFrame object from the list of tuples
dfObj = pd.DataFrame(students)
print("Dataframe : ", dfObj, sep='\n')

print("****Create a Dataframe from list of tuple, also set column names and indexes *****")
# Convert the list of tuples to a DataFrame and set column names and index labels
dfObj = pd.DataFrame(students,
                     columns=['Name', 'Age', 'City'],
                     index=['a', 'b', 'c'])
print("Dataframe : ", dfObj, sep='\n')

print("****Create dataframe from list of tuples and skip certain columns*********")
# Create a DataFrame from the list of tuples, but
# skip the 'Age' column, i.e. keep only 2 columns
dfObj = pd.DataFrame.from_records(students,
                                  exclude=['Age'],
                                  columns=['Name', 'Age', 'City'],
                                  index=['a', 'b', 'c'])
print("Dataframe : ", dfObj, sep='\n')

print("***Create dataframe from multiple lists***")
listOfNames = ['jack', 'Riti', 'Aadi']
listOfAge = [34, 30, 16]
listOfCity = ['Sydney', 'Delhi', 'New york']

# Create a zipped list of tuples from the above lists
zippedList = list(zip(listOfNames, listOfAge, listOfCity))
print("zippedList = ", zippedList)

# Create a DataFrame from the zipped list
dfObj = pd.DataFrame(zippedList,
                     columns=['Name', 'Age', 'City'],
                     index=['a', 'b', 'c'])
print("Dataframe : ", dfObj, sep='\n')
Output:
****Create a Dataframe from list of lists *****
Dataframe : 
      0   1         2
0  jack  34    Sydeny
1  Riti  30     Delhi
2  Aadi  16  New York
****Create a Dataframe from list of tuple *****
Dataframe : 
      0   1         2
0  jack  34    Sydeny
1  Riti  30     Delhi
2  Aadi  16  New York
****Create a Dataframe from list of tuple, also set column names and indexes *****
Dataframe : 
   Name  Age      City
a  jack   34    Sydeny
b  Riti   30     Delhi
c  Aadi   16  New York
****Create dataframe from list of tuples and skip certain columns*********
Dataframe : 
   Name      City
a  jack    Sydeny
b  Riti     Delhi
c  Aadi  New York
***Create dataframe from multiple lists***
zippedList =  [('jack', 34, 'Sydney'), ('Riti', 30, 'Delhi'), ('Aadi', 16, 'New york')]
Dataframe : 
   Name  Age      City
a  jack   34    Sydney
b  Riti   30     Delhi
c  Aadi   16  New york
Summary:
We learned different ways to convert lists to a pandas DataFrame in Python.