In this article, we will discuss different ways to convert a dataframe column into a list.
Fits of all, create a dataframe object that we are going to use in this example,
import pandas as pd # List of Tuples students = [('jack', 34, 'Sydney', 155), ('Riti', 31, 'Delhi', 177.5), ('Aadi', 16, 'Mumbai', 81), ('Mohit', 31, 'Delhi', 167), ('Veena', 12, 'Delhi', 144), ('Shaunak', 35, 'Mumbai', 135), ('Shaun', 35, 'Colombo', 111) ] # Create a DataFrame object student_df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score']) print(student_df)
Output:
Name Age City Score 0 jack 34 Sydney 155.0 1 Riti 31 Delhi 177.5 2 Aadi 16 Mumbai 81.0 3 Mohit 31 Delhi 167.0 4 Veena 12 Delhi 144.0 5 Shaunak 35 Mumbai 135.0 6 Shaun 35 Colombo 111.0
Now how to fetch a single column out of this dataframe and convert it to a python list?
There are different ways to do that, lets discuss them one by one.
Convert a Dataframe column into a list using Series.to_list()
To turn the column ‘Name’ from the dataframe object student_df to a list in a single line,
# select a column as series and then convert it into a column list_of_names = student_df['Name'].to_list() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names))
Output
List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'>
What did happen here?
How did it work?
Let’s break down the above line into steps,
Step 1: Fetch a column as series
Select the column ‘Name’ from the dataframe using [] operator,
# Select column 'Name' as series object names = student_df['Name'] print(names) print(type(names))
Output:
0 jack 1 Riti 2 Aadi 3 Mohit 4 Veena 5 Shaunak 6 Shaun Name: Name, dtype: object <class 'pandas.core.series.Series'>
It returns a Series object names, and we have confirmed that by printing its type.
Step 2 : Convert the Series object to the list
Series class provides a function Series.to_list(), which returns the contents of Series object as list. Use that to convert series names into a list i.e.
# Convert series object to a list list_of_names = names.to_list() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names))
Output:
List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'>
This is how we converted a dataframe column into a list.
Important Note:
It might be possible that it gives you an error i.e.
AttributeError: 'Series' object has no attribute 'to_list'
If you get that error, then please check your Pandas version, you may be using pandas version less than 24.
Import pandas as pd print(pd.__version__)
Upgrade you pandas to the latest version using the following command,
pip install --upgrade pandas
Convert a Dataframe column into a list using numpy.ndarray.tolist()
Another way is converting a Dataframe column into a list is,
# Convert column Name to a Numpy Array and then to a list list_of_names = student_df['Name'].values.tolist() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names))
Output
List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'>
We converted the column ‘Name’ into a list in a single line. Let’s see what happened inside it,
How did it work?
Let’s break down the above line into steps,
Step 1: Select a column as a Series object
Select the column ‘Name’ from the dataframe using [] operator,
student_df['Name']
It returns a Series object.
Step 2: Get a Numpy array from a series object using Series.Values
# Select a column from dataframe as series and get a numpy array from that names = student_df['Name'].values print('Numpy array: ', names) print('Type of namesAsNumpy: ', type(names))
Output:
Numpy array: ['jack' 'Riti' 'Aadi' 'Mohit' 'Veena' 'Shaunak' 'Shaun'] Type of namesAsNumpy: <class 'numpy.ndarray'>
Names is a numpy array, and we confirmed it by printing its types.
Step 3: Convert a Numpy array into a list
Numpy array provides a function tolist() to convert its contents to a list,
# Convert numpy array to a list list_of_names = names.tolist() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names))
Output:
List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'>
This is how we selected our column ‘Name’ from Dataframe as a Numpy array and then turned it to a list.
The complete example is as follows,
import pandas as pd def main(): # List of Tuples students = [('jack', 34, 'Sydney', 155), ('Riti', 31, 'Delhi', 177.5), ('Aadi', 16, 'Mumbai', 81), ('Mohit', 31, 'Delhi', 167), ('Veena', 12, 'Delhi', 144), ('Shaunak', 35, 'Mumbai', 135), ('Shaun', 35, 'Colombo', 111) ] # Create a DataFrame object student_df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score']) print("Contents of the Dataframe : ") print(student_df) print('Convert a Dataframe column into a list using Series.to_list()') # select a column as series and then convert it into a column list_of_names = student_df['Name'].to_list() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names)) print('How did it worked ?') # Select column 'Name' as series object names = student_df['Name'] print(names) print(type(names)) # Convert series object to a list list_of_names = names.to_list() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names)) print("Convert a Dataframe column into a list using numpy.ndarray.tolist()") # Convert column Name to a Numpy Array and then to a list list_of_names = student_df['Name'].values.tolist() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names)) print('How did it worked ?') # Select a column from dataframe as series and get a numpy array from that names = student_df['Name'].values print('Numpy array: ', names) print('Type of namesAsNumpy: ', type(names)) # Convert numpy array to a list list_of_names = names.tolist() print('List of Names: ', list_of_names) print('Type of listOfNames: ', type(list_of_names)) if __name__ == '__main__': main()
Output:
Contents of the Dataframe : Name Age City Score 0 jack 34 Sydney 155.0 1 Riti 31 Delhi 177.5 2 Aadi 16 Mumbai 81.0 3 Mohit 31 Delhi 167.0 4 Veena 12 Delhi 144.0 5 Shaunak 35 Mumbai 135.0 6 Shaun 35 Colombo 111.0 Convert a Dataframe column into a list using Series.to_list() List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'> How did it worked ? 0 jack 1 Riti 2 Aadi 3 Mohit 4 Veena 5 Shaunak 6 Shaun Name: Name, dtype: object <class 'pandas.core.series.Series'> List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'> Convert a Dataframe column into a list using numpy.ndarray.tolist() List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'> How did it worked ? Numpy array: ['jack' 'Riti' 'Aadi' 'Mohit' 'Veena' 'Shaunak' 'Shaun'] Type of namesAsNumpy: <class 'numpy.ndarray'> List of Names: ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'] Type of listOfNames: <class 'list'>
Pandas Tutorials -Learn Data Analysis with Python
-
Pandas Tutorial Part #1 - Introduction to Data Analysis with Python
-
Pandas Tutorial Part #2 - Basics of Pandas Series
-
Pandas Tutorial Part #3 - Get & Set Series values
-
Pandas Tutorial Part #4 - Attributes & methods of Pandas Series
-
Pandas Tutorial Part #5 - Add or Remove Pandas Series elements
-
Pandas Tutorial Part #6 - Introduction to DataFrame
-
Pandas Tutorial Part #7 - DataFrame.loc[] - Select Rows / Columns by Indexing
-
Pandas Tutorial Part #8 - DataFrame.iloc[] - Select Rows / Columns by Label Names
-
Pandas Tutorial Part #9 - Filter DataFrame Rows
-
Pandas Tutorial Part #10 - Add/Remove DataFrame Rows & Columns
-
Pandas Tutorial Part #11 - DataFrame attributes & methods
-
Pandas Tutorial Part #12 - Handling Missing Data or NaN values
-
Pandas Tutorial Part #13 - Iterate over Rows & Columns of DataFrame
-
Pandas Tutorial Part #14 - Sorting DataFrame by Rows or Columns
-
Pandas Tutorial Part #15 - Merging or Concatenating DataFrames
-
Pandas Tutorial Part #16 - DataFrame GroupBy explained with examples
Are you looking to make a career in Data Science with Python?
Data Science is the future, and the future is here now. Data Scientists are now the most sought-after professionals today. To become a good Data Scientist or to make a career switch in Data Science one must possess the right skill set. We have curated a list of Best Professional Certificate in Data Science with Python. These courses will teach you the programming tools for Data Science like Pandas, NumPy, Matplotlib, Seaborn and how to use these libraries to implement Machine learning models.
Checkout the Detailed Review of Best Professional Certificate in Data Science with Python.
Remember, Data Science requires a lot of patience, persistence, and practice. So, start learning today.