Import multiple csv files into one DataFrame in Pandas

In this article, we will discuss different methods to load multiple csv files into one pandas DataFrame.

Preparing DataSet

To get started, we have created a few sample csv files in the current working directory. The contents of each file are shown below.

df1.csv

Name,City,Team
Shubham,Bangalore,Tech
Rudra,Mumbai,Product

df2.csv

Name,City,Team
Adarsh,Bangalore,Tech
Ajay,Delhi,Tech

df3.csv

Name,City,Team
Shreya,Jaipur,Design
Sam,Delhi,Design
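
To follow along, the three files above can be recreated with a short script. This is only a sketch: the helper name write_samples is hypothetical, and the file contents are copied from the listings above.

```python
from pathlib import Path

# Sample contents copied from the listings above.
SAMPLES = {
    "df1.csv": "Name,City,Team\nShubham,Bangalore,Tech\nRudra,Mumbai,Product\n",
    "df2.csv": "Name,City,Team\nAdarsh,Bangalore,Tech\nAjay,Delhi,Tech\n",
    "df3.csv": "Name,City,Team\nShreya,Jaipur,Design\nSam,Delhi,Design\n",
}


def write_samples(directory="."):
    """Write the sample csv files into `directory` and return their paths.

    `write_samples` is a hypothetical helper for this article,
    not part of pandas.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, text in SAMPLES.items():
        path = target / name
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_samples() with no arguments writes the files into the current working directory, which is where the examples below expect them.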

Let’s look at multiple methods to read all these csv files in a single pandas DataFrame.

Method 1: Using For Loop

The simplest way to perform a repetitive task is a for loop. We can read the csv files one by one and append each to a single DataFrame. Let’s try to understand using the code below.

import pandas as pd

# list of files to read
files = ["df1.csv", "df2.csv", "df3.csv"]

# create an empty DataFrame to which we will append all the DataFrames
final_df = pd.DataFrame()

for file in files:
    # read and append the file
    final_df = pd.concat([final_df, pd.read_csv(file)], axis=0)

final_df = final_df.reset_index(drop=True)

print(final_df)

Output

      Name       City     Team
0  Shubham  Bangalore     Tech
1    Rudra     Mumbai  Product
2   Adarsh  Bangalore     Tech
3     Ajay      Delhi     Tech
4   Shreya     Jaipur   Design
5      Sam      Delhi   Design

As observed, the rows from all the files are now combined in a single DataFrame (“final_df”).
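
Growing a DataFrame inside the loop copies the rows collected so far on every iteration. A common variation, sketched below, reads each file into a list and calls pd.concat once with ignore_index=True, which also makes the separate reset_index call unnecessary. The helper name read_many is hypothetical and not part of pandas.

```python
import pandas as pd


def read_many(sources):
    """Read every csv source and stack the rows into one DataFrame.

    `read_many` is a hypothetical helper. `sources` may hold file
    paths or file-like objects, since pd.read_csv accepts both.
    """
    frames = [pd.read_csv(source) for source in sources]
    if not frames:
        # pd.concat raises ValueError on an empty list, so return early
        return pd.DataFrame()
    # ignore_index=True renumbers the rows 0..n-1 in the result
    return pd.concat(frames, ignore_index=True)
```

With the sample files in place, read_many(["df1.csv", "df2.csv", "df3.csv"]) should give the same six rows as the loop above.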

Method 2: Using the map() function

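A for loop works, but it takes several lines and calls pd.concat once per file. In this approach, we replace the entire loop with the built-in map() function, which applies pd.read_csv to each file name and hands the resulting DataFrames to a single pd.concat call.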

import pandas as pd

# using the map function
final_df = pd.concat(map(pd.read_csv, ['df1.csv', 'df2.csv','df3.csv']))

final_df = final_df.reset_index(drop=True)

print(final_df)

Output

      Name       City     Team
0  Shubham  Bangalore     Tech
1    Rudra     Mumbai  Product
2   Adarsh  Bangalore     Tech
3     Ajay      Delhi     Tech
4   Shreya     Jaipur   Design
5      Sam      Delhi   Design

We have replaced the entire loop with a single line using the map function. The code is shorter, and because pd.concat is called only once, the rows already read are not copied again for each new file.
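
When the file names are not known in advance, they can be discovered with a wildcard pattern instead of being typed out. The sketch below assumes the files match df*.csv in one directory; the helper name read_matching and the added source_file column are hypothetical choices for illustration.

```python
from pathlib import Path

import pandas as pd


def read_matching(directory=".", pattern="df*.csv"):
    """Combine every csv file in `directory` whose name matches `pattern`.

    `read_matching` and the `source_file` column are hypothetical,
    added here only to show where each row came from.
    """
    # sort so the row order does not depend on the file system
    paths = sorted(Path(directory).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r} in {directory!r}")
    frames = [pd.read_csv(path).assign(source_file=path.name) for path in paths]
    return pd.concat(frames, ignore_index=True)
```

Calling read_matching() in the directory that holds the sample files should return the six rows with an extra column naming the file each row was read from.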

Method 3: Using the dask library

Another option is the dask library, which can read many files in parallel and work with data that is too large to fit in memory. Its DataFrame syntax is very similar to pandas. For a handful of small files like ours it is unlikely to be faster than pandas; the benefit shows up with many or large files. Let’s take a look at the code below.

# import library
import dask.dataframe as dd

# read all csv files starting with "df"
df = dd.read_csv("df*.csv")

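# compute() runs the read and returns a regular pandas DataFrame;
# each file keeps its own 0-based index, so reset it after combining
df = df.compute().reset_index(drop=True)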

print(df)

Output

      Name       City     Team
0  Shubham  Bangalore     Tech
1    Rudra     Mumbai  Product
2   Adarsh  Bangalore     Tech
3     Ajay      Delhi     Tech
4   Shreya     Jaipur   Design
5      Sam      Delhi   Design

As observed, the contents of all the files are now combined into a single DataFrame. The compute() call returns a regular pandas DataFrame, so the result can be used directly for further processing.

Summary

In this article, we have discussed multiple ways to import multiple csv files into one DataFrame in Pandas. Thanks.
