How to slice a pandas DataFrame column?

In this article, we will look at multiple ways to slice the columns of a pandas DataFrame. We will mainly use pandas.DataFrame.loc and pandas.DataFrame.iloc to select the columns.

Preparing DataSet

To quickly get started, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some sample employee data.

import pandas as pd

# List of Tuples
employees = [('Shubham', 'India', 'Tech India',   5),
            ('Riti', 'India', 'India' ,   7),
            ('Shanky', 'India', 'PMO' ,   2),
            ('Shreya', 'India', 'Design' ,   2),
            ('Aadi', 'US', 'Tech', 11),
            ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

Contents of the created dataframe are,

      Name Location        Team  Experience
0  Shubham    India  Tech India           5
1     Riti    India       India           7
2   Shanky    India         PMO           2
3   Shreya    India      Design           2
4     Aadi       US        Tech          11
5      Sim       US        Tech           4

Slice a fixed set of DataFrame columns

Let’s start by slicing a fixed set of DataFrame columns. Say we need the columns “Name”, “Team”, and “Experience” from the above DataFrame.

Before going through the code, let’s understand the two indexers we are going to use:
– pandas.DataFrame.loc: Label-based; used to slice the columns by providing the column names
– pandas.DataFrame.iloc: Position-based; used to slice the columns by providing the integer column positions (starting at 0)

Let’s select these three columns using both indexers, starting with pandas.DataFrame.loc.

# using loc
print(df.loc[:, ["Name", "Team", "Experience"]])

Output

      Name        Team  Experience
0  Shubham  Tech India           5
1     Riti       India           7
2   Shanky         PMO           2
3   Shreya      Design           2
4     Aadi        Tech          11
5      Sim        Tech           4

We have sliced the three required columns from the above DataFrame. Now, let’s do the same using pandas.DataFrame.iloc, passing the positions of the same columns.

# using iloc
print(df.iloc[:, [0, 2, 3]])

Output

      Name        Team  Experience
0  Shubham  Tech India           5
1     Riti       India           7
2   Shanky         PMO           2
3   Shreya      Design           2
4     Aadi        Tech          11
5      Sim        Tech           4
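
If hard-coding positions such as [0, 2, 3] feels fragile, one option is to look them up from the column names. The snippet below is a minimal sketch, assuming a shortened two-row version of the sample DataFrame; the variable names wanted and positions are only illustrative.

```python
import pandas as pd

# Shortened version of the sample DataFrame (two rows are enough here)
df = pd.DataFrame(
    [("Shubham", "India", "Tech India", 5),
     ("Aadi", "US", "Tech", 11)],
    columns=["Name", "Location", "Team", "Experience"],
)

wanted = ["Name", "Team", "Experience"]

# Index.get_indexer maps each label to its integer position
positions = df.columns.get_indexer(wanted)
print(positions.tolist())  # [0, 2, 3]

# Both selections return the same columns
by_label = df.loc[:, wanted]
by_position = df.iloc[:, positions]
print(by_label.equals(by_position))  # True
```

Note that get_indexer returns -1 for a label that does not exist, and iloc would treat -1 as the last column, so it is worth checking the result before using it.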

Slice a range of DataFrame columns

Let’s consider another scenario where we want to slice a range of columns, i.e., all the columns between two columns. We can again achieve that using both iloc and loc as below.

# using loc, slice all the columns from Name to Team (both inclusive)
print(df.loc[:, "Name":"Team"])

Output

      Name Location        Team
0  Shubham    India  Tech India
1     Riti    India       India
2   Shanky    India         PMO
3   Shreya    India      Design
4     Aadi       US        Tech
5      Sim       US        Tech

Using iloc,

# using iloc, slice the columns at positions 0 to 2 (the stop position 3 is excluded)
print(df.iloc[:, 0:3])

Output

      Name Location        Team
0  Shubham    India  Tech India
1     Riti    India       India
2   Shanky    India         PMO
3   Shreya    India      Design
4     Aadi       US        Tech
5      Sim       US        Tech
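
The two outputs match even though the slices look different, because loc includes the end label while iloc excludes the stop position, like regular Python slicing. A small sketch, assuming a shortened two-row version of the sample DataFrame, makes the difference explicit:

```python
import pandas as pd

# Shortened version of the sample DataFrame (two rows are enough here)
df = pd.DataFrame(
    [("Shubham", "India", "Tech India", 5),
     ("Aadi", "US", "Tech", 11)],
    columns=["Name", "Location", "Team", "Experience"],
)

# Label slice: both endpoints ("Name" and "Team") are included
by_label = df.loc[:, "Name":"Team"]

# Position slice: starts at 0 and stops before 3, so positions 0, 1, 2
by_position = df.iloc[:, 0:3]

print(list(by_label.columns))        # ['Name', 'Location', 'Team']
print(list(by_position.columns))     # ['Name', 'Location', 'Team']
print(by_label.equals(by_position))  # True
```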

We can play around a little more. Say we need all the columns up to and including the column “Team”; since loc includes the end label, we can simply leave out the start label as follows.

# using loc, slice all the columns up to and including Team
print(df.loc[:, :"Team"])

Output

      Name Location        Team
0  Shubham    India  Tech India
1     Riti    India       India
2   Shanky    India         PMO
3   Shreya    India      Design
4     Aadi       US        Tech
5      Sim       US        Tech

We can also do the reverse and slice all the columns from the column “Team” onwards (including “Team” itself) as below.

# using loc, slice all the columns from Team onwards
print(df.loc[:, "Team":])

Output

         Team  Experience
0  Tech India           5
1       India           7
2         PMO           2
3      Design           2
4        Tech          11
5        Tech           4

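Note that we can get the same output using iloc as well by replacing the column name with the column position. Keep in mind that iloc excludes the stop position, so the equivalent of :"Team" is :3, while the equivalent of "Team": is 2:.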
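
As a rough sketch of the iloc versions, the position of “Team” can be looked up with Index.get_loc instead of being counted by hand. This assumes the column names are unique (so get_loc returns a single integer) and uses a shortened two-row version of the sample DataFrame; team_pos and the other variable names are only illustrative.

```python
import pandas as pd

# Shortened version of the sample DataFrame (two rows are enough here)
df = pd.DataFrame(
    [("Shubham", "India", "Tech India", 5),
     ("Aadi", "US", "Tech", 11)],
    columns=["Name", "Location", "Team", "Experience"],
)

# Integer position of the "Team" column (2 in this DataFrame)
team_pos = df.columns.get_loc("Team")

# Same result as df.loc[:, :"Team"]; +1 because iloc excludes the stop
up_to_team = df.iloc[:, : team_pos + 1]

# Same result as df.loc[:, "Team":]
from_team = df.iloc[:, team_pos:]

# Columns strictly before and strictly after "Team"
before_team = df.iloc[:, :team_pos]
after_team = df.iloc[:, team_pos + 1 :]

print(list(up_to_team.columns))   # ['Name', 'Location', 'Team']
print(list(from_team.columns))    # ['Team', 'Experience']
print(list(before_team.columns))  # ['Name', 'Location']
print(list(after_team.columns))   # ['Experience']
```

The last two slices show how to leave “Team” out, which a plain loc label slice does not do.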

Slice every nth DataFrame column

We can also slice columns by selecting every nth column from the pandas DataFrame. Let’s consider an example, where we need to slice all the alternate columns from the pandas DataFrame.

# slice every 2nd column from the DataFrame
print(df.iloc[:, ::2])

Output

      Name        Team
0  Shubham  Tech India
1     Riti       India
2   Shanky         PMO
3   Shreya      Design
4     Aadi        Tech
5      Sim        Tech
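
The slice ::2 follows the usual start:stop:step pattern, so the starting position can be shifted too. The following is a minimal sketch, assuming a shortened two-row version of the sample DataFrame; the step variable n is only illustrative.

```python
import pandas as pd

# Shortened version of the sample DataFrame (two rows are enough here)
df = pd.DataFrame(
    [("Shubham", "India", "Tech India", 5),
     ("Aadi", "US", "Tech", 11)],
    columns=["Name", "Location", "Team", "Experience"],
)

n = 2  # step size: take every 2nd column

# Start at position 0: positions 0 and 2
every_nth = df.iloc[:, ::n]

# Start at position 1: positions 1 and 3
offset_nth = df.iloc[:, 1::n]

print(list(every_nth.columns))   # ['Name', 'Team']
print(list(offset_nth.columns))  # ['Location', 'Experience']
```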

The complete example is as follows,

import pandas as pd

# List of Tuples
employees = [('Shubham', 'India', 'Tech India',   5),
            ('Riti', 'India', 'India' ,   7),
            ('Shanky', 'India', 'PMO' ,   2),
            ('Shreya', 'India', 'Design' ,   2),
            ('Aadi', 'US', 'Tech', 11),
            ('Sim', 'US', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience'])
print(df)

# using loc
print(df.loc[:, ["Name", "Team", "Experience"]])

# using iloc
print(df.iloc[:, [0, 2, 3]])

# using loc, slice all the columns from Name to Team (both inclusive)
print(df.loc[:, "Name":"Team"])

# using iloc, slice the columns at positions 0 to 2 (the stop position 3 is excluded)
print(df.iloc[:, 0:3])

# using loc, slice all the columns up to and including Team
print(df.loc[:, :"Team"])

# using loc, slice all the columns from Team onwards
print(df.loc[:, "Team":])

# slice every 2nd column from the DataFrame
print(df.iloc[:, ::2])

Summary

In this article, we have discussed multiple ways to slice the columns of a pandas DataFrame using loc (by column name) and iloc (by column position). Thanks.
