Drop multiple Columns from a Pandas DataFrame

In this article, we will discuss how to drop multiple columns from a pandas DataFrame in Python.

A DataFrame is a data structure that stores data in rows and columns. We can create a DataFrame using the pandas.DataFrame() constructor.

Let’s create a DataFrame with 4 rows and 5 columns:

import pandas as pd

# Create a Dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Drop multiple columns from DataFrame by index

Using drop() & Columns Attribute

In pandas, the DataFrame provides a drop() function that removes the given rows or columns from the DataFrame.

Syntax is as follows:

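df.drop(labels, axis)

where,

  • df is the input dataframe
  • labels is a single label or a list of labels to be removed
  • axis specifies whether to drop rows (axis = 0, the default) or columns (axis = 1)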
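As a minimal sketch of this call, assuming the five-column DataFrame created above (rebuilt here so the snippet runs on its own), the following shows that drop() returns a new DataFrame and leaves the original unchanged unless the result is assigned back. The variable name trimmed is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# drop() returns a new DataFrame; df itself is not modified
trimmed = df.drop(['one', 'two'], axis=1)

print(list(trimmed.columns))  # ['three', 'four', 'five']
print(list(df.columns))       # ['one', 'two', 'three', 'four', 'five']
```

This is why the examples in this article assign the result back with df = df.drop(...).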

Using drop() with columns attribute

We are going to use the columns attribute together with the drop() function to delete multiple columns. The columns attribute lets us select column labels by position, and we then pass those labels to drop() for deletion.

Syntax is as follows:

df.drop(df.columns[[indices]], axis = 1)

where df is the input dataframe and the other parameters in this expression are:

  • axis = 1 specifies the column axis
  • indices represents the positions of the columns to be removed

Here indexing starts with 0.

Example: In this example, we are going to drop the first three columns based on their indices: 0, 1 and 2.

import pandas as pd

# Create dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first three columns using index
df = df.drop(df.columns[[0, 1, 2]], axis = 1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Modified dataframe:

   four  five
0     0    34
1     0    56
2     0    54
3     0    56
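The positions passed to df.columns do not have to be adjacent. A small sketch, assuming the same DataFrame as above and a hypothetical choice of positions 0, 2 and 4:

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Hypothetical choice of positions: first, third and fifth columns
positions = [0, 2, 4]

# df.columns[positions] turns the positions into the labels
# 'one', 'three' and 'five', which drop() then removes
result = df.drop(df.columns[positions], axis=1)

print(list(result.columns))  # ['two', 'four']
```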

Using drop() & iloc[] attribute

We are going to use the iloc[] indexer to drop multiple columns from a pandas DataFrame. Here we specify the positions of the columns to be dropped with a slice.

Syntax is as follows:

df.drop(df.iloc[:, start:end].columns, axis = 1)

where df is the input dataframe and the other parameters in this expression are:

  • axis = 1 specifies the column axis
  • start is the position of the first column to be removed, and end is the position just after the last column to be removed (the slice does not include end)

Here indexing starts with 0.

Example: In this example, we are going to drop the first three columns based on their indices: 0, 1 and 2.

import pandas as pd

# Create dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove the first three columns (positions 0, 1 and 2) using an iloc slice
df = df.drop(df.iloc[:, 0:3].columns, axis = 1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Modified dataframe:

   four  five
0     0    34
1     0    56
2     0    54
3     0    56
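Because an iloc[] slice stops before its end position, it is easy to be off by one. The sketch below, assuming the same DataFrame as above, drops only the columns at positions 1 and 2. The name to_drop is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# The slice 1:3 selects positions 1 and 2; position 3 is not included
to_drop = df.iloc[:, 1:3].columns
result = df.drop(to_drop, axis=1)

print(list(to_drop))         # ['two', 'three']
print(list(result.columns))  # ['one', 'four', 'five']
```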

Drop multiple columns from DataFrame by column names

Drop Multiple columns by name using drop()

Here we can remove multiple columns at once by specifying their names.

Syntax:

df.drop(['column1','column2',..........,'column n'], axis = 1)

where,

  • df is the input dataframe
  • the list contains the names of the columns to be removed
  • axis = 1 specifies the column axis

Example: Here, we are going to remove the first three columns.

import pandas as pd

# Create dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first three columns using column names
df = df.drop(['one','two','three'], axis = 1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Modified dataframe:

   four  five
0     0    34
1     0    56
2     0    54
3     0    56

Here, we removed the columns named ‘one’, ‘two’ and ‘three’.
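As an alternative spelling, drop() also accepts a columns keyword, which avoids having to pass axis = 1, and an errors='ignore' option that skips names not present in the DataFrame. The sketch below assumes the same DataFrame as above. The name 'missing' is a made-up column used only to show the behaviour.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Same result as df.drop(['one', 'two', 'three'], axis=1)
by_keyword = df.drop(columns=['one', 'two', 'three'])

# 'missing' is a hypothetical name that is not in df; without
# errors='ignore' this call would raise a KeyError
lenient = df.drop(columns=['one', 'missing'], errors='ignore')

print(list(by_keyword.columns))  # ['four', 'five']
print(list(lenient.columns))     # ['two', 'three', 'four', 'five']
```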

Drop Multiple columns with loc[] function

Here we can remove a range of adjacent columns at once by specifying the first and last column names in a loc[] slice.

Syntax is as follows:

df.drop(df.loc[:, 'column_start':'column_end'].columns, axis = 1)

where,

  • df is the input dataframe
  • column_start specifies the first column to be removed
  • column_end specifies the last column to be removed (unlike iloc[], a loc[] slice includes its end label)
  • axis = 1 specifies the column axis

Example: Here, we are going to remove the first two columns.

import pandas as pd

# Create dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first two columns using column names
df = df.drop(df.loc[:, 'one':'two'].columns, axis = 1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Modified dataframe: 

   three  four  five
0      0     0    34
1      0     0    56
2      0     0    54
3      0     0    56

Here, we removed the columns named ‘one’ and ‘two’.
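A label slice in loc[] includes both of its endpoints, so a range in the middle of the DataFrame can be removed in the same way. A small sketch, assuming the same DataFrame as above:

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# 'two':'four' selects 'two', 'three' and 'four' (both ends included)
to_drop = df.loc[:, 'two':'four'].columns
result = df.drop(to_drop, axis=1)

print(list(to_drop))         # ['two', 'three', 'four']
print(list(result.columns))  # ['one', 'five']
```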

Drop multiple columns from DataFrame by condition

Iterate over all column names and check the condition for each one. If the condition is True, delete that column using del. For example, let’s delete the columns whose names contain the string ‘one’ or ‘two’.

import pandas as pd

# Create dataframe with 4 rows and 5 columns
df= pd.DataFrame({'one':[0,0,55,0],
                  'two':[0,1,0,0],
                  'three':[0,0,0,0],
                  'four':[0,0,0,0],
                  'five':[34,56,54,56]})

print(df)


# Drop columns by condition
# Remove columns whose names contain the string 'one' or 'two'
# (iterate over a copy of the names, since df changes inside the loop)
for col in list(df.columns):
    if 'one' in col or 'two' in col:
        del df[col]

print('Modified Dataframe')

print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56

Modified Dataframe

   three  four  five
0      0     0    34
1      0     0    56
2      0     0    54
3      0     0    56

Here, we removed the columns whose names contain ‘one’ or ‘two’.
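The same idea can be sketched without deleting inside a loop: collect the matching names first and pass them to drop() in one call. The second part of the snippet is an illustrative extra that is not covered above: a condition on the values, here dropping columns that contain only zeros. It assumes the same DataFrame as in the other examples, and the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Name-based condition: collect the matching names, then drop them once
name_matches = [col for col in df.columns if 'one' in col or 'two' in col]
by_name = df.drop(columns=name_matches)

# Value-based condition (illustrative): True for columns holding only zeros
all_zero = (df == 0).all()
by_value = df.drop(columns=all_zero[all_zero].index)

print(list(by_name.columns))   # ['three', 'four', 'five']
print(list(by_value.columns))  # ['one', 'two', 'five']
```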

Summary

In this article, we discussed how to drop multiple columns from a DataFrame by index position, by name, or based on a condition.
