In this article, we will discuss how to drop multiple columns from a pandas DataFrame in Python.
Table of Contents
- Drop multiple columns from Pandas Dataframe by Index Positions
- Drop multiple columns from Pandas Dataframe by Column names
- Drop multiple columns from Pandas Dataframe by Conditions
A DataFrame is a data structure that stores data in rows and columns. We can create a DataFrame using the pandas.DataFrame() constructor.
Let’s create a dataframe with 4 rows and 5 columns:
import pandas as pd

# Create a Dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Drop multiple columns from DataFrame by index
Using drop() & Columns Attribute
In pandas, the DataFrame provides a drop() method that removes rows or columns from the given dataframe.
Syntax is as follows:
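df.drop(labels, axis)
where,
- df is the input dataframe
- labels is a single label or a list of labels to be removed
- axis specifies whether to remove rows (axis = 0) or columns (axis = 1)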
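As a quick illustration of the syntax, here is a minimal sketch on a smaller three-column dataframe (made up for this example). It assumes a reasonably recent pandas version, where drop() also accepts a columns keyword that means the same as passing the labels with axis = 1. By default drop() returns a new dataframe and leaves the original unchanged.

```python
import pandas as pd

# Small illustrative dataframe (a subset of the columns used in this article)
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'five': [34, 56, 54, 56]})

# axis=1 tells drop() that the labels are column names
by_axis = df.drop(['one', 'two'], axis=1)

# The columns= keyword expresses the same request
by_keyword = df.drop(columns=['one', 'two'])

# Both calls give the same result
print(by_axis.equals(by_keyword))

# drop() returned new dataframes, so df still has all three columns
print(list(df.columns))
```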
Using drop() with columns attribute
We are going to use the columns attribute along with the drop() function to delete multiple columns. Indexing df.columns with a list of positions returns the corresponding column names, which we then pass to drop() for deletion.
Syntax is as follows:
df.drop(df.columns[[indices]], axis = 1)
where, df is the input dataframe and the other parameters in this expression are:
- axis = 1 specifies that columns are to be removed
- indices are the positions of the columns to be removed
Here indexing starts with 0.
Example: In this example, we are going to drop the first three columns based on their indices 0, 1 and 2.
import pandas as pd

# Create dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first three columns using index
df = df.drop(df.columns[[0, 1, 2]], axis=1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Modified dataframe: 
   four  five
0     0    34
1     0    56
2     0    54
3     0    56
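The positions passed to df.columns do not have to be contiguous. The sketch below uses the same dataframe and drops the first and the last column. The negative index -1 for the last column is an addition for this example and relies on df.columns accepting NumPy-style negative positions.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Look up the names at position 0 (first column) and -1 (last column)
names_to_drop = df.columns[[0, -1]]
print(list(names_to_drop))

# Drop those columns; the three middle columns remain
result = df.drop(names_to_drop, axis=1)
print(list(result.columns))
```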
Using drop() & iloc[] attribute
We are going to use the iloc[] attribute to drop multiple columns from a pandas dataframe. Here we specify the positions of the columns to be dropped with a slice.
Syntax is as follows:
df.drop(df.iloc[:,start:end], axis = 1)
where, df is the input dataframe and the other parameters in this expression are:
- axis = 1 specifies that columns are to be removed
- start is the position of the first column to be removed, and end is one past the position of the last column to be removed (the slice excludes end)
Here indexing starts with 0.
Example: In this example, we are going to drop the first three columns based on their indices 0, 1 and 2, so the slice is 0:3.
import pandas as pd

# Create dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first three columns using index
df = df.drop(df.iloc[:, 0:3], axis=1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Modified dataframe: 
   four  five
0     0    34
1     0    56
2     0    54
3     0    56
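The iloc[] example passes the sliced dataframe itself to drop(). This appears to work because iterating over a DataFrame yields its column labels. A more explicit variant, sketched below as one possible alternative, takes .columns from the slice so that drop() receives the column names directly. The variable names are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# The slice 0:3 covers positions 0, 1 and 2; position 3 is excluded
to_drop = df.iloc[:, 0:3].columns
print(list(to_drop))

# Pass the names to drop() explicitly
result = df.drop(to_drop, axis=1)
print(list(result.columns))
```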
Drop multiple columns from DataFrame by column names
Drop Multiple columns by name using drop()
Here we can remove multiple columns at a time by specifying column names.
Syntax:
df.drop(['column1','column2',..........,'column n'], axis = 1)
where,
- df is the input dataframe
- the list contains the names of the columns to be removed
- axis = 1 specifies that columns are to be removed
Example: Here, we are going to remove the first three columns.
import pandas as pd

# Create dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first three columns using column names
df = df.drop(['one', 'two', 'three'], axis=1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Modified dataframe: 
   four  five
0     0    34
1     0    56
2     0    54
3     0    56
Here, we removed the columns named ‘one’, ‘two’ and ‘three’.
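When dropping by name, it helps to know what happens if one of the names is not present. The sketch below uses a made-up column name 'six' that does not exist in the dataframe. By default drop() raises a KeyError for it, while errors='ignore' skips the missing name and removes only the columns that exist.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# 'six' is a hypothetical name that is not a column of df
try:
    df.drop(['one', 'six'], axis=1)
    raised = False
except KeyError:
    raised = True
print(raised)

# errors='ignore' drops the names that exist and skips the rest
result = df.drop(['one', 'six'], axis=1, errors='ignore')
print(list(result.columns))
```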
Drop Multiple columns with loc[]
Here we can remove a range of columns at a time by specifying the first and last column names in a loc[] slice.
Syntax is as follows:
df.drop(df.loc[:, 'column_start':'column_end'].columns, axis = 1)
where,
- df is the input dataframe
- column_start specifies the starting column
- column_end specifies the ending column, which is included in the selection
- axis = 1 specifies the column axis
Example: Here, we are going to remove the first two columns.
import pandas as pd

# Create dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Display the Dataframe
print(df)

print('Modified dataframe: ')

# Remove first two columns using column names
df = df.drop(df.loc[:, 'one':'two'].columns, axis=1)

# Display the Dataframe
print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Modified dataframe: 
   three  four  five
0      0     0    34
1      0     0    56
2      0     0    54
3      0     0    56
Here, we removed the columns named ‘one’ and ‘two’.
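One detail worth checking is that a label slice with loc[] includes its end label, unlike a positional slice with iloc[]. The short sketch below, using the same dataframe, selects the range from 'two' to 'four' and drops it. The variable names are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Label slices include both end points, so 'four' is part of the range
label_range = df.loc[:, 'two':'four'].columns
print(list(label_range))

# Dropping the range leaves only the first and last columns
result = df.drop(label_range, axis=1)
print(list(result.columns))
```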
Drop multiple columns from DataFrame by condition
Iterate over all column names and check the condition for each column. If the condition is True, delete that column using del. For example, let’s delete the columns whose names include the string ‘one’ or ‘two’.
import pandas as pd

# Create dataframe with 4 rows and 5 columns
df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

print(df)

# Drop Columns by Condition
# Remove columns whose names contain the string 'one' or 'two'
for col in df.columns:
    if ('one' in col) or ('two' in col):
        del df[col]

print('Modified Dataframe')

print(df)

Output:

   one  two  three  four  five
0    0    0      0     0    34
1    0    1      0     0    56
2   55    0      0     0    54
3    0    0      0     0    56
Modified Dataframe
   three  four  five
0      0     0    34
1      0     0    56
2      0     0    54
3      0     0    56
Here, we removed the columns whose names contain ‘one’ or ‘two’.
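The same kind of condition can also be written without an explicit loop and without modifying the dataframe in place. The sketch below shows two possible alternatives. The first collects the matching names with a list comprehension and passes them to drop(). The second is a value-based condition, added here only as an illustration, that keeps the columns containing at least one non-zero value.

```python
import pandas as pd

df = pd.DataFrame({'one': [0, 0, 55, 0],
                   'two': [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four': [0, 0, 0, 0],
                   'five': [34, 56, 54, 56]})

# Condition on names: collect the columns whose name contains 'one' or 'two'
to_drop = [col for col in df.columns if 'one' in col or 'two' in col]
by_name = df.drop(to_drop, axis=1)
print(list(by_name.columns))

# Condition on values: keep only columns with at least one non-zero entry
has_nonzero = (df != 0).any(axis=0)
by_value = df.loc[:, has_nonzero]
print(list(by_value.columns))
```

Both variants return a new dataframe, so the original df keeps all five columns.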
Summary
In this article, we discussed how to drop multiple columns by index positions, by names, or based on conditions.