In this article we will discuss how to drop rows with all zeros in a pandas DataFrame.
A DataFrame is a data structure that stores data in rows and columns. We can create a DataFrame using the pandas.DataFrame() constructor. Let’s create a DataFrame with 4 rows and 4 columns:
import pandas as pd

# Create a DataFrame that contains zeros
df = pd.DataFrame({'one':   [0, 0, 55, 0],
                   'two':   [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four':  [0, 0, 0, 0]})

# Display the DataFrame
print(df)
Output:
   one  two  three  four
0    0    0      0     0
1    0    1      0     0
2   55    0      0     0
3    0    0      0     0
Here the dataframe contains 2 rows with all zeros, so we have to remove these rows from the dataframe.
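Before removing anything, it can help to confirm which rows contain only zeros. The following is a minimal sketch that rebuilds the same DataFrame; the variable names all_zero_rows, zero_row_count and zero_row_labels are chosen here for illustration only.

```python
import pandas as pd

# Same DataFrame as in the article
df = pd.DataFrame({'one':   [0, 0, 55, 0],
                   'two':   [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four':  [0, 0, 0, 0]})

# True for each row in which every column equals 0
all_zero_rows = (df == 0).all(axis=1)

# Summing a boolean Series counts its True values
zero_row_count = int(all_zero_rows.sum())

# Index labels of the rows that contain only zeros
zero_row_labels = all_zero_rows[all_zero_rows].index.tolist()

print(zero_row_count)   # 2
print(zero_row_labels)  # [0, 3]
```

The rows labelled 0 and 3 are the ones that the methods in the following sections remove.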
Remove rows with all zeros using loc[] in Dataframe
We can use the DataFrame.loc[] indexer to keep only the rows that are not all zeros. loc[] selects rows by index label or by a boolean mask; here we pass it a boolean mask that is True for every row we want to keep.
Syntax is as follows:
# Remove rows with all 0s in a Dataframe
df = df.loc[(df != 0).any(axis=1)]
where df is the input DataFrame and the parts of the expression passed to loc[] are:
- axis = 1 makes any() check across the columns of each row, so it returns one True or False value per row
- (df != 0) is the condition that marks every value other than 0 as True
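To make the parts of the expression above easier to follow, here is a small sketch that evaluates it step by step on the same data. The intermediate names non_zero, keep_mask and filtered are introduced here for illustration only.

```python
import pandas as pd

# Same DataFrame as in the article
df = pd.DataFrame({'one':   [0, 0, 55, 0],
                   'two':   [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four':  [0, 0, 0, 0]})

# Step 1: element-wise comparison gives a DataFrame of booleans
non_zero = df != 0

# Step 2: any(axis=1) collapses the columns into one boolean per row
keep_mask = non_zero.any(axis=1)
print(keep_mask.tolist())       # [False, True, True, False]

# Step 3: loc[] keeps only the rows whose mask value is True
filtered = df.loc[keep_mask]
print(filtered.index.tolist())  # [1, 2]
```

Writing the mask on its own line like this is optional, but it makes it easy to inspect which rows will be dropped before the filter is applied.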
Example: Drop rows with all zeros from the above dataframe
# Remove rows with all 0s in a Dataframe
df = df.loc[(df != 0).any(axis=1)]

# Display the Dataframe
print(df)
Output:
   one  two  three  four
1    0    1      0     0
2   55    0      0     0
Here the first and fourth rows contain only zeros, so the output consists of the second and third rows, each of which has at least one non-zero value.
Remove rows with all zeros using ~ operator
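We can use the ~ operator to invert a boolean condition. Here the condition checks whether all values in a row are equal to 0, and ~ flips the result so that only the remaining rows are selected.
Syntax is as follows: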
# Remove rows with all 0s in a Dataframe
df = df[~(df == 0).all(axis=1)]
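where df is the input DataFrame and the parts of the expression are:
- axis = 1 makes all() check across the columns of each row, so (df == 0).all(axis=1) is True for a row that contains only zeros
- ~ inverts that result, so ~(df == 0).all(axis=1) is True for every row that has at least one value other than 0
Finally, we pass this condition inside [] to select the matching rows.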
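The ~ condition selects the same rows as the loc[] approach from the previous section, because "not all values are zero" means the same as "at least one value is non-zero". The following sketch checks this on the example data; the names mask_any, mask_not_all, masks_match and frames_match are chosen here for illustration only.

```python
import pandas as pd

# Same DataFrame as in the article
df = pd.DataFrame({'one':   [0, 0, 55, 0],
                   'two':   [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four':  [0, 0, 0, 0]})

# Mask from the loc[] approach: at least one non-zero value in the row
mask_any = (df != 0).any(axis=1)

# Mask from the ~ approach: not every value in the row is zero
mask_not_all = ~(df == 0).all(axis=1)

# Both masks mark the same rows, so both filters return the same result
masks_match = mask_any.equals(mask_not_all)
frames_match = df.loc[mask_any].equals(df[mask_not_all])

print(masks_match)   # True
print(frames_match)  # True
```

Which form to use is largely a matter of readability, since both produce the same boolean mask.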
Example: Drop rows with all zeros from the above dataframe
# Remove rows with all 0s in a Dataframe
df = df[~(df == 0).all(axis=1)]

# Display the Dataframe
print(df)
Output:
   one  two  three  four
1    0    1      0     0
2   55    0      0     0
Here the first and fourth rows contain only zeros, so the output again consists of the second and third rows, each of which has at least one non-zero value.
The complete example is as follows:
import pandas as pd

# Create a DataFrame that contains zeros
df = pd.DataFrame({'one':   [0, 0, 55, 0],
                   'two':   [0, 1, 0, 0],
                   'three': [0, 0, 0, 0],
                   'four':  [0, 0, 0, 0]})

# Display the DataFrame
print(df)

print('*** Example 1 ****')

# Remove rows with all 0s using loc[]
mod = df.loc[(df != 0).any(axis=1)]

# Display the DataFrame
print(mod)

print('*** Example 2 ****')

# Remove rows with all 0s using the ~ operator
mod = df[~(df == 0).all(axis=1)]

# Display the DataFrame
print(mod)
Output:
   one  two  three  four
0    0    0      0     0
1    0    1      0     0
2   55    0      0     0
3    0    0      0     0
*** Example 1 ****
   one  two  three  four
1    0    1      0     0
2   55    0      0     0
*** Example 2 ****
   one  two  three  four
1    0    1      0     0
2   55    0      0     0
Summary:
We learned about two different ways to delete rows with all zero values from a Pandas Dataframe.