This article will discuss how to count the number of non-zero values in one or more Dataframe columns in Pandas.
Let’s first create a Dataframe from a list of tuples,
import pandas as pd import numpy as np # List of Tuples list_of_tuples = [ (11, 34, 0, 5, 11, 56), (12, np.NaN, 0, 7, 12, 0), (21, 0, 78, 0, 64, 0), (0, 0, 0, 63, 0, 45) , (0, 34, 11, 0, 56, 0), (12, 0, 12, 41, 0, 18)] # Create a DataFrame object df = pd.DataFrame( list_of_tuples, columns=['A', 'B', 'C', 'D', 'E', 'F']) print(df)
The contents of the Dataframe will be like this,
A B C D E F 0 11 34.0 0 5 11 56 1 12 NaN 0 7 12 0 2 21 0.0 78 0 64 0 3 0 0.0 0 63 0 45 4 0 34.0 11 0 56 0 5 12 0.0 12 41 0 18
This Dataframe has six columns, which contain certain numbers and few NaN values. Now let’s see how to count the number of non-zero values in any of the columns of this Dataframe.
Count non-zero values in a Dataframe column using Series.sum()
The steps are as follows,
- Select the Dataframe column by its name i.e., df[‘B’].
- Then apply a condition on it i.e. ( df[‘B’] != 0 ). It gives a bool Series object, where each True value indicates that the corresponding value in the column is non-zero.
- Call sum() function on this bool Series object. It will give the count of total non-zero values in it, and that will be equal to the count of non-zero values in the selected column.
Let’s use this logic to get the count of total zero values in column ‘B’ of the Dataframe,
# Get the count of non-Zeros values in column 'B' count = (df['B'] != 0).sum() print('Count of non zeros in Column B : ', count)
Output:
Frequently Asked:
- Select Rows from Pandas DataFrame based on column values
- Replace NaN values with empty string in Pandas
- Pandas: Get last N rows of dataframe
- Select Rows by column value in Pandas
Count of non zeros in Column B : 3
It will include NaN values too in while calculation. Therefore it returned three as the count of non-zero values in column ‘B’. What if we want to include only non-NaN values in counting?
Count non-zero & non NaN values in a Dataframe column
The steps are as follows,
- Select a subset of the Dataframe column as a Series object. This subset should contain only non-zero values.
- Then call the count() function on this Series object, and it will give the count of non-zero values in the Dataframe column.
Let’s use this logic to get the count of total non zero & non-NaN values in column ‘B’ of the Dataframe,
# Get the count of non-Zeros and non NaN values in column 'B' column = df['B'] count = column[column != 0].count() print('Count of non zeros & and non NaN in Column B : ', count)
Output:
Count of non zeros & and non NaN in Column B : 2
Count non-zero values in all Dataframe columns
Iterate over all column names of the Dataframe. For each column name, select the column and count the number of non-zeros in it using one of the previously mentioned techniques,
for column_name in df.columns: column = df[column_name] # Get the count of non-Zeros values in column count_of_non_zeros = (column != 0).sum() # Get the count of non-Zeros & non NaN values in column count_non_zeros_non_nan = column[column != 0].count() print( 'Count of non zeros in Column ', column_name, ' is : ', count_of_non_zeros) print( 'Count of non zeros & non NaN in Column ', column_name, ' is : ', count_of_non_zeros)
Output:
Count of non zeros in Column A is : 4 Count of non zeros & non NaN in Column A is : 4 Count of non zeros in Column B is : 3 Count of non zeros & non NaN in Column B is : 3 Count of non zeros in Column C is : 3 Count of non zeros & non NaN in Column C is : 3 Count of non zeros in Column D is : 4 Count of non zeros & non NaN in Column D is : 4 Count of non zeros in Column E is : 4 Count of non zeros & non NaN in Column E is : 4 Count of non zeros in Column F is : 3 Count of non zeros & non NaN in Column F is : 3
It printed the number of non-zeros & non-NaN values in all Dataframe columns.
The complete working example is as follows,
import pandas as pd import numpy as np # List of Tuples list_of_tuples = [ (11, 34, 0, 5, 11, 56), (12, np.NaN, 0, 7, 12, 0), (21, 0, 78, 0, 64, 0), (0, 0, 0, 63, 0, 45) , (0, 34, 11, 0, 56, 0), (12, 0, 12, 41, 0, 18)] # Create a DataFrame object df = pd.DataFrame( list_of_tuples, columns=['A', 'B', 'C', 'D', 'E', 'F']) print(df) # Get the count of non-Zeros values in column 'B' count = (df['B'] != 0).sum() print('Count of non zeros in Column B : ', count) # Get the count of non-Zeros and non NaN values in column 'B' column = df['B'] count = column[column != 0].count() print('Count of non zeros & and non NaN in Column B : ', count) ''' Get count of all non zero values inn each of the Dataframe column ''' for column_name in df.columns: column = df[column_name] # Get the count of non-Zeros values in column count_of_non_zeros = (column != 0).sum() # Get the count of non-Zeros & non NaN values in column count_non_zeros_non_nan = column[column != 0].count() print( 'Count of non zeros in Column ', column_name, ' is : ', count_of_non_zeros) print( 'Count of non zeros & non NaN in Column ', column_name, ' is : ', count_of_non_zeros)
Output:
A B C D E F 0 11 34.0 0 5 11 56 1 12 NaN 0 7 12 0 2 21 0.0 78 0 64 0 3 0 0.0 0 63 0 45 4 0 34.0 11 0 56 0 5 12 0.0 12 41 0 18 Count of non zeros in Column B : 3 Count of non zeros & and non NaN in Column B : 2 Count of non zeros in Column A is : 4 Count of non zeros & non NaN in Column A is : 4 Count of non zeros in Column B is : 3 Count of non zeros & non NaN in Column B is : 3 Count of non zeros in Column C is : 3 Count of non zeros & non NaN in Column C is : 3 Count of non zeros in Column D is : 4 Count of non zeros & non NaN in Column D is : 4 Count of non zeros in Column E is : 4 Count of non zeros & non NaN in Column E is : 4 Count of non zeros in Column F is : 3 Count of non zeros & non NaN in Column F is : 3
Summary
Today we learned about the different ways to count non-zero values in Dataframe columns in Pandas.