Pandas | Count non-zero values in Dataframe Column

This article will discuss how to count the number of non-zero values in one or more Dataframe columns in Pandas.

Let’s first create a Dataframe from a list of tuples,

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (11, 34,     0,  5,  11, 56),
                    (12, np.NaN, 0,  7,  12, 0),
                    (21, 0,      78, 0,  64, 0),
                    (0,  0,      0,  63, 0,  45) ,
                    (0,  34,     11, 0,  56, 0),
                    (12, 0,      12, 41, 0,  18)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

The contents of the Dataframe will be like this,

    A     B   C   D   E   F
0  11  34.0   0   5  11  56
1  12   NaN   0   7  12   0
2  21   0.0  78   0  64   0
3   0   0.0   0  63   0  45
4   0  34.0  11   0  56   0
5  12   0.0  12  41   0  18

This Dataframe has six columns, which contain certain numbers and few NaN values. Now let’s see how to count the number of non-zero values in any of the columns of this Dataframe.

Count non-zero values in a Dataframe column using Series.sum()

The steps are as follows,

Advertisements
  • Select the Dataframe column by its name i.e., df[‘B’].
  • Then apply a condition on it i.e. ( df[‘B’] != 0 ). It gives a bool Series object, where each True value indicates that the corresponding value in the column is non-zero.
  • Call sum() function on this bool Series object. It will give the count of total non-zero values in it, and that will be equal to the count of non-zero values in the selected column.

Let’s use this logic to get the count of total zero values in column ‘B’ of the Dataframe,

# Get the count of non-Zeros values in column 'B'  
count = (df['B'] != 0).sum()

print('Count of non zeros in Column  B : ', count)

Output:

Count of non zeros in Column  B :  3

It will include NaN values too in while calculation. Therefore it returned three as the count of non-zero values in column ‘B’. What if we want to include only non-NaN values in counting?

Count non-zero & non NaN values in a Dataframe column

The steps are as follows,

  • Select a subset of the Dataframe column as a Series object. This subset should contain only non-zero values.
  • Then call the count() function on this Series object, and it will give the count of non-zero values in the Dataframe column.

Let’s use this logic to get the count of total non zero & non-NaN values in column ‘B’ of the Dataframe,

# Get the count of non-Zeros and non NaN values in column 'B'  
column = df['B'] 
count = column[column != 0].count()

print('Count of non zeros & and non NaN in Column  B : ', count)

Output:

Count of non zeros & and non NaN in Column  B :  2

Count non-zero values in all Dataframe columns

Iterate over all column names of the Dataframe. For each column name, select the column and count the number of non-zeros in it using one of the previously mentioned techniques,

for column_name in df.columns:
    column = df[column_name]
    # Get the count of non-Zeros values in column
    count_of_non_zeros = (column != 0).sum()
    # Get the count of non-Zeros & non NaN values in column
    count_non_zeros_non_nan = column[column != 0].count()
    
    print(  'Count of non zeros in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)
    
    print(  'Count of non zeros & non NaN in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)

Output:

Count of non zeros in Column  A  is :  4
Count of non zeros & non NaN in Column  A  is :  4
Count of non zeros in Column  B  is :  3
Count of non zeros & non NaN in Column  B  is :  3
Count of non zeros in Column  C  is :  3
Count of non zeros & non NaN in Column  C  is :  3
Count of non zeros in Column  D  is :  4
Count of non zeros & non NaN in Column  D  is :  4
Count of non zeros in Column  E  is :  4
Count of non zeros & non NaN in Column  E  is :  4
Count of non zeros in Column  F  is :  3
Count of non zeros & non NaN in Column  F  is :  3

It printed the number of non-zeros & non-NaN values in all Dataframe columns.

The complete working example is as follows,

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (11, 34,     0,  5,  11, 56),
                    (12, np.NaN, 0,  7,  12, 0),
                    (21, 0,      78, 0,  64, 0),
                    (0,  0,      0,  63, 0,  45) ,
                    (0,  34,     11, 0,  56, 0),
                    (12, 0,      12, 41, 0,  18)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

# Get the count of non-Zeros values in column 'B'  
count = (df['B'] != 0).sum()

print('Count of non zeros in Column  B : ', count)


# Get the count of non-Zeros and non NaN values in column 'B'  
column = df['B'] 
count = column[column != 0].count()

print('Count of non zeros & and non NaN in Column  B : ', count)

'''
Get count of all non zero values inn each of the Dataframe column
'''

for column_name in df.columns:
    column = df[column_name]
    # Get the count of non-Zeros values in column
    count_of_non_zeros = (column != 0).sum()
    # Get the count of non-Zeros & non NaN values in column
    count_non_zeros_non_nan = column[column != 0].count()
    
    print(  'Count of non zeros in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)
    
    print(  'Count of non zeros & non NaN in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)

Output:

    A     B   C   D   E   F
0  11  34.0   0   5  11  56
1  12   NaN   0   7  12   0
2  21   0.0  78   0  64   0
3   0   0.0   0  63   0  45
4   0  34.0  11   0  56   0
5  12   0.0  12  41   0  18

Count of non zeros in Column  B :  3
Count of non zeros & and non NaN in Column  B :  2
Count of non zeros in Column  A  is :  4

Count of non zeros & non NaN in Column  A  is :  4
Count of non zeros in Column  B  is :  3
Count of non zeros & non NaN in Column  B  is :  3
Count of non zeros in Column  C  is :  3
Count of non zeros & non NaN in Column  C  is :  3
Count of non zeros in Column  D  is :  4
Count of non zeros & non NaN in Column  D  is :  4
Count of non zeros in Column  E  is :  4
Count of non zeros & non NaN in Column  E  is :  4
Count of non zeros in Column  F  is :  3
Count of non zeros & non NaN in Column  F  is :  3

Summary

Today we learned about the different ways to count non-zero values in Dataframe columns in Pandas.

Advertisements

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top