Pandas | Count non-zero values in Dataframe Column

This article will discuss how to count the number of non-zero values in one or more Dataframe columns in Pandas.

Let’s first create a Dataframe from a list of tuples,

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (11, 34,     0,  5,  11, 56),
                    (12, np.NaN, 0,  7,  12, 0),
                    (21, 0,      78, 0,  64, 0),
                    (0,  0,      0,  63, 0,  45) ,
                    (0,  34,     11, 0,  56, 0),
                    (12, 0,      12, 41, 0,  18)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

The contents of the Dataframe will be like this,

    A     B   C   D   E   F
0  11  34.0   0   5  11  56
1  12   NaN   0   7  12   0
2  21   0.0  78   0  64   0
3   0   0.0   0  63   0  45
4   0  34.0  11   0  56   0
5  12   0.0  12  41   0  18

This Dataframe has six columns, which contain certain numbers and few NaN values. Now let’s see how to count the number of non-zero values in any of the columns of this Dataframe.

Count non-zero values in a Dataframe column using Series.sum()

The steps are as follows,

Advertisements
  • Select the Dataframe column by its name i.e., df[‘B’].
  • Then apply a condition on it i.e. ( df[‘B’] != 0 ). It gives a bool Series object, where each True value indicates that the corresponding value in the column is non-zero.
  • Call sum() function on this bool Series object. It will give the count of total non-zero values in it, and that will be equal to the count of non-zero values in the selected column.

Let’s use this logic to get the count of total zero values in column ‘B’ of the Dataframe,

# Get the count of non-Zeros values in column 'B'  
count = (df['B'] != 0).sum()

print('Count of non zeros in Column  B : ', count)

Output:

Count of non zeros in Column  B :  3

It will include NaN values too in while calculation. Therefore it returned three as the count of non-zero values in column ‘B’. What if we want to include only non-NaN values in counting?

Count non-zero & non NaN values in a Dataframe column

The steps are as follows,

  • Select a subset of the Dataframe column as a Series object. This subset should contain only non-zero values.
  • Then call the count() function on this Series object, and it will give the count of non-zero values in the Dataframe column.

Let’s use this logic to get the count of total non zero & non-NaN values in column ‘B’ of the Dataframe,

# Get the count of non-Zeros and non NaN values in column 'B'  
column = df['B'] 
count = column[column != 0].count()

print('Count of non zeros & and non NaN in Column  B : ', count)

Output:

Count of non zeros & and non NaN in Column  B :  2

Count non-zero values in all Dataframe columns

Iterate over all column names of the Dataframe. For each column name, select the column and count the number of non-zeros in it using one of the previously mentioned techniques,

for column_name in df.columns:
    column = df[column_name]
    # Get the count of non-Zeros values in column
    count_of_non_zeros = (column != 0).sum()
    # Get the count of non-Zeros & non NaN values in column
    count_non_zeros_non_nan = column[column != 0].count()
    
    print(  'Count of non zeros in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)
    
    print(  'Count of non zeros & non NaN in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)

Output:

Count of non zeros in Column  A  is :  4
Count of non zeros & non NaN in Column  A  is :  4
Count of non zeros in Column  B  is :  3
Count of non zeros & non NaN in Column  B  is :  3
Count of non zeros in Column  C  is :  3
Count of non zeros & non NaN in Column  C  is :  3
Count of non zeros in Column  D  is :  4
Count of non zeros & non NaN in Column  D  is :  4
Count of non zeros in Column  E  is :  4
Count of non zeros & non NaN in Column  E  is :  4
Count of non zeros in Column  F  is :  3
Count of non zeros & non NaN in Column  F  is :  3

It printed the number of non-zeros & non-NaN values in all Dataframe columns.

The complete working example is as follows,

import pandas as pd
import numpy as np

# List of Tuples
list_of_tuples = [  (11, 34,     0,  5,  11, 56),
                    (12, np.NaN, 0,  7,  12, 0),
                    (21, 0,      78, 0,  64, 0),
                    (0,  0,      0,  63, 0,  45) ,
                    (0,  34,     11, 0,  56, 0),
                    (12, 0,      12, 41, 0,  18)]


# Create a DataFrame object
df = pd.DataFrame(  list_of_tuples, 
                    columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(df)

# Get the count of non-Zeros values in column 'B'  
count = (df['B'] != 0).sum()

print('Count of non zeros in Column  B : ', count)


# Get the count of non-Zeros and non NaN values in column 'B'  
column = df['B'] 
count = column[column != 0].count()

print('Count of non zeros & and non NaN in Column  B : ', count)

'''
Get count of all non zero values inn each of the Dataframe column
'''

for column_name in df.columns:
    column = df[column_name]
    # Get the count of non-Zeros values in column
    count_of_non_zeros = (column != 0).sum()
    # Get the count of non-Zeros & non NaN values in column
    count_non_zeros_non_nan = column[column != 0].count()
    
    print(  'Count of non zeros in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)
    
    print(  'Count of non zeros & non NaN in Column ',
            column_name,
            ' is : ',
            count_of_non_zeros)

Output:

    A     B   C   D   E   F
0  11  34.0   0   5  11  56
1  12   NaN   0   7  12   0
2  21   0.0  78   0  64   0
3   0   0.0   0  63   0  45
4   0  34.0  11   0  56   0
5  12   0.0  12  41   0  18

Count of non zeros in Column  B :  3
Count of non zeros & and non NaN in Column  B :  2
Count of non zeros in Column  A  is :  4

Count of non zeros & non NaN in Column  A  is :  4
Count of non zeros in Column  B  is :  3
Count of non zeros & non NaN in Column  B  is :  3
Count of non zeros in Column  C  is :  3
Count of non zeros & non NaN in Column  C  is :  3
Count of non zeros in Column  D  is :  4
Count of non zeros & non NaN in Column  D  is :  4
Count of non zeros in Column  E  is :  4
Count of non zeros & non NaN in Column  E  is :  4
Count of non zeros in Column  F  is :  3
Count of non zeros & non NaN in Column  F  is :  3

Summary

Today we learned about the different ways to count non-zero values in Dataframe columns in Pandas.

Pandas Tutorials -Learn Data Analysis with Python

   

Are you looking to make a career in Data Science with Python?

Data Science is the future, and the future is here now. Data Scientists are now the most sought-after professionals today. To become a good Data Scientist or to make a career switch in Data Science one must possess the right skill set. We have curated a list of Best Professional Certificate in Data Science with Python. These courses will teach you the programming tools for Data Science like Pandas, NumPy, Matplotlib, Seaborn and how to use these libraries to implement Machine learning models.

Checkout the Detailed Review of Best Professional Certificate in Data Science with Python.

Remember, Data Science requires a lot of patience, persistence, and practice. So, start learning today.

Join a LinkedIn Community of Python Developers

Leave a Comment

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top