In this article, we will discuss different ways to count all the rows in a pandas DataFrame, or only the rows that satisfy a condition.
First, let’s create a DataFrame:
import pandas as pd
import numpy as np

# List of tuples, one per employee
employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, np.nan, 11),
             ('Mohit', np.nan, 'Delhi', 15),
             ('Veena', 33, 'Delhi', 4),
             ('Shaunak', 35, 'Mumbai', np.nan),
             ('Shaun', 35, 'Colombo', 11)]

# Create a DataFrame object
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])
The contents of the DataFrame empDfObj are:
      Name   Age     City  Experience
a     jack  34.0   Sydney         5.0
b     Riti  31.0    Delhi         7.0
c     Aadi  16.0      NaN        11.0
d    Mohit   NaN    Delhi        15.0
e    Veena  33.0    Delhi         4.0
f  Shaunak  35.0   Mumbai         NaN
g    Shaun  35.0  Colombo        11.0
Now let’s discuss different ways to count rows in this dataframe.
Count all rows in a Pandas DataFrame using DataFrame.shape
DataFrame.shape
Every DataFrame object has an attribute shape, a tuple that holds the dimensions of the DataFrame:
(number_of_rows, number_of_columns)
The first element of this tuple is the number of entries in the index, which is the number of rows in the DataFrame. Let’s use it to count the rows of the DataFrame created above:
# The first element of the tuple returned by shape is the number of rows
numOfRows = empDfObj.shape[0]
print('Number of Rows in dataframe : ' , numOfRows)
Output:
Number of Rows in dataframe : 7
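Because shape is an ordinary tuple, both dimensions can also be unpacked in one step. The following is a minimal sketch on a small made-up DataFrame; the names df, num_rows and num_cols are illustrative and do not come from the example above.

```python
import pandas as pd

# Hypothetical three-row DataFrame, used only for illustration
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Aadi'],
                   'Age': [34, 31, 16]})

# shape is a (rows, columns) tuple, so it can be unpacked directly
num_rows, num_cols = df.shape

print('Rows    :', num_rows)
print('Columns :', num_cols)
```

Unpacking is convenient when both the row count and the column count are needed at the same time.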
Count all rows in a Pandas DataFrame using DataFrame.index
DataFrame.index
Every DataFrame object has an attribute index that holds the sequence of row labels. The length of that sequence is the number of rows in the DataFrame:
# Get the row count of the dataframe from the length of its index
numOfRows = len(empDfObj.index)
print('Number of Rows in dataframe : ' , numOfRows)
Output:
Number of Rows in dataframe : 7
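As a quick cross-check, the length of the index, the built-in len() of the DataFrame itself, and the first element of shape should all agree, including for a DataFrame with no rows. This is a small sketch on made-up data; df, empty and the count_* names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical DataFrame with custom row labels, used only for illustration
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Aadi'],
                   'Age': [34, 31, 16]},
                  index=['a', 'b', 'c'])

count_from_index = len(df.index)   # length of the row labels
count_from_len = len(df)           # len() of a DataFrame is the length of its index
count_from_shape = df.shape[0]     # first element of the shape tuple

print(count_from_index, count_from_len, count_from_shape)

# A filter that matches nothing leaves an empty index, so the count is 0
empty = df[df['Age'] > 100]
print(len(empty.index))
```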
Count rows in a Pandas DataFrame that satisfy a condition using DataFrame.apply()
Using DataFrame.apply() with axis=1, we can call a function on every row of a DataFrame to check whether that row satisfies a condition.
The result is a boolean Series with one entry per row. Counting the True values in that Series gives the number of rows that satisfy the condition.
Let’s see some examples.
Example 1:
Count the number of rows in which the ‘Age’ column contains a value greater than 30:
# Get a bool series representing which rows satisfy the condition, i.e. True
# for each row in which the value of the 'Age' column is more than 30
seriesObj = empDfObj.apply(lambda x: True if x['Age'] > 30 else False, axis=1)

# Count the number of True values in the series
numOfRows = len(seriesObj[seriesObj == True].index)
print('Number of Rows in dataframe in which Age > 30 : ', numOfRows)
Output:
Number of Rows in dataframe in which Age > 30 : 5
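For a condition on a single column, the same count can usually be obtained without apply() by comparing the column directly and summing the resulting boolean Series. This is an alternative sketch, not the approach used in the example above. It rebuilds the same data so that it runs on its own, and the names mask and count are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the article's DataFrame, rebuilt so the sketch is self-contained
employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, np.nan, 11),
             ('Mohit', np.nan, 'Delhi', 15),
             ('Veena', 33, 'Delhi', 4),
             ('Shaunak', 35, 'Mumbai', np.nan),
             ('Shaun', 35, 'Colombo', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# Element-wise comparison gives a boolean Series; NaN > 30 evaluates to False
mask = empDfObj['Age'] > 30

# True counts as 1 and False as 0, so the sum is the number of matching rows
count = int(mask.sum())
print('Number of Rows in dataframe in which Age > 30 : ', count)
```

With this data the count is 5, matching the apply() result. The comparison runs on the whole column at once, which tends to be faster than calling a Python function per row.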
Example 2:
Count the number of rows that contain the value 11 in any column:
# Count the number of rows in a dataframe that contain the value 11 in any column
seriesObj = empDfObj.apply(lambda x: True if 11 in list(x) else False, axis=1)
numOfRows = len(seriesObj[seriesObj == True].index)
print('Number of Rows in dataframe which contain 11 in any column : ', numOfRows)
Output:
Number of Rows in dataframe which contain 11 in any column : 2
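A possible apply()-free variant of this check compares every cell with 11 and then asks, row by row, whether any cell matched. This is a sketch under the assumption that comparing the string columns with a number is acceptable here (those cells simply compare as False). The data is rebuilt so the block runs on its own, and the names matches and count are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the article's DataFrame, rebuilt so the sketch is self-contained
employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, np.nan, 11),
             ('Mohit', np.nan, 'Delhi', 15),
             ('Veena', 33, 'Delhi', 4),
             ('Shaunak', 35, 'Mumbai', np.nan),
             ('Shaun', 35, 'Colombo', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# eq(11) compares every cell with 11; any(axis=1) is True for a row
# if at least one of its cells matched
matches = empDfObj.eq(11).any(axis=1)

count = int(matches.sum())
print('Number of Rows in dataframe which contain 11 in any column : ', count)
```

With this data the count is 2 (rows c and g, through the Experience column).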
Example 3:
Count the number of rows that contain NaN in any column:
# Count the number of rows in a dataframe that contain NaN in any column
seriesObj = empDfObj.apply(lambda x: x.isnull().any(), axis=1)
numOfRows = len(seriesObj[seriesObj == True].index)
print('Number of Rows in dataframe which contain NaN in any column : ', numOfRows)
Output:
Number of Rows in dataframe which contain NaN in any column : 3
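The NaN check can also be written directly on the DataFrame, since isna() works on the whole frame at once. The sketch below is an alternative to the apply() version above, not a replacement prescribed by the article. The data is rebuilt so it runs on its own, and the names has_nan and count are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the article's DataFrame, rebuilt so the sketch is self-contained
employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, np.nan, 11),
             ('Mohit', np.nan, 'Delhi', 15),
             ('Veena', 33, 'Delhi', 4),
             ('Shaunak', 35, 'Mumbai', np.nan),
             ('Shaun', 35, 'Colombo', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# isna() marks every missing cell; any(axis=1) flags rows with at least one
has_nan = empDfObj.isna().any(axis=1)

count = int(has_nan.sum())
print('Number of Rows in dataframe which contain NaN in any column : ', count)
```

With this data the count is 3 (rows c, d and f).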
The complete example is as follows:
import pandas as pd
import numpy as np


def main():
    print('Create a Dataframe')

    # List of tuples, one per employee
    employees = [('jack', 34, 'Sydney', 5),
                 ('Riti', 31, 'Delhi', 7),
                 ('Aadi', 16, np.nan, 11),
                 ('Mohit', np.nan, 'Delhi', 15),
                 ('Veena', 33, 'Delhi', 4),
                 ('Shaunak', 35, 'Mumbai', np.nan),
                 ('Shaun', 35, 'Colombo', 11)]

    # Create a DataFrame object
    empDfObj = pd.DataFrame(employees,
                            columns=['Name', 'Age', 'City', 'Experience'],
                            index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

    print("Contents of the Dataframe : ")
    print(empDfObj)

    print('**** Get the row count of a Dataframe using Dataframe.shape')

    # The first element of the tuple returned by shape is the number of rows
    numOfRows = empDfObj.shape[0]
    print('Number of Rows in dataframe : ' , numOfRows)

    print('**** Get the row count of a Dataframe using Dataframe.index')

    # Get the row count of the dataframe from the length of its index
    numOfRows = len(empDfObj.index)
    print('Number of Rows in dataframe : ' , numOfRows)

    print('**** Count Number of Rows in dataframe that satisfy a condition ****')

    # Get a bool series representing which rows satisfy the condition, i.e. True
    # for each row in which the value of the 'Age' column is more than 30
    seriesObj = empDfObj.apply(lambda x: True if x['Age'] > 30 else False, axis=1)

    # Count the number of True values in the series
    numOfRows = len(seriesObj[seriesObj == True].index)
    print('Number of Rows in dataframe in which Age > 30 : ', numOfRows)

    print('**** Count Number of Rows in dataframe that contains a value ****')

    # Count the number of rows in a dataframe that contain the value 11 in any column
    seriesObj = empDfObj.apply(lambda x: True if 11 in list(x) else False, axis=1)
    numOfRows = len(seriesObj[seriesObj == True].index)
    print('Number of Rows in dataframe which contain 11 in any column : ', numOfRows)

    print('**** Count Number of Rows in dataframe that contains NaN ****')

    # Count the number of rows in a dataframe that contain NaN in any column
    seriesObj = empDfObj.apply(lambda x: x.isnull().any(), axis=1)
    numOfRows = len(seriesObj[seriesObj == True].index)
    print('Number of Rows in dataframe which contain NaN in any column : ', numOfRows)


if __name__ == '__main__':
    main()
Output:
Create a Dataframe
Contents of the Dataframe :
      Name   Age     City  Experience
a     jack  34.0   Sydney         5.0
b     Riti  31.0    Delhi         7.0
c     Aadi  16.0      NaN        11.0
d    Mohit   NaN    Delhi        15.0
e    Veena  33.0    Delhi         4.0
f  Shaunak  35.0   Mumbai         NaN
g    Shaun  35.0  Colombo        11.0
**** Get the row count of a Dataframe using Dataframe.shape
Number of Rows in dataframe : 7
**** Get the row count of a Dataframe using Dataframe.index
Number of Rows in dataframe : 7
**** Count Number of Rows in dataframe that satisfy a condition ****
Number of Rows in dataframe in which Age > 30 : 5
**** Count Number of Rows in dataframe that contains a value ****
Number of Rows in dataframe which contain 11 in any column : 2
**** Count Number of Rows in dataframe that contains NaN ****
Number of Rows in dataframe which contain NaN in any column : 3
hi, thanks, good examples!
In Example 1: “Count the number of rows in a dataframe for which ‘Age’ column contains value more than 30 i.e.” Is there a way to get the cumulative count for each row?
I have a similar problem where I want to count all “A” values in column “Result”, but I want to know the running count at each row, something like this:
Result A_count
C 0
B 0
A 1
B 1
A 2
and so on…
Thanks
Very useful!