In this article we will discuss six different techniques to iterate over a dataframe row by row. Then we will also discuss how to update the contents of a Dataframe while iterating over it row by row.

Suppose we have a dataframe i.e
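As a minimal sketch, assuming a small sample dataframe (the variable name df, the column names, the values and the index labels below are made up for illustration), it could be created like this:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
employees = [("jack", 34, "Sydney", 5),
             ("Riti", 31, "Delhi", 7),
             ("Aadi", 16, "New York", 11)]

# Build the dataframe with explicit column names and index labels
df = pd.DataFrame(employees,
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

print(df)
```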

Contents of the created dataframe are,

Let’s see different ways to iterate over the rows of this dataframe,

Iterate over rows of a dataframe using DataFrame.iterrows()

Dataframe class provides a member function iterrows() i.e.

DataFrame.iterrows()

It returns an iterator that can be used to iterate over all the rows of a dataframe. For each row it yields a tuple containing the index label and the row contents as a Series.

Let’s iterate over all the rows of above created dataframe using iterrows() i.e.
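A minimal sketch of such a loop, assuming a small made-up dataframe (the name df and its columns and values are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7),
                   ("Aadi", 16, "New York", 11)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

rows_seen = []
for index_label, row in df.iterrows():
    # row is a Series whose index holds the column names
    print(index_label, row["Name"], row["Age"])
    rows_seen.append((index_label, row["Name"]))
```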

Output:

Important points about Dataframe.iterrows()

  • Does not preserve data types:
    • iterrows() returns each row as a Series, and a Series has a single dtype, so values from columns with different dtypes may be upcast (for example, an integer next to a float becomes a float).
  • Do not modify rows while iterating with iterrows(). Depending on the data types, the iterator returns a copy rather than a view, so changes made to the returned row may have no effect on the actual dataframe.
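Both points can be checked with a short sketch. The dataframe here is a made-up one with an integer column and a float column, chosen only because mixing those two types makes the upcast visible:

```python
import pandas as pd

# Hypothetical data: one integer column and one float column
df = pd.DataFrame({"Count": [1, 2], "Ratio": [0.5, 1.5]})

# Take the first (label, row) pair produced by iterrows()
first_label, first_row = next(df.iterrows())

# The row Series has one dtype for all values, so the integer is upcast
print(df["Count"].dtype)   # an integer dtype
print(first_row.dtype)     # float64

# Writing to the returned row changes only the copy held in the loop
for _, row in df.iterrows():
    row["Count"] = 100

print(df)  # the Count column still holds the original values
```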

Iterate over rows of a dataframe using DataFrame.itertuples()

Dataframe class provides a member function itertuples() i.e.

DataFrame.itertuples()

For each row it yields a named tuple containing the index label and the value of every column for that row, with the column names as field names. Let’s use it to iterate over all the rows of above created dataframe i.e.
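A minimal sketch, assuming a small made-up dataframe (the name df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7),
                   ("Aadi", 16, "New York", 11)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

names = []
for row in df.itertuples():
    # Each row is a named tuple; columns are available as attributes
    print(row)
    names.append(row.Name)
```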

Output:

For every row in the dataframe a named tuple is returned. From the named tuple you can access the individual values by position or by field name.
To access the 1st value, i.e. the value with tag ‘Index’, use position 0,

To access the 2nd value i.e. value with tag ‘Name’ use
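As a sketch of both kinds of access, assuming a small made-up dataframe (df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b"])

# Take the first named tuple produced by itertuples()
first_row = next(df.itertuples())

index_value = first_row[0]   # 1st value, same as first_row.Index
name_value = first_row[1]    # 2nd value, same as first_row.Name

print(index_value, name_value)
```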

Named Tuples without index 

If we don’t want the index to be included in these named tuples, we can pass the argument index=False i.e.
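A minimal sketch, assuming a small made-up dataframe (df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b"])

# index=False leaves the index label out of each named tuple
rows = list(df.itertuples(index=False))

for row in rows:
    print(row)
```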

Output:

Named Tuples with custom names

By default the returned named tuple is called Pandas. We can provide a custom name by passing the name argument i.e.
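A minimal sketch, assuming a small made-up dataframe and the illustrative tuple name "Employee" (both are assumptions, not taken from the original listing):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b"])

tuple_type_names = []
for row in df.itertuples(name="Employee"):
    # The named tuple class is now called Employee instead of Pandas
    print(row)
    tuple_type_names.append(type(row).__name__)
```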

Output:

Iterate over rows in dataframe as dictionary

We can also iterate over the rows of a dataframe and convert each one to a dictionary, so that values can be accessed by column label. This uses the same itertuples(), together with the _asdict() method of the named tuple i.e.
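A minimal sketch, assuming a small made-up dataframe (df and its contents are illustrative assumptions) and relying on the _asdict() method that named tuples provide:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b"])

row_dicts = []
for row in df.itertuples():
    # Convert the named tuple to a dictionary keyed by field name
    row_dict = row._asdict()
    print(row_dict["Name"], row_dict["City"])
    row_dicts.append(row_dict)
```

Because the index is included by default, each dictionary also carries an "Index" key.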

Output:

Iterate over rows in dataframe using index position and iloc

We can calculate the number of rows in a dataframe, then loop from position 0 to the last row and access each row by its integer position using iloc[] i.e.
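A minimal sketch, assuming a small made-up dataframe (df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7),
                   ("Aadi", 16, "New York", 11)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

ages = []
# len(df) is the number of rows; positions run from 0 to len(df) - 1
for position in range(len(df)):
    row = df.iloc[position]   # row at this integer position, as a Series
    print(position, row["Name"], row["Age"])
    ages.append(row["Age"])
```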

Output:

Iterate over rows in dataframe in reverse using index position and iloc

Get the number of rows in a dataframe, then loop from the last position down to position 0 and access each row by its integer position using iloc[] i.e.
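A minimal sketch, assuming a small made-up dataframe (df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7),
                   ("Aadi", 16, "New York", 11)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

names_reversed = []
# Count down from the last position (len(df) - 1) to 0
for position in range(len(df) - 1, -1, -1):
    row = df.iloc[position]
    print(position, row["Name"])
    names_reversed.append(row["Name"])
```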

Output:

Iterate over rows in dataframe using index labels and loc[]

Dataframe.index returns a sequence of index labels, so we can iterate over those labels and access each row by its index label using loc[] i.e.
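A minimal sketch, assuming a small made-up dataframe with string index labels (df and its contents are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame([("jack", 34, "Sydney", 5),
                   ("Riti", 31, "Delhi", 7),
                   ("Aadi", 16, "New York", 11)],
                  columns=["Name", "Age", "City", "Experience"],
                  index=["a", "b", "c"])

cities_by_label = {}
for label in df.index:
    row = df.loc[label]   # row with this index label, as a Series
    print(label, row["Name"], row["City"])
    cities_by_label[label] = row["City"]
```

This sketch assumes the index labels are unique. With duplicate labels, loc[] returns a dataframe of all matching rows instead of a single Series.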

Output:

Update contents of a dataframe while iterating row by row

The row returned by Dataframe.iterrows() is generally a copy of the dataframe contents, so updating it will have no effect on the actual dataframe. To update the contents of the dataframe, we iterate over its rows using iterrows() and then write to each row through the dataframe itself using at[], addressing the cell by index label and column name.

Let’s see an example,

Suppose we have a dataframe i.e

Contents of the created dataframe salaryDfObj are,

Let’s update each value in column ‘Bonus’ by multiplying it with 2 while iterating over the dataframe row by row i.e.
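A minimal sketch of this update. Only the name salaryDfObj and the ‘Bonus’ column come from the text; the other columns and all values are assumptions made up for illustration:

```python
import pandas as pd

# Hypothetical salary data; only the 'Bonus' column is named in the text
salaryDfObj = pd.DataFrame({"ID": [11, 12, 13],
                            "Experience": [5, 7, 11],
                            "Salary": [70000, 72200, 84999],
                            "Bonus": [1000, 1100, 1000]})

for index_label, row in salaryDfObj.iterrows():
    # Write through the dataframe with at[], not through the row copy
    salaryDfObj.at[index_label, "Bonus"] = row["Bonus"] * 2

print(salaryDfObj)
```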

Output:

Complete example is as follows,

Output:

 
