In this article we will discuss six different techniques to iterate over a dataframe row by row. We will then see how to update the contents of a dataframe while iterating over it row by row.

Suppose we have the following dataframe,

import pandas as pd

# List of tuples, one per employee
employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]

# Create a DataFrame object
empDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])

Contents of the created dataframe are,
   Name  Age      City  Experience
a  jack   34    Sydney           5
b  Riti   31     Delhi           7
c  Aadi   16  New York          11

Let’s see different ways to iterate over the rows of this dataframe,

Iterate over rows of a dataframe using DataFrame.iterrows()

The DataFrame class provides a member function iterrows() i.e.

DataFrame.iterrows()

It returns an iterator that can be used to iterate over all the rows of a dataframe. For each row it yields a tuple containing the index label and the row contents as a Series.

Let’s iterate over all the rows of the dataframe created above using iterrows() i.e.

# Yields a tuple of index label and series for each row in the dataframe
for (index_label, row_series) in empDfObj.iterrows():
   print('Row Index label : ', index_label)
   print('Row Content as Series : ', row_series.values)

Output:
Row Index label :  a
Row Content as Series :  ['jack' 34 'Sydney' 5]
Row Index label :  b
Row Content as Series :  ['Riti' 31 'Delhi' 7]
Row Index label :  c
Row Content as Series :  ['Aadi' 16 'New York' 11]
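
The loop above prints the whole row through row_series.values. Since each row arrives as a Series, a single value can also be picked by its column label. Here is a minimal sketch that rebuilds the same dataframe so that it runs on its own; the ages dictionary is only an illustrative name and not part of the original example.

```python
import pandas as pd

employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

# Collect the 'Age' value of every row, keyed by the row's index label
ages = {}
for index_label, row_series in empDfObj.iterrows():
    # row_series is a Series indexed by the column names
    ages[index_label] = row_series['Age']

print(ages)
```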

Important points about DataFrame.iterrows()

  • It does not preserve data types:
    • iterrows() returns the contents of each row as a Series, and a Series has a single dtype, so values from columns with different dtypes may be upcast (for example, an integer sitting next to a float column comes back as a float).
  • We should not modify the row returned while iterating with iterrows(). Depending on the data types, the iterator returns a copy rather than a view, so changes made to the returned row contents may have no effect on the actual dataframe.
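
Both points can be checked with a small experiment. The sketch below uses a hypothetical two-column frame (the names df, count and ratio are assumptions made for illustration, not part of the article's data) that mixes an integer column with a float column.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column
df = pd.DataFrame({'count': [1, 2], 'ratio': [0.5, 1.5]})

# dtypes of the columns in the dataframe itself
column_dtypes = df.dtypes.astype(str).to_dict()

# A row Series has one dtype, so the integer value is upcast to float
_, first_row = next(df.iterrows())
row_dtype = str(first_row.dtype)

# Writing to the row Series here does not reach the dataframe,
# because the mixed dtypes force each row to be a copy
for _, row in df.iterrows():
    row['count'] = 100

print(column_dtypes)
print(row_dtype)
print(df['count'].tolist())
```

If the original dtypes matter, itertuples() (discussed next) is usually the safer choice, as it reads the values column by column.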

Iterate over rows of a dataframe using DataFrame.itertuples()

The DataFrame class provides a member function itertuples() i.e.

DataFrame.itertuples()

For each row it yields a named tuple containing the index label and the value of every column for that row, with the column names as field names. Let’s use it to iterate over all the rows of the dataframe created above i.e.

# Iterate over the Dataframe rows as named tuples
for namedTuple in empDfObj.itertuples():
   # Print row contents inside the named tuple
   print(namedTuple)

Output:
Pandas(Index='a', Name='jack', Age=34, City='Sydney', Experience=5)
Pandas(Index='b', Name='Riti', Age=31, City='Delhi', Experience=7)
Pandas(Index='c', Name='Aadi', Age=16, City='New York', Experience=11)

For every row in the dataframe a named tuple is returned. From the named tuple you can access the individual values by indexing.
To access the 1st value, i.e. the value with the field name ‘Index’, use
namedTuple[0]

To access the 2nd value, i.e. the value with the field name ‘Name’, use
namedTuple[1]
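
Besides positional indexing, the fields of a named tuple can be read by attribute name, which tends to be easier to follow. Here is a small sketch on the same data; the summaries list is just an illustrative name.

```python
import pandas as pd

employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

summaries = []
for row in empDfObj.itertuples():
    # row[0] and row.Index both refer to the index label of the row
    same_label = (row[0] == row.Index)
    # Read the fields by attribute name instead of by position
    summaries.append(f'{row.Index}: {row.Name} ({row.City})')

print(summaries)
```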

Named Tuples without index

If we don’t want the index column to be included in these named tuples, we can pass the argument index=False i.e.

# Iterate over the Dataframe rows as named tuples without index
for namedTuple in empDfObj.itertuples(index=False):
   # Print row contents inside the named tuple
   print(namedTuple)

Output:
Pandas(Name='jack', Age=34, City='Sydney', Experience=5)
Pandas(Name='Riti', Age=31, City='Delhi', Experience=7)
Pandas(Name='Aadi', Age=16, City='New York', Experience=11)

Named Tuples with custom names

By default the named tuple returned has the name Pandas. We can provide a custom name too by passing the name argument i.e.

# Give Custom Name to the tuple while Iterating over the Dataframe rows
for row in empDfObj.itertuples(name='Employee'):
   # Print row contents inside the named tuple
   print(row)

Output:
Employee(Index='a', Name='jack', Age=34, City='Sydney', Experience=5)
Employee(Index='b', Name='Riti', Age=31, City='Delhi', Experience=7)
Employee(Index='c', Name='Aadi', Age=16, City='New York', Experience=11)
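
The name argument also accepts None, in which case itertuples() yields regular tuples instead of named tuples. Combined with index=False, this gives back just the row values. A short sketch, where the rows variable is only an illustrative name:

```python
import pandas as pd

employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

# name=None yields regular tuples; index=False leaves out the index label
rows = list(empDfObj.itertuples(index=False, name=None))

for row in rows:
    print(row)
```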

Iterate over rows in dataframe as dictionary

We can also iterate over the rows of the dataframe and convert each one to a dictionary, so that values can be accessed by column label, using the same itertuples() i.e.

# itertuples() yields a named tuple for each row
for row in empDfObj.itertuples(name='Employee'):
   # Convert named tuple to dictionary
   dictRow = row._asdict()
   # Print dictionary
   print(dictRow)
   # Access elements from dict i.e. row contents
   print(dictRow['Name'] , ' is from ' , dictRow['City'])

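Output (on Python 3.7 and earlier; from Python 3.8 onwards _asdict() returns a regular dict, so each row is printed as a plain dictionary instead of an OrderedDict):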
OrderedDict([('Index', 'a'), ('Name', 'jack'), ('Age', 34), ('City', 'Sydney'), ('Experience', 5)])
jack  is from  Sydney
OrderedDict([('Index', 'b'), ('Name', 'Riti'), ('Age', 31), ('City', 'Delhi'), ('Experience', 7)])
Riti  is from  Delhi
OrderedDict([('Index', 'c'), ('Name', 'Aadi'), ('Age', 16), ('City', 'New York'), ('Experience', 11)])
Aadi  is from  New York

Iterate over rows in dataframe using index position and iloc

We can calculate the number of rows in a dataframe, then loop from the 0th index position to the last row and access each row by index position using iloc[] i.e.

# Loop through rows of dataframe by index position i.e. from 0 to number of rows - 1
for i in range(0, empDfObj.shape[0]):
   # get row contents as series using iloc[] and index position of row
   rowSeries = empDfObj.iloc[i]
   # print row contents
   print(rowSeries.values)

Output:
['jack' 34 'Sydney' 5]
['Riti' 31 'Delhi' 7]
['Aadi' 16 'New York' 11]

Iterate over rows in dataframe in reverse using index position and iloc

Get the number of rows in a dataframe, then loop from the last index position to the 0th and access each row by index position using iloc[] i.e.

# Loop through rows of dataframe by index position in reverse i.e. from last row to row at 0th index.
for i in range(empDfObj.shape[0] - 1, -1, -1):
   # get row contents as series using iloc[] and index position of row
   rowSeries = empDfObj.iloc[i]
   # print row contents
   print(rowSeries.values)

Output:
['Aadi' 16 'New York' 11]
['Riti' 31 'Delhi' 7]
['jack' 34 'Sydney' 5]
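
The same reverse order can be written in a couple of other ways that some readers may find easier to follow than the three-argument range(). The sketch below shows two such variants; the variable names are illustrative only.

```python
import pandas as pd

employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

# Variant 1: reverse the positions with reversed() and keep using iloc[]
names_reversed = [empDfObj.iloc[i]['Name']
                  for i in reversed(range(len(empDfObj)))]

# Variant 2: reverse the rows with a slice, then iterate as usual
labels_reversed = [index_label
                   for index_label, _ in empDfObj.iloc[::-1].iterrows()]

print(names_reversed)
print(labels_reversed)
```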

Iterate over rows in dataframe using index labels and loc[]

As DataFrame.index returns a sequence of index labels, we can iterate over those labels and access each row by index label using loc[] i.e.

# loop through all the names in index label sequence of dataframe
for index in empDfObj.index:
   # For each index label, access the row contents as series
   rowSeries = empDfObj.loc[index]
   # print row contents
   print(rowSeries.values)

Output:
['jack' 34 'Sydney' 5]
['Riti' 31 'Delhi' 7]
['Aadi' 16 'New York' 11]
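
iloc[] works with integer positions and loc[] works with index labels, but for this dataframe both loops visit the same rows in the same order. The sketch below checks that pairing; the all_match flag is an illustrative name.

```python
import pandas as pd

employees = [('jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'New York', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

all_match = True
for position, label in enumerate(empDfObj.index):
    by_position = empDfObj.iloc[position]  # row selected by integer position
    by_label = empDfObj.loc[label]         # row selected by index label
    # Series.equals() compares the values of the two rows
    all_match = all_match and by_position.equals(by_label)

print(all_match)
```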

Update contents of a dataframe while iterating row by row

As DataFrame.iterrows() yields the row contents as a copy (depending on the data types), updating the returned series will generally have no effect on the actual dataframe. So, to update the contents of the dataframe, we iterate over its rows using iterrows() and then write to each row through the at[] accessor, using the row's index label and the column name.

Let’s see an example,

Suppose we have a dataframe i.e

# List of Tuples
salaries = [(11, 5, 70000, 1000) ,
           (12, 7, 72200, 1100) ,
           (13, 11, 84999, 1000)
           ]

# Create a DataFrame object
salaryDfObj = pd.DataFrame(salaries, columns=['ID', 'Experience' , 'Salary', 'Bonus'])

Contents of the created dataframe salaryDfObj are,
   ID  Experience  Salary  Bonus
0  11           5   70000   1000
1  12           7   72200   1100
2  13          11   84999   1000

Let’s update each value in column ‘Bonus’ by multiplying it by 2 while iterating over the dataframe row by row i.e.
# iterate over the dataframe row by row
for index_label, row_series in salaryDfObj.iterrows():
   # For each row, update the 'Bonus' value to its double
   salaryDfObj.at[index_label , 'Bonus'] = row_series['Bonus'] * 2

# Print the modified dataframe
print(salaryDfObj)

Output:
   ID  Experience  Salary  Bonus
0  11           5   70000   2000
1  12           7   72200   2200
2  13          11   84999   2000
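
For a simple arithmetic update like this one, the row-by-row loop is not strictly needed: the same result can be obtained by operating on the whole column at once, which is usually faster on large dataframes. The sketch below compares the two approaches on separate copies of the data; the names looped and vectorized are illustrative.

```python
import pandas as pd

salaries = [(11, 5, 70000, 1000),
            (12, 7, 72200, 1100),
            (13, 11, 84999, 1000)]
columns = ['ID', 'Experience', 'Salary', 'Bonus']

# Approach from the article: update row by row through at[]
looped = pd.DataFrame(salaries, columns=columns)
for index_label, row_series in looped.iterrows():
    looped.at[index_label, 'Bonus'] = row_series['Bonus'] * 2

# Alternative: multiply the whole column in one operation
vectorized = pd.DataFrame(salaries, columns=columns)
vectorized['Bonus'] = vectorized['Bonus'] * 2

print(looped['Bonus'].tolist())
print(vectorized['Bonus'].tolist())
```

The loop remains useful when each update depends on logic that is hard to express as a column operation.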

The complete example is as follows,
import pandas as pd

def main():

    # List of tuples, one per employee
    employees = [('jack', 34, 'Sydney', 5),
                 ('Riti', 31, 'Delhi', 7),
                 ('Aadi', 16, 'New York', 11)]

    # Create a DataFrame object
    empDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])
    print("Contents of the Dataframe : ")
    print(empDfObj)

    print('**** Iterate over rows in a dataframe using Dataframe.iterrows() ****')

    # Yields a tuple of index label and series for each row in the dataframe
    for (index_label, row_series) in empDfObj.iterrows():
       print('Row Index label : ', index_label)
       print('Row Content as Series : ', row_series.values)

    print('**** Iterate over rows in dataframe using Dataframe.itertuples() ****')

    print('All rows of Dataframe as named tuple :')

    # Iterate over the Dataframe rows as named tuples
    for namedTuple in empDfObj.itertuples():
       # Print row contents inside the named tuple
       print(namedTuple)


    print('All rows of Dataframe ( without index column values) as named tuple :')

    # Iterate over the Dataframe rows as named tuples without index
    for namedTuple in empDfObj.itertuples(index=False):
       # Print row contents inside the named tuple
       print(namedTuple)

    print('All rows of Dataframe as named tuple with custom name  "Employee" :')

    # Give Custom Name to the tuple while Iterating over the Dataframe rows
    for row in empDfObj.itertuples(name='Employee'):
       # Print row contents inside the named tuple
       print(row)

    print('**** Iterate over rows in dataframe as dictionary using Dataframe.itertuples() ****')

    # itertuples() yields a named tuple for each row
    for row in empDfObj.itertuples(name='Employee'):
       # Convert named tuple to dictionary
       dictRow = row._asdict()
       # Print dictionary
       print(dictRow)
       # Access elements from dict i.e. row contents
       print(dictRow['Name'] , ' is from ' , dictRow['City'])



    print('**** Iterate over rows in dataframe using index position ****')

    # Loop through rows of dataframe by index i.e. from 0 to number of rows
    for i in range(0, empDfObj.shape[0]):
       # get row contents as series using iloc[] and index position of row
       rowSeries = empDfObj.iloc[i]
       # print row contents
       print(rowSeries.values)

    print('**** Iterate over rows in dataframe in reverse order using index position ****')

    # Loop through rows of dataframe by index in reverse i.e. from last row to row at 0th index.
    for i in range(empDfObj.shape[0] - 1, -1, -1):
       # get row contents as series using iloc[] and index position of row
       rowSeries = empDfObj.iloc[i]
       # print row contents
       print(rowSeries.values)


    print('**** Iterate over rows in dataframe using index labels ****')

    # loop through all the names in index label sequence of dataframe
    for index in empDfObj.index:
       # For each index label, access the row contents as series
       rowSeries = empDfObj.loc[index]
       # print row contents
       print(rowSeries.values)

    print('**** Update contents a dataframe While iterating row by row ****')

    print('Create a New dataframe')

    # List of Tuples
    salaries = [(11, 5, 70000, 1000) ,
               (12, 7, 72200, 1100) ,
               (13, 11, 84999, 1000)
               ]

    # Create a DataFrame object
    salaryDfObj = pd.DataFrame(salaries, columns=['ID', 'Experience' , 'Salary', 'Bonus'])
    print("Contents of the Dataframe : ")
    print(salaryDfObj)

    print('Multiply values in Bonus column by 2 while iterating over the datafarme')

    # iterate over the dataframe row by row
    for index_label, row_series in salaryDfObj.iterrows():
       # For each row, update the 'Bonus' value to its double
       salaryDfObj.at[index_label , 'Bonus'] = row_series['Bonus'] * 2

    print("Contents of the Modified Dataframe : ")
    print(salaryDfObj)

if __name__ == '__main__':
    main()


Output:
Contents of the Dataframe : 
   Name  Age      City  Experience
a  jack   34    Sydney           5
b  Riti   31     Delhi           7
c  Aadi   16  New York          11
**** Iterate over rows in a dataframe using Dataframe.iterrows() ****
Row Index label :  a
Row Content as Series :  ['jack' 34 'Sydney' 5]
Row Index label :  b
Row Content as Series :  ['Riti' 31 'Delhi' 7]
Row Index label :  c
Row Content as Series :  ['Aadi' 16 'New York' 11]
**** Iterate over rows in dataframe using Dataframe.itertuples() ****
All rows of Dataframe as named tuple :
Pandas(Index='a', Name='jack', Age=34, City='Sydney', Experience=5)
Pandas(Index='b', Name='Riti', Age=31, City='Delhi', Experience=7)
Pandas(Index='c', Name='Aadi', Age=16, City='New York', Experience=11)
All rows of Dataframe ( without index column values) as named tuple :
Pandas(Name='jack', Age=34, City='Sydney', Experience=5)
Pandas(Name='Riti', Age=31, City='Delhi', Experience=7)
Pandas(Name='Aadi', Age=16, City='New York', Experience=11)
All rows of Dataframe as named tuple with custom name  "Employee" :
Employee(Index='a', Name='jack', Age=34, City='Sydney', Experience=5)
Employee(Index='b', Name='Riti', Age=31, City='Delhi', Experience=7)
Employee(Index='c', Name='Aadi', Age=16, City='New York', Experience=11)
**** Iterate over rows in dataframe as dictionary using Dataframe.itertuples() ****
OrderedDict([('Index', 'a'), ('Name', 'jack'), ('Age', 34), ('City', 'Sydney'), ('Experience', 5)])
jack  is from  Sydney
OrderedDict([('Index', 'b'), ('Name', 'Riti'), ('Age', 31), ('City', 'Delhi'), ('Experience', 7)])
Riti  is from  Delhi
OrderedDict([('Index', 'c'), ('Name', 'Aadi'), ('Age', 16), ('City', 'New York'), ('Experience', 11)])
Aadi  is from  New York
**** Iterate over rows in dataframe using index position ****
['jack' 34 'Sydney' 5]
['Riti' 31 'Delhi' 7]
['Aadi' 16 'New York' 11]
**** Iterate over rows in dataframe in reverse order using index position ****
['Aadi' 16 'New York' 11]
['Riti' 31 'Delhi' 7]
['jack' 34 'Sydney' 5]
**** Iterate over rows in dataframe using index labels ****
['jack' 34 'Sydney' 5]
['Riti' 31 'Delhi' 7]
['Aadi' 16 'New York' 11]
**** Update contents a dataframe While iterating row by row ****
Create a New dataframe
Contents of the Dataframe : 
   ID  Experience  Salary  Bonus
0  11           5   70000   1000
1  12           7   72200   1100
2  13          11   84999   1000
Multiply values in Bonus column by 2 while iterating over the datafarme
Contents of the Modified Dataframe : 
   ID  Experience  Salary  Bonus
0  11           5   70000   2000
1  12           7   72200   2200
2  13          11   84999   2000

 
