Find max column value & return corresponding rows in Pandas

In this article, we will discuss different ways to find the maximum value of a column and return its corresponding rows in a Pandas DataFrame.

Preparing dataset

To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some sample employee data.

import pandas as pd

# List of Tuples
employees= [('Shubham', 'India', 'Tech',   5, 4),
            ('Riti', 'India', 'Design' ,   7, 7),
            ('Shanky', 'India', 'PMO' ,   2, 2),
            ('Shreya', 'India', 'Design' ,   2, 0),
            ('Aadi', 'US', 'PMO', 11, 5),
            ('Sim', 'US', 'Tech', 4, 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience', 'RelevantExperience'],
                  index = ['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

The contents of the created DataFrame are:

      Name Location    Team  Experience  RelevantExperience
A  Shubham    India    Tech           5                   4
B     Riti    India  Design           7                   7
C   Shanky    India     PMO           2                   2
D   Shreya    India  Design           2                   0
E     Aadi       US     PMO          11                   5
F      Sim       US    Tech           4                   4

Method 1: Using nlargest method

The nlargest() method returns the n rows with the largest values in a particular column, sorted in descending order. For example, suppose we need to extract the row with the highest value in the “Experience” column.

# get row with highest experience
print(df.nlargest(1, 'Experience'))

Output

   Name Location Team  Experience  RelevantExperience
E  Aadi       US  PMO          11                   5

As observed, it found the maximum value of the “Experience” column and returned the corresponding row as a one-row DataFrame.
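If several rows share the maximum value, nlargest(1, ...) returns only the first of them by default. The sketch below uses a small hypothetical DataFrame (df_ties is not part of the dataset above) to show how the keep parameter changes this behaviour.

```python
import pandas as pd

# Hypothetical data with a tie for the highest Experience (not the article's dataset)
df_ties = pd.DataFrame(
    {'Name': ['Aadi', 'Riti', 'Sim'],
     'Experience': [11, 7, 11]},
    index=['A', 'B', 'C'])

# Default keep='first': only the first row holding the maximum is returned
first_only = df_ties.nlargest(1, 'Experience')
print(first_only)

# keep='all': every row tied for the top value is returned
all_ties = df_ties.nlargest(1, 'Experience', keep='all')
print(all_ties)
```

With keep='all', the result can contain more than n rows, so it is worth checking its length if exactly one row is expected.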

Method 2: Using max() method with boolean indexing

In this approach, we first find the maximum value of the “Experience” column, and then use boolean indexing to select the rows where the column equals that value. Let’s look at it step by step.

# get the maximum value
print(df['Experience'].max())

Output

11

We now have the maximum value of the “Experience” column. Next, let’s filter the rows containing this value.

# get row with maximum value
print(df[df['Experience'] == df['Experience'].max()])

Output

   Name Location Team  Experience  RelevantExperience
E  Aadi       US  PMO          11                   5
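Because the comparison builds a True/False mask over the whole column, this approach returns every row that matches the maximum, not just the first. A minimal sketch, assuming a small hypothetical DataFrame (df_ties, max_exp and mask are illustrative names, not part of the example above):

```python
import pandas as pd

# Hypothetical data (not the article's dataset): two employees share the top value
df_ties = pd.DataFrame(
    {'Name': ['Aadi', 'Riti', 'Sim'],
     'Experience': [11, 7, 11]},
    index=['A', 'B', 'C'])

# Step 1: compute the maximum once and keep it in a variable
max_exp = df_ties['Experience'].max()

# Step 2: build a boolean mask, True where the column equals the maximum
mask = df_ties['Experience'] == max_exp
print(mask)

# Step 3: select the rows where the mask is True
rows = df_ties[mask]
print(rows)
```

Storing the maximum in a variable also avoids computing it twice, which the one-line version does.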

Method 3: Using argmax method

Another method is to use the argmax() function, which returns the integer position (not the index label) of the first occurrence of the maximum value. Once we have the position, we can use iloc to get the row at that position. Note that the result is a Series representing a single row, not a DataFrame.

# get row with highest experience
print(df.iloc[df['Experience'].argmax()])

Output

Name                  Aadi
Location                US
Team                   PMO
Experience              11
RelevantExperience       5
Name: E, dtype: object
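To make the difference between position and label concrete, the sketch below rebuilds a two-column subset of the sample data (only Name and Experience are kept here for brevity, which is an assumption of this example) and prints the intermediate value returned by argmax().

```python
import pandas as pd

# Two-column subset of the sample data, with the same index labels
df = pd.DataFrame(
    {'Name': ['Shubham', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim'],
     'Experience': [5, 7, 2, 2, 11, 4]},
    index=['A', 'B', 'C', 'D', 'E', 'F'])

# argmax() gives the zero-based position of the maximum, not its label
pos = df['Experience'].argmax()
print(pos)       # 4

# iloc selects by position; the row's label is available as .name
row = df.iloc[pos]
print(row.name)  # E
```

The position 4 corresponds to the fifth row, whose index label is E.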

Method 4: Using idxmax method

idxmax() is similar to argmax(), but instead of returning the integer position, it returns the index label of the first occurrence of the maximum value. Therefore, we use loc here to get the corresponding row.

# get row with highest experience
print(df.loc[df['Experience'].idxmax()])

Output

Name                  Aadi
Location                US
Team                   PMO
Experience              11
RelevantExperience       5
Name: E, dtype: object

As observed, the output is the same as in Method 3.
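Passing a single label to loc returns the row as a Series. If a one-row DataFrame is preferred, matching the shape returned by Methods 1 and 2, one option is to wrap the label in a list. A small sketch, again assuming a two-column subset of the sample data:

```python
import pandas as pd

# Two-column subset of the sample data, with the same index labels
df = pd.DataFrame(
    {'Name': ['Shubham', 'Riti', 'Shanky', 'Shreya', 'Aadi', 'Sim'],
     'Experience': [5, 7, 2, 2, 11, 4]},
    index=['A', 'B', 'C', 'D', 'E', 'F'])

# idxmax() gives the index label of the maximum
label = df['Experience'].idxmax()
print(label)  # E

# A single label returns a Series
row_series = df.loc[label]
print(type(row_series).__name__)

# A list containing the label returns a one-row DataFrame
row_frame = df.loc[[label]]
print(row_frame)
```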

The complete example is as follows:

import pandas as pd

# List of Tuples
employees= [('Shubham', 'India', 'Tech',   5, 4),
            ('Riti', 'India', 'Design' ,   7, 7),
            ('Shanky', 'India', 'PMO' ,   2, 2),
            ('Shreya', 'India', 'Design' ,   2, 0),
            ('Aadi', 'US', 'PMO', 11, 5),
            ('Sim', 'US', 'Tech', 4, 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Location', 'Team', 'Experience', 'RelevantExperience'],
                  index = ['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

# Method 1: get the row with the highest experience using nlargest()
print(df.nlargest(1, 'Experience'))

# Method 2: get the maximum value
print(df['Experience'].max())

# Method 2: get the rows equal to the maximum value using boolean indexing
print(df[df['Experience'] == df['Experience'].max()])

# Method 3: get the row at the integer position returned by argmax()
print(df.iloc[df['Experience'].argmax()])

# Method 4: get the row with the index label returned by idxmax()
print(df.loc[df['Experience'].idxmax()])

Summary

In this article, we discussed four ways to find the maximum value of a column and return the corresponding rows in Pandas: nlargest(), max() with boolean indexing, argmax() with iloc, and idxmax() with loc.
