Pandas: Find maximum values & position in columns or rows of a Dataframe

In this article we will discuss how to find the maximum value in the rows & columns of a Dataframe, and also its index position.

DataFrame.max()

Python’s Pandas library provides a member function in Dataframe to find the maximum value along an axis i.e.

DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Important Arguments:

  • axis : Axis along which the maximum elements will be searched. Use 0 (along the index) to get the maximum of each column, and 1 (along the columns) to get the maximum of each row.
  • skipna : (bool) Whether NaN or NULL values should be skipped. Default is True i.e. if not provided, they will be skipped.

It returns a series containing the maximum value along the given axis i.e. one value per column or one value per row.

Let’s use this to find the maximum value among rows and columns,

Suppose we have a Dataframe i.e.

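import numpy as np
import pandas as pd

# List of Tuples (np.nan marks a missing value)
matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)
          ]

# Create a DataFrame object with row labels 'a'..'e' and columns 'x', 'y', 'z'
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))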

Contents of the dataframe object dfObj are,

    x     y     z
a  22  16.0  23.0
b  33   NaN  11.0
c  44  34.0  11.0
d  55  35.0   NaN
e  66  36.0  13.0

Get maximum values in every row & column of the Dataframe

Get maximum values of every column

To find the maximum value of every column in a DataFrame, just call the max() member function on the DataFrame object without any argument i.e.

# Get a series containing maximum value of each column
maxValuesObj = dfObj.max()

print('Maximum value in each column : ')
print(maxValuesObj)

Output:

Maximum value in each column : 
x    66.0
y    36.0
z    23.0
dtype: float64

It returned a series with the column names as index labels and the maximum value of each column as values. Similarly, we can find the maximum value in every row too,

Get maximum values of every row

To find the maximum value of every row in a DataFrame, just call the max() member function on the DataFrame object with the argument axis=1 i.e.

# Get a series containing maximum value of each row
maxValuesObj = dfObj.max(axis=1)

print('Maximum value in each row : ')
print(maxValuesObj)

Output:

Maximum value in each row : 
a    23.0
b    33.0
c    44.0
d    55.0
e    66.0
dtype: float64

It returned a series with the row index labels as index and the maximum value of each row as values.

As we can see, it skipped the NaN values while finding the maximum. We can include the NaN values too if we want i.e.

Get maximum values of every column without skipping NaN

# Get a series containing maximum value of each column without skipping NaN
maxValuesObj = dfObj.max(skipna=False)

print('Maximum value in each column including NaN: ')
print(maxValuesObj)

Output:

Maximum value in each column including NaN: 
x    66.0
y     NaN
z     NaN
dtype: float64

As we passed skipna=False to the max() function, NaN values were not skipped while searching for the maximum. As a result, if a column contains any NaN, the maximum of that column is reported as NaN.
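To see why only some columns come back as NaN, it can help to compare the result with a per-column check for missing values. The following is a minimal sketch that rebuilds the same Dataframe; the variable names hasNaN and maxWithNaN are only illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)]
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# True for every column that holds at least one NaN
hasNaN = dfObj.isna().any()

# Column maxima with NaN values not skipped
maxWithNaN = dfObj.max(skipna=False)

print(hasNaN)      # x -> False, y -> True, z -> True
print(maxWithNaN)  # NaN exactly for the columns flagged above
```

Only column 'x' has no missing value, so it is the only column that still reports a numeric maximum.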

Get maximum values of a single column or selected columns

To get the maximum value of a single column, select that column from the dataframe and call the max() function on it i.e.

# Get maximum value of a single column 'y'
maxValue = dfObj['y'].max()

print("Maximum value in column 'y': " , maxValue)

Output:

Maximum value in column 'y':  36.0

There is another way too i.e.

# Get maximum value of a single column 'y'
maxValue = dfObj.max()['y']

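It will give the same result, but it first computes the maximum of every column and only then picks the value for 'y', so selecting the column before calling max() does less work.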

Instead of a single column name, we can pass a list of column names to get the maximum values of those columns only i.e.

# Get maximum value of the selected columns 'y' and 'z'
maxValue = dfObj[['y', 'z']].max()

print("Maximum value in column 'y' & 'z': ")
print(maxValue)

Output:

Maximum value in column 'y' & 'z': 
y    36.0
z    23.0
dtype: float64
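If a single maximum over several columns (or over the whole Dataframe) is needed, one option is to reduce the resulting series once more. This is a small sketch on the same data; the names overallMax and yzMax are illustrative only.

```python
import numpy as np
import pandas as pd

matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)]
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# Maximum of the per-column maxima, i.e. the largest value in the whole Dataframe
overallMax = dfObj.max().max()

# Same idea restricted to the columns 'y' and 'z'
yzMax = dfObj[['y', 'z']].max().max()

print(overallMax)  # 66.0
print(yzMax)       # 36.0
```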

Get row index label or position of maximum values of every column

DataFrame.idxmax()

We got the maximum value of each column or row, but what if we also want to know the index position in every column or row where this maximum value exists? To get the index of the maximum value in rows and columns, the pandas library provides a function i.e.

DataFrame.idxmax(axis=0, skipna=True)

Based on the value provided in axis, it returns the index label of the first occurrence of the maximum value along rows or columns.
Let’s see how to use it.

Get row index label of Maximum value in every column

# get the row index label of max values in every column
maxValueIndexObj = dfObj.idxmax()

print("Max values of columns are at row index position :")
print(maxValueIndexObj)

Output:

Max values of columns are at row index position :
x    e
y    e
z    a
dtype: object

It’s a series with the column names as index and, as values, the row index labels where the maximum value exists in each column.
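idxmax() returns row labels, not integer positions. If the zero-based position is needed, one way is to translate the labels with Index.get_indexer(). This is a minimal sketch that assumes the row index has unique labels; the names maxLabels and maxPositions are illustrative.

```python
import numpy as np
import pandas as pd

matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)]
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# Row label of the maximum in every column: x -> e, y -> e, z -> a
maxLabels = dfObj.idxmax()

# Translate each label to its zero-based position in the row index
# (assumes the index labels are unique)
maxPositions = pd.Series(dfObj.index.get_indexer(maxLabels.to_numpy()),
                         index=maxLabels.index)

print(maxPositions)  # x -> 4, y -> 4, z -> 0
```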

Get Column names of Maximum value in every row

# get the column name of max values in every row
maxValueIndexObj = dfObj.idxmax(axis=1)

print("Max values of row are at following columns :")
print(maxValueIndexObj)

Output:

Max values of row are at following columns :
a    z
b    x
c    x
d    x
e    x
dtype: object

It’s a series with the row index labels as index and, as values, the column names where the maximum value exists in each row.
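The value and its location are often wanted together. As a sketch, the two series can be combined into one small Dataframe; the name rowSummary and the column names 'maxValue' and 'maxColumn' are made up for this illustration.

```python
import numpy as np
import pandas as pd

matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)]
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

# One row per original row: its maximum value and the column holding it
rowSummary = pd.DataFrame({'maxValue': dfObj.max(axis=1),
                           'maxColumn': dfObj.idxmax(axis=1)})

print(rowSummary)
```

Both series share the same row index, so they line up label by label without any extra alignment step.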

The complete example is as follows,

import pandas as pd
import numpy as np

def main():

   # List of Tuples
   matrix = [(22, 16, 23),
             (33, np.nan, 11),
             (44, 34, 11),
             (55, 35, np.nan),
             (66, 36, 13)
             ]

   # Create a DataFrame object
   dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))

   print('Original Dataframe Contents :')
   print(dfObj)

   print('***** Get Maximum value in every column ***** ')

   # Get a series containing maximum value of each column
   maxValuesObj = dfObj.max()

   print('Maximum value in each column : ')
   print(maxValuesObj)

   print('***** Get Maximum value in every row ***** ')

   # Get a series containing maximum value of each row
   maxValuesObj = dfObj.max(axis=1)

   print('Maximum value in each row : ')
   print(maxValuesObj)


   print('***** Get Maximum value in every column without skipping NaN ***** ')

   # Get a series containing maximum value of each column without skipping NaN
   maxValuesObj = dfObj.max(skipna=False)

   print('Maximum value in each column including NaN: ')
   print(maxValuesObj)

   print('***** Get Maximum value in a single column ***** ')

   # Get maximum value of a single column 'y'
   maxValue = dfObj['y'].max()

   print("Maximum value in column 'y': " , maxValue)

   # Alternative: compute the maximum of every column, then pick 'y'
   maxValue = dfObj.max()['y']

   print("Maximum value in column 'y': " , maxValue)

   print('***** Get Maximum value in certain columns only ***** ')

   # Get maximum value of the selected columns 'y' and 'z'
   maxValue = dfObj[['y', 'z']].max()

   print("Maximum value in column 'y' & 'z': ")
   print(maxValue)


   print('***** Get row index label of Maximum value in every column *****')

   # get the row index label of max values in every column
   maxValueIndexObj = dfObj.idxmax()

   print("Max values of columns are at row index position :")
   print(maxValueIndexObj)


   print('***** Get Column name of Maximum value in every row *****')

   # get the column name of max values in every row
   maxValueIndexObj = dfObj.idxmax(axis=1)

   print("Max values of row are at following columns :")
   print(maxValueIndexObj)



if __name__ == '__main__':
   main()

Output:

Original Dataframe Contents :
    x     y     z
a  22  16.0  23.0
b  33   NaN  11.0
c  44  34.0  11.0
d  55  35.0   NaN
e  66  36.0  13.0
***** Get Maximum value in every column ***** 
Maximum value in each column : 
x    66.0
y    36.0
z    23.0
dtype: float64
***** Get Maximum value in every row ***** 
Maximum value in each row : 
a    23.0
b    33.0
c    44.0
d    55.0
e    66.0
dtype: float64
***** Get Maximum value in every column without skipping NaN ***** 
Maximum value in each column including NaN: 
x    66.0
y     NaN
z     NaN
dtype: float64
***** Get Maximum value in a single column ***** 
Maximum value in column 'y':  36.0
Maximum value in column 'y':  36.0
***** Get Maximum value in certain columns only ***** 
Maximum value in column 'y' & 'z': 
y    36.0
z    23.0
dtype: float64
***** Get row index label of Maximum value in every column *****
Max values of columns are at row index position :
x    e
y    e
z    a
dtype: object
***** Get Column name of Maximum value in every row *****
Max values of row are at following columns :
a    z
b    x
c    x
d    x
e    x
dtype: object

 

 
