Pandas: Sort rows or columns in Dataframe based on values using Dataframe.sort_values()

In this article we will discuss how to sort rows in ascending and descending order based on values in a single or multiple columns . Also, how to sort columns based on values in rows using DataFrame.sort_values()

DataFrame.sort_values()

In Python’s Pandas library, Dataframe class provides a member function to sort the content of dataframe i.e.

DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

Arguments :

  • by : A string or list of strings basically either column names or index labels based on which sorting will be done.
  • axis : If axis is 0, then name or list of names in by argument will be considered as column names. Default is 0
    • If axis is 1, then name or list of names in by argument will be considered as row index labels
  • ascending : If True sort in ascending else sort in descending order. Default is True
  • inplace : If True, perform operation in-place in Dataframe
  • na_position : Decides the position of NaNs after sorting i.e. irst puts NaNs at the beginning, last puts NaNs at the end
    Default value is ‘first’

It returns a sorted dataframe object. Also, if inplace argument is not True then it will return a sorted copy of given dataframe, instead of modifying the original Dataframe. Whereas, if inplace argument is True then it will make the current dataframe sorted.

Let’s understand by some examples,

First of all create a Dataframe object i.e.

# List of Tuples
students = [ ('Jack', 34, 'Sydney') ,
             ('Riti', 41, 'Delhi' ) ,
             ('Aadi', 16, 'New York') ,
             ('Riti', 22, 'Delhi' ) ,
             ('Riti', 35, 'Delhi' ) ,
             ('Riti', 40, 'Mumbai' )
              ]

# Create a DataFrame object
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])

Contents of the created dataframe object dfObj are,

   Name  Marks      City
b  Jack     34    Sydney
a  Riti     41     Delhi
f  Aadi     16  New York
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai

Now let’s see how to sort this dataframe in different ways,

Sort Dataframe rows based on a single column

To sort all the rows in above datafarme based on a column ‘Name’, we are going to pass the column name in by argument i.e.

# Sort the rows of dataframe by column 'Name'
dfObj = dfObj.sort_values(by ='Name' )

print("Contents of Sorted Dataframe based on a single column 'Name' : ")
print(dfObj)

Output:

Contents of Sorted Dataframe based on a single column 'Name' : 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
a  Riti     41     Delhi
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai

As we can see in the output, it sorted all the rows in dataframe based on column ‘Name’. Now if you will observe in deep you will found the for 4 rows column name has same value. Now what if we want to sort in such a ways values are same one column then use 2nd column for sorting those rows ?
Let’s see how to do that ..

Sort Dataframe rows based on a multiple columns

To sort all the rows in above datafarme based on two column i.e. ‘Name’ & ‘Marks’, we are going to pass the column names as list in by argument i.e.

dfObj = dfObj.sort_values(by =['Name', 'Marks'])

print("Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : ")
print(dfObj)

Output:

Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi

Now it will first sort all the rows based on column ‘Name’. If some rows has same value in ‘Name’ column then it will sort those rows based on value in ‘Marks’ column.

Sort Dataframe rows based on columns in Descending Order

To sort all the rows in above datafarme based on columns in descending order pass argument ascending with value False along with by arguments i.e.

# Sort the rows of dataframe by column 'Name' in descending order
dfObj = dfObj.sort_values(by ='Name' , ascending=False)

print("Contents of Sorted Dataframe based on a column 'Name' in Descending Order : ")
print(dfObj)

Output:

Contents of Sorted Dataframe based on a column 'Name' in Descending Order : 
   Name  Marks      City
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi
b  Jack     34    Sydney
f  Aadi     16  New York

It will sort all the rows in Dataframe based on column ‘Name’ in descending orders.

Sort Dataframe rows based on a column in Place

To sort all the rows in above datafarme based on a single columns in place pass an extra argument inplace with value True along with other arguments i.e.

# Sort the rows of dataframe by column 'Name' inplace
dfObj.sort_values(by='Name' , inplace=True)

print("Contents of Sorted Dataframe based on a single column 'Name' inplace: ")
print(dfObj)

Output:

Contents of Sorted Dataframe based on a single column 'Name' inplace: 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi

It will sort all the rows in Dataframe based on column ‘Name’ in place.

Till now we sorted the dataframe rows based on columns what if we want to vice versa i.e.

Sort columns of a Dataframe based on a single or multiple rows

Suppose we a dataframe of numbers i.e.

# List of Tuples
matrix = [(222, 16, 23),
          (333, 31, 11),
          (444, 34, 11),
          ]

# Create a DataFrame object of 3X3 Matrix
dfObj = pd.DataFrame(matrix, index=list('abc'))

Contents of the dataframe are,

     0   1   2
a  222  16  23
b  333  31  11
c  444  34  11

Sort columns of a Dataframe based on a single row

To sort columns of this dataframe based on a single row pass the row index labels in by argument and axis=1 i.e.

dfObj = dfObj.sort_values(by ='b', axis=1)

print("Contents of Sorted Dataframe based on a single row index label 'b' ")
print(dfObj)

Output:

Contents of Sorted Dataframe based on a single row index label 'b' 
    2   1    0
a  23  16  222
b  11  31  333
c  11  34  444

So, all the columns in dataframe are sorted based on a single row with index label ‘b’.

Sort columns of a Dataframe in Descending Order based on a single row

To sort columns of this dataframe in descending order based on a single row pass argument ascending=False along with other arguments i.e.

# Sort columns of a dataframe in descending order based on a single row with index label 'b'
dfObj = dfObj.sort_values(by='b', axis=1, ascending=False)

print("Contents of Sorted Dataframe in descending order based on a single row index label 'b' ")
print(dfObj)

Output:

Contents of Sorted Dataframe in descending order based on a single row index label 'b' 
     0   1   2
a  222  16  23
b  333  31  11
c  444  34  11

Sort columns of a Dataframe based on a multiple rows

To sort the columns in dataframe are sorted based on multiple rows with index labels ‘b’ & ‘c’ pass the list in by argument and axis=1 i.e.

# Sort columns of a dataframe based on a multiple row with index labels 'b' & 'c'
dfObj = dfObj.sort_values(by =['b' , 'c' ], axis=1)

print("Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' ")
print(dfObj)

Output:

Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' 
    2   1    0
a  23  16  222
b  11  31  333
c  11  34  444

Complete example is as follows,

import pandas as pd

def main():

   # List of Tuples
   students = [ ('Jack', 34, 'Sydney') ,
                ('Riti', 41, 'Delhi' ) ,
                ('Aadi', 16, 'New York') ,
                ('Riti', 22, 'Delhi' ) ,
                ('Riti', 35, 'Delhi' ) ,
                ('Riti', 40, 'Mumbai' )
                 ]

   # Create a DataFrame object
   dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
   print("Original Dataframe : ")
   print(dfObj)

   print('**** Sort Dataframe rows based on a single column ****')

   # Sort the rows of dataframe by column 'Name'
   dfObj = dfObj.sort_values(by ='Name' )

   print("Contents of Sorted Dataframe based on a single column 'Name' : ")
   print(dfObj)

   print('**** Sort Dataframe rows based on a multiple columns ****')

   dfObj = dfObj.sort_values(by =['Name', 'Marks'])

   print("Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : ")
   print(dfObj)

   print('**** Sort Dataframe rows based on a single column in Descending Order ****')

   # Sort the rows of dataframe by column 'Name' in descending order
   dfObj = dfObj.sort_values(by ='Name' , ascending=False)

   print("Contents of Sorted Dataframe based on a column 'Name' in Descending Order : ")
   print(dfObj)

   print('**** Sort Dataframe rows based on a single column in place ****')

   # Sort the rows of dataframe by column 'Name' inplace
   dfObj.sort_values(by='Name' , inplace=True)

   print("Contents of Sorted Dataframe based on a single column 'Name' inplace: ")
   print(dfObj)

   print('******** Sort columns of Dataframe based on a single or multiple rows ********')

   # List of Tuples
   matrix = [(222, 16, 23),
             (333, 31, 11),
             (444, 34, 11),
             ]

   # Create a DataFrame object of 3X3 Matrix
   dfObj = pd.DataFrame(matrix, index=list('abc'))

   print("Original Dataframe: ")
   print(dfObj)

   # Sort columns of a dataframe based on a single row with index label 'b'
   dfObj = dfObj.sort_values(by ='b', axis=1)

   print("Contents of Sorted Dataframe based on a single row index label 'b' ")
   print(dfObj)

   # Sort columns of a dataframe in descending order based on a single row with index label 'b'
   dfObj = dfObj.sort_values(by='b', axis=1, ascending=False)

   print("Contents of Sorted Dataframe in descending order based on a single row index label 'b' ")
   print(dfObj)

   # Sort columns of a dataframe based on a multiple row with index labels 'b' & 'c'
   dfObj = dfObj.sort_values(by =['b' , 'c' ], axis=1)

   print("Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' ")
   print(dfObj)


if __name__ == '__main__':
   main()

Output:

Original Dataframe : 
   Name  Marks      City
b  Jack     34    Sydney
a  Riti     41     Delhi
f  Aadi     16  New York
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
**** Sort Dataframe rows based on a single column ****
Contents of Sorted Dataframe based on a single column 'Name' : 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
a  Riti     41     Delhi
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
**** Sort Dataframe rows based on a multiple columns ****
Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi
**** Sort Dataframe rows based on a single column in Descending Order ****
Contents of Sorted Dataframe based on a column 'Name' in Descending Order : 
   Name  Marks      City
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi
b  Jack     34    Sydney
f  Aadi     16  New York
**** Sort Dataframe rows based on a single column in place ****
Contents of Sorted Dataframe based on a single column 'Name' inplace: 
   Name  Marks      City
f  Aadi     16  New York
b  Jack     34    Sydney
e  Riti     22     Delhi
d  Riti     35     Delhi
c  Riti     40    Mumbai
a  Riti     41     Delhi
******** Sort columns of Dataframe based on a single or multiple rows ********
Original Dataframe: 
     0   1   2
a  222  16  23
b  333  31  11
c  444  34  11
Contents of Sorted Dataframe based on a single row index label 'b' 
    2   1    0
a  23  16  222
b  11  31  333
c  11  34  444
Contents of Sorted Dataframe in descending order based on a single row index label 'b' 
     0   1   2
a  222  16  23
b  333  31  11
c  444  34  11
Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' 
    2   1    0
a  23  16  222
b  11  31  333
c  11  34  444

 

 

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top