In this article we will discuss how to sort rows in ascending and descending order based on values in a single or multiple columns . Also, how to sort columns based on values in rows using DataFrame.sort_values()
DataFrame.sort_values()
In Python’s Pandas library, Dataframe class provides a member function to sort the content of dataframe i.e.
DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')
Arguments :
- by : A string or list of strings basically either column names or index labels based on which sorting will be done.
- axis : If axis is 0, then name or list of names in by argument will be considered as column names. Default is 0
- If axis is 1, then name or list of names in by argument will be considered as row index labels
- ascending : If True sort in ascending else sort in descending order. Default is True
- inplace : If True, perform operation in-place in Dataframe
- na_position : Decides the position of NaNs after sorting i.e. irst puts NaNs at the beginning, last puts NaNs at the end
Default value is ‘first’
It returns a sorted dataframe object. Also, if inplace argument is not True then it will return a sorted copy of given dataframe, instead of modifying the original Dataframe. Whereas, if inplace argument is True then it will make the current dataframe sorted.
Frequently Asked:
Let’s understand by some examples,
First of all create a Dataframe object i.e.
# List of Tuples students = [ ('Jack', 34, 'Sydney') , ('Riti', 41, 'Delhi' ) , ('Aadi', 16, 'New York') , ('Riti', 22, 'Delhi' ) , ('Riti', 35, 'Delhi' ) , ('Riti', 40, 'Mumbai' ) ] # Create a DataFrame object dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
Contents of the created dataframe object dfObj are,
Name Marks City b Jack 34 Sydney a Riti 41 Delhi f Aadi 16 New York e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai
Now let’s see how to sort this dataframe in different ways,
Latest Python - Video Tutorial
Sort Dataframe rows based on a single column
To sort all the rows in above datafarme based on a column ‘Name’, we are going to pass the column name in by argument i.e.
# Sort the rows of dataframe by column 'Name' dfObj = dfObj.sort_values(by ='Name' ) print("Contents of Sorted Dataframe based on a single column 'Name' : ") print(dfObj)
Output:
Contents of Sorted Dataframe based on a single column 'Name' : Name Marks City f Aadi 16 New York b Jack 34 Sydney a Riti 41 Delhi e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai
As we can see in the output, it sorted all the rows in dataframe based on column ‘Name’. Now if you will observe in deep you will found the for 4 rows column name has same value. Now what if we want to sort in such a ways values are same one column then use 2nd column for sorting those rows ?
Let’s see how to do that ..
Sort Dataframe rows based on a multiple columns
To sort all the rows in above datafarme based on two column i.e. ‘Name’ & ‘Marks’, we are going to pass the column names as list in by argument i.e.
dfObj = dfObj.sort_values(by =['Name', 'Marks']) print("Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : ") print(dfObj)
Output:
Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : Name Marks City f Aadi 16 New York b Jack 34 Sydney e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi
Now it will first sort all the rows based on column ‘Name’. If some rows has same value in ‘Name’ column then it will sort those rows based on value in ‘Marks’ column.
Sort Dataframe rows based on columns in Descending Order
To sort all the rows in above datafarme based on columns in descending order pass argument ascending with value False along with by arguments i.e.
# Sort the rows of dataframe by column 'Name' in descending order dfObj = dfObj.sort_values(by ='Name' , ascending=False) print("Contents of Sorted Dataframe based on a column 'Name' in Descending Order : ") print(dfObj)
Output:
Contents of Sorted Dataframe based on a column 'Name' in Descending Order : Name Marks City e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi b Jack 34 Sydney f Aadi 16 New York
It will sort all the rows in Dataframe based on column ‘Name’ in descending orders.
Sort Dataframe rows based on a column in Place
To sort all the rows in above datafarme based on a single columns in place pass an extra argument inplace with value True along with other arguments i.e.
# Sort the rows of dataframe by column 'Name' inplace dfObj.sort_values(by='Name' , inplace=True) print("Contents of Sorted Dataframe based on a single column 'Name' inplace: ") print(dfObj)
Output:
Contents of Sorted Dataframe based on a single column 'Name' inplace: Name Marks City f Aadi 16 New York b Jack 34 Sydney e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi
It will sort all the rows in Dataframe based on column ‘Name’ in place.
Till now we sorted the dataframe rows based on columns what if we want to vice versa i.e.
Sort columns of a Dataframe based on a single or multiple rows
Suppose we a dataframe of numbers i.e.
# List of Tuples matrix = [(222, 16, 23), (333, 31, 11), (444, 34, 11), ] # Create a DataFrame object of 3X3 Matrix dfObj = pd.DataFrame(matrix, index=list('abc'))
Contents of the dataframe are,
0 1 2 a 222 16 23 b 333 31 11 c 444 34 11
Sort columns of a Dataframe based on a single row
To sort columns of this dataframe based on a single row pass the row index labels in by argument and axis=1 i.e.
dfObj = dfObj.sort_values(by ='b', axis=1) print("Contents of Sorted Dataframe based on a single row index label 'b' ") print(dfObj)
Output:
Contents of Sorted Dataframe based on a single row index label 'b' 2 1 0 a 23 16 222 b 11 31 333 c 11 34 444
So, all the columns in dataframe are sorted based on a single row with index label ‘b’.
Sort columns of a Dataframe in Descending Order based on a single row
To sort columns of this dataframe in descending order based on a single row pass argument ascending=False along with other arguments i.e.
# Sort columns of a dataframe in descending order based on a single row with index label 'b' dfObj = dfObj.sort_values(by='b', axis=1, ascending=False) print("Contents of Sorted Dataframe in descending order based on a single row index label 'b' ") print(dfObj)
Output:
Contents of Sorted Dataframe in descending order based on a single row index label 'b' 0 1 2 a 222 16 23 b 333 31 11 c 444 34 11
Sort columns of a Dataframe based on a multiple rows
To sort the columns in dataframe are sorted based on multiple rows with index labels ‘b’ & ‘c’ pass the list in by argument and axis=1 i.e.
# Sort columns of a dataframe based on a multiple row with index labels 'b' & 'c' dfObj = dfObj.sort_values(by =['b' , 'c' ], axis=1) print("Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' ") print(dfObj)
Output:
Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' 2 1 0 a 23 16 222 b 11 31 333 c 11 34 444
Complete example is as follows,
import pandas as pd def main(): # List of Tuples students = [ ('Jack', 34, 'Sydney') , ('Riti', 41, 'Delhi' ) , ('Aadi', 16, 'New York') , ('Riti', 22, 'Delhi' ) , ('Riti', 35, 'Delhi' ) , ('Riti', 40, 'Mumbai' ) ] # Create a DataFrame object dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c']) print("Original Dataframe : ") print(dfObj) print('**** Sort Dataframe rows based on a single column ****') # Sort the rows of dataframe by column 'Name' dfObj = dfObj.sort_values(by ='Name' ) print("Contents of Sorted Dataframe based on a single column 'Name' : ") print(dfObj) print('**** Sort Dataframe rows based on a multiple columns ****') dfObj = dfObj.sort_values(by =['Name', 'Marks']) print("Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : ") print(dfObj) print('**** Sort Dataframe rows based on a single column in Descending Order ****') # Sort the rows of dataframe by column 'Name' in descending order dfObj = dfObj.sort_values(by ='Name' , ascending=False) print("Contents of Sorted Dataframe based on a column 'Name' in Descending Order : ") print(dfObj) print('**** Sort Dataframe rows based on a single column in place ****') # Sort the rows of dataframe by column 'Name' inplace dfObj.sort_values(by='Name' , inplace=True) print("Contents of Sorted Dataframe based on a single column 'Name' inplace: ") print(dfObj) print('******** Sort columns of Dataframe based on a single or multiple rows ********') # List of Tuples matrix = [(222, 16, 23), (333, 31, 11), (444, 34, 11), ] # Create a DataFrame object of 3X3 Matrix dfObj = pd.DataFrame(matrix, index=list('abc')) print("Original Dataframe: ") print(dfObj) # Sort columns of a dataframe based on a single row with index label 'b' dfObj = dfObj.sort_values(by ='b', axis=1) print("Contents of Sorted Dataframe based on a single row index label 'b' ") print(dfObj) # Sort columns of a dataframe in descending order based on a single row with index label 'b' dfObj = dfObj.sort_values(by='b', axis=1, ascending=False) print("Contents of Sorted Dataframe in descending order based on a single row index label 'b' ") print(dfObj) # Sort columns of a dataframe based on a multiple row with index labels 'b' & 'c' dfObj = dfObj.sort_values(by =['b' , 'c' ], axis=1) print("Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' ") print(dfObj) if __name__ == '__main__': main()
Output:
Original Dataframe : Name Marks City b Jack 34 Sydney a Riti 41 Delhi f Aadi 16 New York e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai **** Sort Dataframe rows based on a single column **** Contents of Sorted Dataframe based on a single column 'Name' : Name Marks City f Aadi 16 New York b Jack 34 Sydney a Riti 41 Delhi e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai **** Sort Dataframe rows based on a multiple columns **** Contents of a Sorted Dataframe based on multiple columns 'Name' & 'Marks' : Name Marks City f Aadi 16 New York b Jack 34 Sydney e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi **** Sort Dataframe rows based on a single column in Descending Order **** Contents of Sorted Dataframe based on a column 'Name' in Descending Order : Name Marks City e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi b Jack 34 Sydney f Aadi 16 New York **** Sort Dataframe rows based on a single column in place **** Contents of Sorted Dataframe based on a single column 'Name' inplace: Name Marks City f Aadi 16 New York b Jack 34 Sydney e Riti 22 Delhi d Riti 35 Delhi c Riti 40 Mumbai a Riti 41 Delhi ******** Sort columns of Dataframe based on a single or multiple rows ******** Original Dataframe: 0 1 2 a 222 16 23 b 333 31 11 c 444 34 11 Contents of Sorted Dataframe based on a single row index label 'b' 2 1 0 a 23 16 222 b 11 31 333 c 11 34 444 Contents of Sorted Dataframe in descending order based on a single row index label 'b' 0 1 2 a 222 16 23 b 333 31 11 c 444 34 11 Contents of Sorted Dataframe based on multiple rows index label 'b' & 'c' 2 1 0 a 23 16 222 b 11 31 333 c 11 34 444
Latest Video Tutorials