In this article we will discuss how to sort the contents of a DataFrame based on its column names or row index labels using DataFrame.sort_index().
Dataframe.sort_index()
In Python’s Pandas library, the DataFrame class provides a member function sort_index() to sort a DataFrame by its labels along a given axis i.e.
DataFrame.sort_index(*, axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)
Important arguments are,
- axis : If 0, the DataFrame is sorted by row index labels; if 1, it is sorted by column names. Default is 0.
- ascending : If True, sort in ascending order; otherwise sort in descending order. Default is True.
- inplace : If True, sort the DataFrame in place instead of returning a sorted copy. Default is False.
- na_position : Position of NaN labels after sorting. 'first' puts NaNs at the beginning, 'last' puts NaNs at the end. Default is 'last'.
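The effect of na_position is easiest to see on an index that actually contains a missing label. The following is a minimal sketch; the small frame `df` and its float index with one NaN are made up for illustration and are not part of the student data used in the rest of this article.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose row index contains one missing label (NaN)
df = pd.DataFrame({'Marks': [34, 31, 16]}, index=[2.0, np.nan, 1.0])

# Default behaviour: na_position='last' keeps the NaN label at the bottom
nan_last = df.sort_index()

# na_position='first' moves the NaN label to the top
nan_first = df.sort_index(na_position='first')

print(nan_last)
print(nan_first)
```

In both calls the non-missing labels 1.0 and 2.0 are sorted in ascending order; only the position of the row labelled NaN changes.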
If the inplace argument is False (the default), sort_index() returns a sorted copy of the DataFrame and leaves the original unmodified. If inplace is True, it sorts the DataFrame itself and returns None.
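The difference between the two modes can be checked directly. This is a small sketch under assumed names: `df`, `sorted_copy`, `original_order` and `result` are illustrative variables, not part of the examples that follow.

```python
import pandas as pd

# Hypothetical frame with an unsorted row index
df = pd.DataFrame({'Marks': [34, 31, 16]}, index=['b', 'a', 'c'])

# Default (inplace=False): a new sorted DataFrame is returned
sorted_copy = df.sort_index()
original_order = list(df.index)   # df itself still has its original order

# inplace=True: df is reordered and the call returns None
result = df.sort_index(inplace=True)

print(original_order)
print(list(sorted_copy.index))
print(result)
print(list(df.index))
```

Because the in-place call returns None, writing `df = df.sort_index(inplace=True)` would replace the DataFrame with None; use either the returned copy or inplace=True, not both.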
Let’s understand this with some examples,
First of all, create a DataFrame object i.e.
import pandas as pd

# List of Tuples
students = [('Jack', 34, 'Sydney'),
            ('Riti', 31, 'Delhi'),
            ('Aadi', 16, 'New York'),
            ('Riti', 32, 'Delhi'),
            ('Riti', 33, 'Delhi'),
            ('Riti', 35, 'Mumbai')]

# Create a DataFrame object
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
Contents of the created dataframe are,
   Name  Marks      City
b  Jack     34    Sydney
a  Riti     31     Delhi
f  Aadi     16  New York
e  Riti     32     Delhi
d  Riti     33     Delhi
c  Riti     35    Mumbai
Now let’s see how to sort this DataFrame based on labels i.e. either column or row index labels,
Sort rows of a Dataframe based on Row index labels
To sort based on row index labels we can call sort_index() on the dataframe object
# sort the rows of dataframe based on row index label names
modDFObj = dfObj.sort_index()

print('Contents of Dataframe sorted based on Row Index Labels are :')
print(modDFObj)
Output:
Contents of Dataframe sorted based on Row Index Labels are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
As we can see in the output rows are sorted based on the index labels now. Instead of modifying the original dataframe it returned a sorted copy of dataframe.
Sort rows of a Dataframe in Descending Order based on Row index labels
To sort based on row index labels in descending order we need to pass argument ascending=False in sort_index() function on the dataframe object,
# sort the rows of dataframe in descending order based on row index label names
modDFObj = dfObj.sort_index(ascending=False)

print('Contents of Dataframe sorted in Descending Order based on Row Index Labels are :')
print(modDFObj)
Output:
Contents of Dataframe sorted in Descending Order based on Row Index Labels are :
   Name  Marks      City
f  Aadi     16  New York
e  Riti     32     Delhi
d  Riti     33     Delhi
c  Riti     35    Mumbai
b  Jack     34    Sydney
a  Riti     31     Delhi
As we can see in the output, the rows are now sorted in descending order based on the index labels. Also, instead of modifying the original dataframe, it returned a sorted copy of the dataframe.
Sort rows of a Dataframe based on Row index labels in Place
To sort a dataframe by row index labels in place, instead of getting a sorted copy, pass the argument inplace=True to sort_index() i.e.
# sort the rows of dataframe in Place based on row index label names
dfObj.sort_index(inplace=True)

print('Contents of Dataframe sorted in Place based on Row Index Labels are :')
print(dfObj)
Output:
Contents of Dataframe sorted in Place based on Row Index Labels are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
As we can see in the output rows of the dataframe are sorted in place.
Sort Columns of a Dataframe based on Column Names
To sort a DataFrame based on column names we can call sort_index() on the DataFrame object with argument axis=1 i.e.
# sort a dataframe based on column names
modDfObj = dfObj.sort_index(axis=1)

print('Contents of Dataframe sorted based on Column Names are :')
print(modDfObj)
Output:
Contents of Dataframe sorted based on Column Names are :
       City  Marks  Name
a     Delhi     31  Riti
b    Sydney     34  Jack
c    Mumbai     35  Riti
d     Delhi     33  Riti
e     Delhi     32  Riti
f  New York     16  Aadi
As we can see, instead of modifying the original dataframe it returned a sorted copy of dataframe based on column names.
Sort Columns of a Dataframe in Descending Order based on Column Names
To sort a DataFrame based on column names in descending Order, we can call sort_index() on the DataFrame object with argument axis=1 and ascending=False i.e.
# sort a dataframe in descending order based on column names
modDfObj = dfObj.sort_index(ascending=False, axis=1)

print('Contents of Dataframe sorted in Descending Order based on Column Names are :')
print(modDfObj)
Output:
Contents of Dataframe sorted in Descending Order based on Column Names are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
Instead of modifying the original dataframe, it returned a copy whose columns are sorted by name in descending order (Name, Marks, City).
Sort Columns of a Dataframe in Place based on Column Names
To sort a dataframe by column names in place, instead of getting a sorted copy, pass the arguments inplace=True and axis=1 to sort_index() i.e.
# sort a dataframe in place based on column names
dfObj.sort_index(inplace=True, axis=1)

print('Contents of Dataframe sorted in Place based on Column Names are :')
print(dfObj)
Output:
Contents of Dataframe sorted in Place based on Column Names are :
       City  Marks  Name
a     Delhi     31  Riti
b    Sydney     34  Jack
c    Mumbai     35  Riti
d     Delhi     33  Riti
e     Delhi     32  Riti
f  New York     16  Aadi
As we can see in the output, the columns of the dataframe are sorted in place.
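Row and column sorting can also be combined by chaining two sort_index() calls, one per axis. The snippet below is a minimal sketch of that idea; the frame `df` is a made-up three-row subset of the student data and `fully_sorted` is an illustrative name.

```python
import pandas as pd

# Hypothetical three-row subset of the student data, with unsorted labels
df = pd.DataFrame(
    [('Jack', 34, 'Sydney'), ('Riti', 31, 'Delhi'), ('Aadi', 16, 'New York')],
    columns=['Name', 'Marks', 'City'],
    index=['b', 'a', 'c'],
)

# First sort the rows by index label, then sort the columns by name.
# Each call returns a new DataFrame, so df itself is left unchanged.
fully_sorted = df.sort_index().sort_index(axis=1)

print(fully_sorted)
```

Since neither call uses inplace=True, the original `df` keeps its row and column order.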
For sorting based on contents of a Dataframe look at the following article,
Pandas: Sort rows or columns in Dataframe based on values using Dataframe.sort_values()
Complete example is as follows,
import pandas as pd

def main():

    # List of Tuples
    students = [('Jack', 34, 'Sydney'),
                ('Riti', 31, 'Delhi'),
                ('Aadi', 16, 'New York'),
                ('Riti', 32, 'Delhi'),
                ('Riti', 33, 'Delhi'),
                ('Riti', 35, 'Mumbai')]

    # Create a DataFrame object
    dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])

    print("Original Dataframe : ")
    print(dfObj)

    print('***** Sort rows of a Dataframe based on Row index labels ***** ')

    # sort the rows of dataframe based on row index label names
    modDFObj = dfObj.sort_index()

    print('Contents of Dataframe sorted based on Row Index Labels are :')
    print(modDFObj)

    print('***** Sort rows of a Dataframe in Descending Order based on Row index labels ***** ')

    # sort the rows of dataframe in descending order based on row index label names
    modDFObj = dfObj.sort_index(ascending=False)

    print('Contents of Dataframe sorted in Descending Order based on Row Index Labels are :')
    print(modDFObj)

    print('***** Sort rows of a Dataframe based on Row index labels in Place ***** ')

    # sort the rows of dataframe in Place based on row index label names
    dfObj.sort_index(inplace=True)

    print('Contents of Dataframe sorted in Place based on Row Index Labels are :')
    print(dfObj)

    print('***** Sort a Dataframe based on Column Names ***** ')

    # sort a dataframe based on column names
    modDfObj = dfObj.sort_index(axis=1)

    print('Contents of Dataframe sorted based on Column Names are :')
    print(modDfObj)

    print('***** Sort a Dataframe in Descending Order based on Column Names ***** ')

    # sort a dataframe in descending order based on column names
    modDfObj = dfObj.sort_index(ascending=False, axis=1)

    print('Contents of Dataframe sorted in Descending Order based on Column Names are :')
    print(modDfObj)

    print('***** Sort a Dataframe in Place based on Column Names ***** ')

    # sort a dataframe in place based on column names
    dfObj.sort_index(inplace=True, axis=1)

    print('Contents of Dataframe sorted in Place based on Column Names are :')
    print(dfObj)

if __name__ == '__main__':
    main()
Output:
Original Dataframe : 
   Name  Marks      City
b  Jack     34    Sydney
a  Riti     31     Delhi
f  Aadi     16  New York
e  Riti     32     Delhi
d  Riti     33     Delhi
c  Riti     35    Mumbai
***** Sort rows of a Dataframe based on Row index labels ***** 
Contents of Dataframe sorted based on Row Index Labels are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
***** Sort rows of a Dataframe in Descending Order based on Row index labels ***** 
Contents of Dataframe sorted in Descending Order based on Row Index Labels are :
   Name  Marks      City
f  Aadi     16  New York
e  Riti     32     Delhi
d  Riti     33     Delhi
c  Riti     35    Mumbai
b  Jack     34    Sydney
a  Riti     31     Delhi
***** Sort rows of a Dataframe based on Row index labels in Place ***** 
Contents of Dataframe sorted in Place based on Row Index Labels are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
***** Sort a Dataframe based on Column Names ***** 
Contents of Dataframe sorted based on Column Names are :
       City  Marks  Name
a     Delhi     31  Riti
b    Sydney     34  Jack
c    Mumbai     35  Riti
d     Delhi     33  Riti
e     Delhi     32  Riti
f  New York     16  Aadi
***** Sort a Dataframe in Descending Order based on Column Names ***** 
Contents of Dataframe sorted in Descending Order based on Column Names are :
   Name  Marks      City
a  Riti     31     Delhi
b  Jack     34    Sydney
c  Riti     35    Mumbai
d  Riti     33     Delhi
e  Riti     32     Delhi
f  Aadi     16  New York
***** Sort a Dataframe in Place based on Column Names ***** 
Contents of Dataframe sorted in Place based on Column Names are :
       City  Marks  Name
a     Delhi     31  Riti
b    Sydney     34  Jack
c    Mumbai     35  Riti
d     Delhi     33  Riti
e     Delhi     32  Riti
f  New York     16  Aadi