In this article, we will discuss different ways to delete the last N rows of a dataframe in Python.
Use iloc to drop last N rows of pandas dataframe
In Pandas, the DataFrame provides an attribute iloc to select a portion of the dataframe using position-based indexing. The selected portion can be a subset of rows, a subset of columns, or both. We can use this attribute to select all rows except the last N rows of a dataframe and then assign the result back to the original variable. The effect is the same as deleting the last N rows from the dataframe. For example,
# Drop last 3 rows
# by selecting all rows except last 3 rows
N = 3
df = df.iloc[:-N, :]
We selected a portion of the dataframe that included all columns but only the first (size - N) rows, then assigned it back to the same variable. In effect, this removed the last N rows of the dataframe.
How did it work?
The slicing syntax of dataframe.iloc[] looks like this:
df.iloc[row_start:row_end, col_start:col_end]
- row_start: The row index/position from where it should start selection. Default is 0.
- row_end: The row index/position from where it should end the selection i.e. select till row_end-1. Default is till the last row of the dataframe.
- col_start: The column index/position from where it should start selection. Default is 0.
- col_end: The column index/position from where it should end the selection i.e. select till col_end-1. Default is till the last column of the dataframe.
It returns a portion of dataframe that includes rows from row_start to row_end-1 and columns from col_start to col_end-1.
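The slicing rules listed above can be tried out on a small frame. This is only a minimal sketch: the column names and values are made up for illustration and do not come from the article's example.

```python
import pandas as pd

# Small illustrative frame; the data is made up for this sketch.
df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]}
)

# Rows at positions 1 and 2 (row_end is exclusive), columns at positions 0 and 1.
part = df.iloc[1:3, 0:2]

# Omitting the bounds falls back to the defaults: every row and every column.
everything = df.iloc[:, :]

# A negative row_end counts from the end, so this keeps all rows except the last one.
all_but_last = df.iloc[:-1, :]

print(part)
print(all_but_last)
```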
To delete the last N rows of the dataframe, select the rows from position 0 up to, but not including, position -N (negative indexing counts from the end), and select all columns:
df = df.iloc[:-N , :]
Check out the complete example that deletes the last 3 rows of a dataframe:
import pandas as pd

# Using iloc[]

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'London', 11),
             ('Mark', 41, 'Delhi', 12),
             ('Sam', 56, 'London', 33)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'],
                  index=['A', 'B', 'C', 'D', 'E'])

print("Contents of the Dataframe : ")
print(df)

# Drop last 3 rows
# by selecting all rows except last 3 rows
N = 3
df = df.iloc[:-N, :]

print("Modified Dataframe : ")
print(df)
Output:
Contents of the Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
C  Aadi   16  London          11
D  Mark   41   Delhi          12
E   Sam   56  London          33
Modified Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
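One edge case of the iloc approach is worth keeping in mind: when N is 0, the slice :-N becomes :0 and selects no rows at all, instead of leaving the dataframe unchanged. The sketch below shows one possible way to guard against this. The helper name drop_last_rows is hypothetical (it is not part of pandas), and the sample data is made up.

```python
import pandas as pd


def drop_last_rows(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Return df without its last n rows (hypothetical helper, not a pandas API)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Compute an explicit end position so that n == 0 keeps every row;
    # max() guards against n being larger than the number of rows.
    end = max(len(df) - n, 0)
    return df.iloc[:end, :]


df = pd.DataFrame({"x": [10, 20, 30, 40, 50]})

empty = df.iloc[:-0, :]            # -0 == 0, so this selects no rows at all
unchanged = drop_last_rows(df, 0)  # keeps all 5 rows
trimmed = drop_last_rows(df, 3)    # keeps the first 2 rows

print(len(empty), len(unchanged), len(trimmed))
```

If N is always a positive constant, as in the example above, the plain df.iloc[:-N, :] form is sufficient.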
Use drop() to remove last N rows of pandas dataframe
In pandas, the dataframe's drop() function accepts a sequence of row labels to delete from the dataframe. Passing the labels through the index argument tells drop() to remove rows; axis=0 expresses the same intent and is optional once index is given. To modify the calling dataframe object instead of getting a new one back, pass the argument inplace=True.
Check out the complete example that deletes the last 3 rows of a dataframe:
import pandas as pd

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'London', 11),
             ('Mark', 41, 'Delhi', 12),
             ('Sam', 56, 'London', 33)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'],
                  index=['A', 'B', 'C', 'D', 'E'])

print("Contents of the Dataframe : ")
print(df)

# Drop last 3 rows of dataframe
N = 3
df.drop(index=df.index[-N:], axis=0, inplace=True)

print("Modified Dataframe : ")
print(df)
Output:
Contents of the Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
C  Aadi   16  London          11
D  Mark   41   Delhi          12
E   Sam   56  London          33
Modified Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
We fetched the row labels of the dataframe as a sequence and passed the last N of them ( df.index[-N:] ) as the index argument of the drop() function, so it deleted the last N rows (3 rows) of the dataframe.
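Because drop() works on labels rather than positions, the result can differ from the iloc approach when the index contains repeated labels. The following sketch illustrates this with a made-up frame whose index repeats the label 'A'. It is only meant to show the difference, not to suggest that either approach is wrong.

```python
import pandas as pd

# Made-up frame whose index repeats the label 'A'.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["A", "B", "A"])

# Label-based: drops every row labelled 'A', not just the last one.
by_label = df.drop(index=df.index[-1:])

# Position-based: drops only the final row.
by_position = df.iloc[:-1, :]

print(by_label)
print(by_position)
```

With a unique index, such as the one in the example above, both approaches remove exactly the same rows.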
Use head() to remove last N rows of pandas dataframe
In Pandas, the dataframe provides a function head(N) to select the first N rows of a dataframe. To delete the last N rows, we can select the first (size - N) rows using the head() function and assign the result back to the same variable. For example,
import pandas as pd

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Riti', 31, 'Delhi', 7),
             ('Aadi', 16, 'London', 11),
             ('Mark', 41, 'Delhi', 12),
             ('Sam', 56, 'London', 33)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'],
                  index=['A', 'B', 'C', 'D', 'E'])

print("Contents of the Dataframe : ")
print(df)

# Drop last 3 rows of dataframe
N = 3
df = df.head(df.shape[0] - N)

print("Modified Dataframe : ")
print(df)
Output:
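Contents of the Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
C  Aadi   16  London          11
D  Mark   41   Delhi          12
E   Sam   56  London          33
Modified Dataframe : 
   Name  Age    City  Experience
A  Jack   34  Sydney           5
B  Riti   31   Delhi           7
It removed the last 3 rows of the dataframe. Note that head() does not modify the dataframe in place: it returns a new dataframe, and assigning that result back to df gives the effect of a deletion.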
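As a side note, head() also accepts a negative argument, in which case it returns all rows except the last |n| rows. The sketch below compares that shorthand with the explicit form used in the example above, on made-up data.

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40, 50]})
N = 3

explicit = df.head(df.shape[0] - N)  # the form used in the example above
shorthand = df.head(-N)              # negative n: everything except the last N rows

print(explicit.equals(shorthand))
print(shorthand)
```

The shorthand assumes N is greater than 0: df.head(-0) is the same as df.head(0) and returns an empty dataframe.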
Summary:
We learned about three different ways to delete the last N rows of a dataframe: iloc, drop() and head().