In this article, we will discuss different ways to get the first N rows of dataframe in pandas.
Get first N rows of dataframe using iloc[]
Before looking into the solution, let’s first have a summarized view of the dataframe’s iloc.
Overview of dataframe iloc[]
In Pandas, the dataframe class has an attribute iloc[] for location based indexing i.e.
dataframe.iloc[row_section, col_section]
dataframe.iloc[row_section]
- row_section: It can be,
  - A row number
  - A list of row numbers
  - A range of row numbers like start:end, i.e. include rows from number start to end-1.
- col_section: It can be,
  - A column number
  - A list of column numbers
  - A range of column numbers like start:end, i.e. include columns from number start to end-1.
It selects a portion of the dataframe based on the row and column numbers provided in these sections. If you skip the column section and provide only the row section, it returns the specified rows with all columns included.
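The selection forms listed above can be sketched as follows. This is a minimal illustration, assuming a small sample dataframe that is made up here for demonstration; the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    [('Jack', 34, 'Sydney'),
     ('Shaun', 31, 'Delhi'),
     ('Meera', 29, 'Tokyo'),
     ('Mark', 33, 'London')],
    columns=['Name', 'Age', 'City'])

# A single row number returns that row as a Series
single_row = df.iloc[1]

# A list of row numbers returns those rows as a DataFrame
row_list = df.iloc[[0, 2]]

# A range start:end returns rows start to end-1 (here rows 1 and 2)
row_range = df.iloc[1:3]

# Row and column ranges together: rows 0-1, columns 1-2 (Age, City)
block = df.iloc[0:2, 1:3]

# Lists in both sections: rows 0 and 3, columns 0 and 2 (Name, City)
picked = df.iloc[[0, 3], [0, 2]]

print(block)
print(picked)
```

Note that a single row number gives back a Series, while a list or a range gives back a DataFrame.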
Get first N rows of pandas dataframe
To select the first N rows of the dataframe using iloc[], we can skip the column section and pass a range of row numbers, i.e. 0 to N, in the row section. It will select the first N rows (rows 0 to N-1),
df.iloc[:N]
As indexing starts from 0, we can omit the start of the range. If it is not provided, iloc[] uses 0 by default, so df.iloc[:N] gives us the first N rows of the dataframe.
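As a quick check of the point above, the following sketch compares the explicit and implicit start of the range. It also shows what happens when N is larger than the number of rows. The sample dataframe and variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Name': ['Jack', 'Shaun', 'Meera', 'Mark'],
                   'Age': [34, 31, 29, 33]})

N = 3

# Writing the start explicitly and omitting it select the same rows
explicit = df.iloc[0:N]
implicit = df.iloc[:N]
same = explicit.equals(implicit)

# A slice that runs past the end does not raise an error;
# it simply returns all the rows that exist
oversized = df.iloc[:10]

print(same)
print(len(oversized))
```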
Complete example
Let’s see an example, where we will select and print the first 3 rows of a dataframe using iloc[],
import pandas as pd

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Shaun', 31, 'Delhi', 7),
             ('Meera', 29, 'Tokyo', 3),
             ('Mark', 33, 'London', 9),
             ('Shachin', 16, 'London', 3),
             ('Eva', 41, 'Delhi', 4)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3

# Select first N rows of the dataframe as a dataframe object
first_n_rows = df.iloc[:N]

print("First N rows Of Dataframe: ")
print(first_n_rows)
Output:
Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4
First N rows Of Dataframe: 
    Name  Age    City  Experience
0   Jack   34  Sydney           5
1  Shaun   31   Delhi           7
2  Meera   29   Tokyo           3
We selected the first three rows of the dataframe as a dataframe and printed it.
Get first N rows of a dataframe using head()
In Pandas, the dataframe provides a function head(n), which returns the first n rows of the dataframe. We can use it to get only the first N rows,
df.head(N)
It will return the first N rows of the dataframe as a dataframe object.
Let’s see a complete example,
import pandas as pd

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Shaun', 31, 'Delhi', 7),
             ('Meera', 29, 'Tokyo', 3),
             ('Mark', 33, 'London', 9),
             ('Shachin', 16, 'London', 3),
             ('Eva', 41, 'Delhi', 4)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3

# Select first N rows of the dataframe
first_n_rows = df.head(N)

print("First N rows Of Dataframe: ")
print(first_n_rows)
Output:
Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4
First N rows Of Dataframe: 
    Name  Age    City  Experience
0   Jack   34  Sydney           5
1  Shaun   31   Delhi           7
2  Meera   29   Tokyo           3
Using the head() function, we fetched the first 3 rows of dataframe as a dataframe and then just printed it.
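Two details of head() may be worth a short sketch: when called without an argument it returns the first 5 rows, and with a negative n it returns all rows except the last |n|. The dataframe below is a made-up example and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: ten rows numbered 0 to 9
df = pd.DataFrame({'value': range(10)})

# Without an argument, head() uses n=5
default_rows = df.head()

# A negative n drops rows from the end instead:
# head(-2) keeps everything except the last 2 rows
all_but_last_two = df.head(-2)

print(len(default_rows))
print(len(all_but_last_two))
```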
Get first N rows of dataframe with specific columns
Suppose we want the first 3 rows of the dataframe, but only 2 specific columns should be included. Let's see how to do that,
import pandas as pd

# List of Tuples
employees = [('Jack', 34, 'Sydney', 5),
             ('Shaun', 31, 'Delhi', 7),
             ('Meera', 29, 'Tokyo', 3),
             ('Mark', 33, 'London', 9),
             ('Shachin', 16, 'London', 3),
             ('Eva', 41, 'Delhi', 4)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3

# Select first N rows of the dataframe with only the Name and City columns
first_n_rows = df[['Name', 'City']].head(N)

print("First N rows Of Dataframe: ")
print(first_n_rows)
Output:
Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4
First N rows Of Dataframe: 
    Name    City
0   Jack  Sydney
1  Shaun   Delhi
2  Meera   Tokyo
We first selected two columns of the dataframe, i.e. Name and City, as a dataframe object, and then called the head(3) function on it to select the first 3 entries of that dataframe.
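The same result can also be obtained with iloc[] alone, by passing the column positions in the column section. The sketch below is one possible alternative, not the only way to do it; it assumes the same employee data as above and that Name and City sit at column positions 0 and 2.

```python
import pandas as pd

# Same employee data as in the examples above
employees = [('Jack', 34, 'Sydney', 5),
             ('Shaun', 31, 'Delhi', 7),
             ('Meera', 29, 'Tokyo', 3),
             ('Mark', 33, 'London', 9),
             ('Shachin', 16, 'London', 3),
             ('Eva', 41, 'Delhi', 4)]

df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'])

N = 3

# Column names first, then head()
by_head = df[['Name', 'City']].head(N)

# Rows and columns in one iloc[] call, using column positions
# (assumes Name is column 0 and City is column 2)
by_iloc = df.iloc[:N, [0, 2]]

# Both approaches give the same dataframe
print(by_head.equals(by_iloc))
print(by_iloc)
```

Selecting by name is usually easier to read, while the iloc[] form is handy when only the column positions are known.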
Summary:
We learned about different ways to get the first N rows of dataframe in pandas.