Pandas: Get first N rows of dataframe

In this article, we will discuss different ways to get the first N rows of a dataframe in pandas: using iloc[], using head(), and using head() on a subset of columns.

Get first N rows of dataframe using iloc[]

Before looking into the solution, let’s first have a summarized view of the dataframe’s iloc.

Overview of dataframe iloc[]

In Pandas, the DataFrame class provides the iloc[] indexer for integer position based indexing. It can be used in two forms,

dataframe.iloc[row_section, col_section]
dataframe.iloc[row_section]
  • row_section: It can be,
    • A row number
    • A list of row numbers
    • A range of row numbers like start:end, i.e. include rows from number start to end-1.
  • col_section: It can be,
    • A column number
    • A list of column numbers
    • A range of column numbers like start:end, i.e. include columns from number start to end-1.

It selects a portion of the dataframe based on the row and column numbers given in the row and column sections. If you skip the column section and provide only the row section, it returns the specified rows with all columns included.
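
The forms listed above can be tried out on a small dataframe. This is a minimal sketch; the sample data and the variable names are only illustrative.

```python
import pandas as pd

# Small sample dataframe (illustrative data only)
df = pd.DataFrame(
    [('Jack', 34, 'Sydney', 5),
     ('Shaun', 31, 'Delhi', 7),
     ('Meera', 29, 'Tokyo', 3),
     ('Mark', 33, 'London', 9)],
    columns=['Name', 'Age', 'City', 'Experience'])

# A single row number returns that row as a Series
single_row = df.iloc[1]

# A list of row numbers returns those rows as a DataFrame
row_list = df.iloc[[0, 2]]

# A range start:end returns rows start to end-1 (here rows 1 and 2)
row_range = df.iloc[1:3]

# Row section and column section together:
# rows 0 and 1, columns at positions 0 and 2 (Name and City)
rows_and_cols = df.iloc[0:2, [0, 2]]

# All rows, columns at positions 1 and 2 (Age and City)
col_range = df.iloc[:, 1:3]

print(rows_and_cols)
```

Note that a single row number gives back a Series, whereas a list or a range of row numbers gives back a DataFrame.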

Get first N rows of pandas dataframe

To select the first N rows of the dataframe using iloc[], we can skip the column section and pass a range of row numbers in the row section, i.e. 0 to N. As the end of the range is excluded, it will select rows 0 to N-1, which are the first N rows,

df.iloc[:N]

As indexing starts from 0, we can skip the start of the range. If it is not provided, iloc[] uses 0 by default, so df.iloc[:N] is the same as df.iloc[0:N] and gives us the first N rows of the dataframe.
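
A short sketch to check this behaviour is given below. The sample dataframe is only illustrative. It also shows that a range larger than the number of rows does not raise an error; iloc[] simply returns all the rows that are available.

```python
import pandas as pd

# Small sample dataframe with 3 rows (illustrative data only)
df = pd.DataFrame({'Name': ['Jack', 'Shaun', 'Meera'],
                   'Age': [34, 31, 29]})

N = 2

# Start of the range skipped, 0 is used by default
implicit_start = df.iloc[:N]

# Start of the range written explicitly
explicit_start = df.iloc[0:N]

# Both selections contain the same rows
same = implicit_start.equals(explicit_start)
print(same)

# A range that goes past the last row returns all available rows
oversized = df.iloc[:10]
print(len(oversized))
```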

Complete example

Let’s see an example, where we will select and print the first 3 rows of a dataframe using iloc[],

import pandas as pd

# List of Tuples
employees = [('Jack',    34, 'Sydney',   5),
            ('Shaun',   31, 'Delhi' ,   7),
            ('Meera',   29, 'Tokyo' ,   3),
            ('Mark',    33, 'London' ,  9),
            ('Shachin', 16, 'London',   3),
            ('Eva',     41, 'Delhi' ,   4)]

# Create a DataFrame object
df = pd.DataFrame(  employees, 
                    columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3
# Select first N rows of the dataframe as a dataframe object
first_n_rows = df.iloc[:N]

print("First N rows Of Dataframe: ")
print(first_n_rows)

Output:

Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4

First N rows Of Dataframe: 
    Name  Age    City  Experience
0   Jack   34  Sydney           5
1  Shaun   31   Delhi           7
2  Meera   29   Tokyo           3

We selected the first three rows of the dataframe as a dataframe and printed it.

Get first N rows of a dataframe using head()

In Pandas, the dataframe provides a function head(n), which returns the first n rows of the dataframe. We can use it to get only the first N rows of the dataframe,

df.head(N)

It will return the first N rows of the dataframe as a dataframe object.
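
Two details of head() are worth knowing: if no argument is passed, it returns the first 5 rows, and if a negative number is passed, it returns all rows except the last |n| rows. The following is a minimal sketch with an illustrative single-column dataframe.

```python
import pandas as pd

# Sample dataframe with 10 rows (illustrative data only)
df = pd.DataFrame({'value': range(10)})

# No argument: the default n is 5
default_head = df.head()

# Negative n: all rows except the last 2
all_but_last_two = df.head(-2)

print(len(default_head))
print(len(all_but_last_two))
```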

Let’s see a complete example,

import pandas as pd

# List of Tuples
employees = [('Jack',    34, 'Sydney',   5),
            ('Shaun',   31, 'Delhi' ,   7),
            ('Meera',   29, 'Tokyo' ,   3),
            ('Mark',    33, 'London' ,  9),
            ('Shachin', 16, 'London',   3),
            ('Eva',     41, 'Delhi' ,   4)]

# Create a DataFrame object
df = pd.DataFrame(  employees, 
                    columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3
# Select first N rows of the dataframe 
first_n_rows = df.head(N)

print("First N rows Of Dataframe: ")
print(first_n_rows)

Output:

Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4

First N rows Of Dataframe: 
    Name  Age    City  Experience
0   Jack   34  Sydney           5
1  Shaun   31   Delhi           7
2  Meera   29   Tokyo           3

Using the head() function, we fetched the first 3 rows of the dataframe as a new dataframe and printed it.
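
Both head(N) and iloc[:N] select rows by position, so they give the same result even when the index labels are not 0, 1, 2 and so on. A quick sketch to verify this is shown below; the index labels used here are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Sample dataframe with non-default index labels
# (the labels 10, 20, 30, 40 are hypothetical)
df = pd.DataFrame(
    {'Name': ['Jack', 'Shaun', 'Meera', 'Mark'],
     'Age': [34, 31, 29, 33]},
    index=[10, 20, 30, 40])

N = 2

# First N rows using head()
via_head = df.head(N)

# First N rows using iloc[]
via_iloc = df.iloc[:N]

# Both approaches select the same rows
same_rows = via_head.equals(via_iloc)
print(same_rows)
print(via_head)
```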

Get first N rows of dataframe with specific columns

Suppose we want the first 3 rows of the dataframe, but only 2 specific columns should be included. Let’s see how to do that,

import pandas as pd

# List of Tuples
employees = [('Jack',    34, 'Sydney',   5),
            ('Shaun',   31, 'Delhi' ,   7),
            ('Meera',   29, 'Tokyo' ,   3),
            ('Mark',    33, 'London' ,  9),
            ('Shachin', 16, 'London',   3),
            ('Eva',     41, 'Delhi' ,   4)]

# Create a DataFrame object
df = pd.DataFrame(  employees, 
                    columns=['Name', 'Age', 'City', 'Experience'])

print("Contents of the Dataframe : ")
print(df)

N = 3
# Select the columns 'Name' and 'City', then the first N rows of that subset
first_n_rows = df[['Name', 'City']].head(N)

print("First N rows Of Dataframe: ")
print(first_n_rows)

Output:

Contents of the Dataframe : 
      Name  Age    City  Experience
0     Jack   34  Sydney           5
1    Shaun   31   Delhi           7
2    Meera   29   Tokyo           3
3     Mark   33  London           9
4  Shachin   16  London           3
5      Eva   41   Delhi           4

First N rows Of Dataframe: 
    Name    City
0   Jack  Sydney
1  Shaun   Delhi
2  Meera   Tokyo

We first selected two columns of the dataframe, i.e. Name and City, as a dataframe object and then called the head(3) function on it to select the first 3 entries of that dataframe.
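
The same selection can also be written with iloc[] by passing both a row section and a column section. This is a sketch of that alternative, assuming the column positions 0 and 2 correspond to Name and City, as they do in the dataframe used above.

```python
import pandas as pd

# List of Tuples
employees = [('Jack',    34, 'Sydney',   5),
             ('Shaun',   31, 'Delhi',    7),
             ('Meera',   29, 'Tokyo',    3),
             ('Mark',    33, 'London',   9),
             ('Shachin', 16, 'London',   3),
             ('Eva',     41, 'Delhi',    4)]

# Create a DataFrame object
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Experience'])

N = 3

# Select columns by name, then take the first N rows
by_name = df[['Name', 'City']].head(N)

# Select the first N rows and the columns at positions 0 and 2 in one step
by_position = df.iloc[:N, [0, 2]]

# Both selections contain the same rows and columns
print(by_name.equals(by_position))
print(by_position)
```

Selecting by column name is usually easier to read, while the iloc[] form is handy when only the column positions are known.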

Summary:

We learned about different ways to get the first N rows of dataframe in pandas.
