Select Rows by Index Names in Pandas

This tutorial discusses different ways to select DataFrame rows by index name in pandas.

Preparing DataSet

Let’s create a DataFrame with some dummy data.

import pandas as pd

data = {'Col_A': [21, 12, 13, 14, 15, 16, 17],
        'Col_B': [21, 22, 23, 24, 25, 26, 27],
        'Col_C': [31, 32, 33, 34, 35, 36, 37]}

index = ["D1", "D2", "D3", "D4", "D5", "D6", "D7"]

# Create the DataFrame from the dictionary, using the list as its index
df = pd.DataFrame(data, index=index)

print(df)

Output

    Col_A  Col_B  Col_C
D1     21     21     31
D2     12     22     32
D3     13     23     33
D4     14     24     34
D5     15     25     35
D6     16     26     36
D7     17     27     37

We will now select rows from this DataFrame by index label (name).

Select a Row by Index Label or Name

In pandas, the DataFrame provides the loc[] indexer, which accepts index labels (names) and returns the rows that have those labels.

To select a row by index name, pass the name to loc[]. It returns the part of the DataFrame that contains only the rows with the given index name.

Let’s see an example in which we select the row with label D2 from the DataFrame.

# Select row with index label 'D2'
subDf = df.loc["D2"]

print (subDf)

Output

Col_A    12
Col_B    22
Col_C    32
Name: D2, dtype: int64

As the DataFrame has only one row with the label “D2”, loc[] returned a Series object. If the DataFrame has multiple rows with the given index name, loc[] returns a DataFrame containing only those rows.
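
The difference between the two return types can be seen in a minimal sketch. The DataFrame dup_df and its repeated label are made up for illustration and are not part of the tutorial's data. The last line also shows that wrapping a single label in a list yields a one-row DataFrame instead of a Series.

```python
import pandas as pd

# Hypothetical data: the label "D2" appears twice in the index
dup_df = pd.DataFrame({"Col_A": [11, 12, 13]}, index=["D1", "D2", "D2"])

single = dup_df.loc["D1"]       # label occurs once   -> Series
repeated = dup_df.loc["D2"]     # label occurs twice  -> DataFrame with two rows
as_frame = dup_df.loc[["D1"]]   # one label in a list -> DataFrame with one row

print(type(single).__name__)
print(type(repeated).__name__)
print(type(as_frame).__name__)
```

Passing a list is a handy way to get a consistent return type when the code that follows expects a DataFrame.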

Select Multiple Rows by Index Label or Names

To select multiple rows from a DataFrame by index name, pass a list of names to loc[]. It returns a subset of the DataFrame containing only the rows whose index names appear in the list.

In the example below, we select the rows with index labels D2, D4 and D7 from the DataFrame.

# Select rows at index label 'D2', 'D4' and 'D7'
subDf = df.loc[["D2", "D4", "D7"]]

print (subDf)

Output

    Col_A  Col_B  Col_C
D2     12     22     32
D4     14     24     34
D7     17     27     37
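
Two details of list-based selection are worth a short sketch: the rows come back in the order of the list, and, in recent pandas versions, a label that is missing from the index raises a KeyError. The frame demo and the label D9 are hypothetical, and filtering the list first is only one possible way to skip missing labels.

```python
import pandas as pd

# Hypothetical three-row frame, smaller than the tutorial's DataFrame
demo = pd.DataFrame({"Col_A": [21, 12, 13]}, index=["D1", "D2", "D3"])

# Rows come back in the order the labels appear in the list
reordered = demo.loc[["D3", "D1"]]
print(list(reordered.index))

# A label that is absent from the index raises KeyError
wanted = ["D1", "D9"]
try:
    demo.loc[wanted]
    raised = False
except KeyError:
    raised = True
print(raised)

# One way to skip absent labels: keep only the ones found in the index
present = [label for label in wanted if label in demo.index]
print(demo.loc[present])
```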

Select Multiple Rows by Index Label Range

We can also select multiple rows from a DataFrame by a range of index names, which selects every row from one index name through another. For that, pass the start and end index names, separated by a colon, to loc[]. Unlike regular Python slicing, both the start label and the end label are included in the result. The syntax is:

df.loc[startLabel : endLabel]

In the example below, we select all rows from index label D2 through index label D6. So it selects the rows D2, D3, D4, D5 and D6 from the DataFrame.

# Select rows from label D2 till D6
subDf = df.loc["D2":"D6"]

print (subDf)

Output

    Col_A  Col_B  Col_C
D2     12     22     32
D3     13     23     33
D4     14     24     34
D5     15     25     35
D6     16     26     36
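
Note that the row D6 is part of the result. As a small sketch of how this differs from position-based slicing, the snippet below compares loc[] with iloc[] on a one-column copy of the sample data. The variable names by_label and by_position are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Col_A": [21, 12, 13, 14, 15, 16, 17]},
                  index=["D1", "D2", "D3", "D4", "D5", "D6", "D7"])

by_label = df.loc["D2":"D6"]   # label slice: both endpoints are included
by_position = df.iloc[1:5]     # position slice: the stop position is excluded

print(list(by_label.index))     # D2 through D6, five rows
print(list(by_position.index))  # D2 through D5, four rows
```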

Summary

We learned about different ways to select rows from a DataFrame by index names. Thanks.
