This tutorial discusses different ways to select DataFrame rows by index name in pandas.
Preparing DataSet
Let’s create a DataFrame with some dummy data.
import pandas as pd

data = {'Col_A': [21, 12, 13, 14, 15, 16, 17],
        'Col_B': [21, 22, 23, 24, 25, 26, 27],
        'Col_C': [31, 32, 33, 34, 35, 36, 37]}

index = ["D1", "D2", "D3", "D4", "D5", "D6", "D7"]

# Create DataFrame from dictionary
df = pd.DataFrame.from_dict(data)

# Set list index as Index of DataFrame
df.set_index(pd.Index(index), inplace=True)

print(df)
Output
    Col_A  Col_B  Col_C
D1     21     21     31
D2     12     22     32
D3     13     23     33
D4     14     24     34
D5     15     25     35
D6     16     26     36
D7     17     27     37
We will now select rows from this DataFrame by index label (name).
Select a Row by Index Label or Name
In pandas, the DataFrame provides the loc[] attribute, which accepts index labels (names) and returns the rows that carry those labels. To select a row by index name, pass that name to loc[], and it will return the part of the DataFrame whose rows have the given index name.
Let’s see an example where we select the row with label D2 from a DataFrame.
# Select row with index label 'D2'
subDf = df.loc["D2"]

print(subDf)
Output
Col_A    12
Col_B    22
Col_C    32
Name: D2, dtype: int64
Because the DataFrame has only one row with the label “D2”, loc[] returned a Series object. If the DataFrame has multiple rows with the given index name, it returns a DataFrame containing only those rows.
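The difference between the two return types can be checked with a small sketch. The DataFrame below is a hypothetical one (not the tutorial's df) in which the label D2 is deliberately repeated. It also shows that passing the label inside a list appears to be a simple way to get a DataFrame back even for a unique label.

```python
import pandas as pd

# Hypothetical DataFrame in which the label "D2" appears twice
dup_df = pd.DataFrame(
    {"Col_A": [11, 12, 13], "Col_B": [21, 22, 23]},
    index=["D1", "D2", "D2"],
)

# A unique label gives a Series
single = dup_df.loc["D1"]
print(type(single).__name__)

# A repeated label gives a DataFrame with every matching row
repeated = dup_df.loc["D2"]
print(type(repeated).__name__)
print(repeated.shape)

# Wrapping the label in a list gives a DataFrame, even for a unique label
as_frame = dup_df.loc[["D1"]]
print(type(as_frame).__name__)
```

If later code expects a DataFrame regardless of how many rows match, the list form avoids having to handle the Series case separately.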
Select Multiple Rows by Index Label or Names
To select multiple rows from a DataFrame by index name, we can pass a list of names to the loc[] attribute of the DataFrame. It returns a subset of the DataFrame containing only the rows whose index names match the names in the list.
In the example below, we select the rows with index labels D2, D4 and D7 from the DataFrame.
# Select rows at index label 'D2', 'D4' and 'D7'
subDf = df.loc[["D2", "D4", "D7"]]

print(subDf)
Output
    Col_A  Col_B  Col_C
D2     12     22     32
D4     14     24     34
D7     17     27     37
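One case worth knowing about is a list that contains a label missing from the index. In recent pandas versions this raises a KeyError. The sketch below uses a small hypothetical DataFrame and a hypothetical missing label D9, and shows one possible way to keep only the labels that exist, by filtering with the index's isin() method.

```python
import pandas as pd

# Small hypothetical DataFrame for illustration
df = pd.DataFrame(
    {"Col_A": [21, 12, 13], "Col_B": [21, 22, 23]},
    index=["D1", "D2", "D3"],
)

wanted = ["D2", "D9"]  # "D9" is a hypothetical label that is not in the index

# Passing a list with a missing label raises KeyError in recent pandas versions
try:
    df.loc[wanted]
    raised = False
except KeyError:
    raised = True
print(raised)

# Keep only the rows whose label appears in the wanted list
present = df.loc[df.index.isin(wanted)]
print(present)
```

Note that the isin() approach returns rows in the DataFrame's own order, not in the order of the list.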
Select Multiple Rows by Index Label Range
We can also select multiple rows from a DataFrame by a range of index names, that is, all the rows from one index name through another. To do that, pass the start and end index names, separated by a colon, to the loc[] attribute. Unlike ordinary Python slices, both the start and the end label are included in the result. Like,
df.loc[startLabel : endLabel]
In the example below, we select all rows from index label D2 through index label D6. So it selects the rows D2, D3, D4, D5 and D6 from the DataFrame.
# Select rows from label D2 till D6
subDf = df.loc["D2":"D6"]

print(subDf)
Output
    Col_A  Col_B  Col_C
D2     12     22     32
D3     13     23     33
D4     14     24     34
D5     15     25     35
D6     16     26     36
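Since the end label is part of the result, a label range behaves differently from a positional range. The following sketch, on a small hypothetical DataFrame, compares loc[] label slicing with iloc[] position slicing, and also shows an open-ended label slice.

```python
import pandas as pd

# Small hypothetical DataFrame for illustration
df = pd.DataFrame(
    {"Col_A": [21, 12, 13, 14]},
    index=["D1", "D2", "D3", "D4"],
)

# Label slice: both "D2" and "D3" are included
by_label = df.loc["D2":"D3"]
print(list(by_label.index))

# Position slice: the stop position (3) is excluded, so this covers positions 1 and 2
by_position = df.iloc[1:3]
print(list(by_position.index))

# Leaving out the end label selects everything from "D3" to the last row
to_end = df.loc["D3":]
print(list(to_end.index))
```

Here the label slice and the position slice select the same two rows, but the positional stop has to be one past the last row wanted.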
Summary
We learned different ways to select rows from a DataFrame by index name: a single label, a list of labels, and a label range. Thanks.