In this article we will discuss different ways to select rows and columns in a pandas DataFrame.

DataFrame provides the indexers loc and iloc for accessing rows and columns. The [] operator can also be used to select columns. Let’s discuss them one by one.

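First, create a DataFrame object, dfObj, whose rows are labelled ‘a’, ‘b’ and ‘c’ and whose columns include ‘Name’, ‘Age’ and ‘City’. The examples in the rest of the article select from it.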
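A minimal sketch of such a DataFrame follows. The row values are hypothetical and chosen only for illustration; the row labels and column names are the ones the article refers to.

```python
import pandas as pd

# Hypothetical sample values; only the row labels a/b/c and the column
# names Name/Age/City are taken from the article text.
students = [("Jack", 34, "Sydney"),
            ("Riti", 30, "Delhi"),
            ("Aadi", 16, "New York")]

# Build the DataFrame with explicit column names and row index labels
dfObj = pd.DataFrame(students, columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Show the contents of dfObj
print(dfObj)
```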

DataFrame.loc | Select Columns & Rows by Name

DataFrame provides the label-based indexer loc for selecting rows and columns by name, written as dfObj.loc[<ROWS RANGE>, <COLUMNS RANGE>].

It selects the specified rows and columns from the given DataFrame.
Either range can also be ‘:’, in which case all entries are included for the corresponding rows or columns.

Let’s see how to use it.

Select a Column by Name in DataFrame using loc[]

Because the selection is on a column only, all rows should be included: pass ‘:’ as the row range and the column name as the column range. It returns a Series object with the same index as the DataFrame.
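A short sketch of this selection, assuming a small dfObj with hypothetical values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# ':' keeps every row; 'Age' picks a single column, so the result is a Series
ages = dfObj.loc[:, "Age"]
print(ages)
```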

Select multiple Columns by Name in DataFrame using loc[]

Pass the column names as a list in the column range. It returns a subset DataFrame with the same index but only the selected columns.
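As an illustration, again with hypothetical sample values, the columns come back in the order given in the list:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# All rows, two columns selected by name
age_name = dfObj.loc[:, ["Age", "Name"]]
print(age_name)
```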

Select a single row by Index Label in DataFrame using loc[]

Now pass ‘:’ as the column range of loc so that all columns are included, and pass a single label as the row range. It returns a Series object whose index is the DataFrame’s column names.
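For instance, selecting the row labelled ‘b’ might look like this (sample values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# One row label, all columns: the result is a Series indexed by column name
row_b = dfObj.loc["b", :]
print(row_b)
```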

Select multiple rows by Index labels in DataFrame using loc[]

Pass the row index labels as a list. It returns a subset DataFrame with the same columns as the original but only the selected rows. For example, passing the labels ‘b’ and ‘c’ returns only those two rows.
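A sketch of selecting the rows ‘b’ and ‘c’, assuming hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# A list of row labels with ':' for the columns keeps every column
rows_bc = dfObj.loc[["b", "c"], :]
print(rows_bc)
```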

Select multiple row & columns by Labels in DataFrame using loc[]

To select multiple rows and columns, pass one list of index labels and one list of column names. It returns a subset DataFrame with the given rows and columns. For example, with the row labels ‘b’ & ‘c’ and the column names ‘Age’ & ‘Name’, only those rows and columns are in the returned DataFrame object.
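Combining both lists could be sketched as follows (sample values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# First list: row labels; second list: column names
bc_age_name = dfObj.loc[["b", "c"], ["Age", "Name"]]
print(bc_age_name)
```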

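Instead of listing every name in the index or column list, we can also pass a label range. Unlike a positional range, a label range in loc includes both endpoints, so the ranges a to c and Age to City return a subset DataFrame with rows a through c and columns Age through City.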
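A small sketch of label ranges, assuming hypothetical sample values and the column order Name, Age, City:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Label slices include both the start and the end label
a_to_c = dfObj.loc["a":"c", "Age":"City"]
print(a_to_c)
```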

DataFrame.iloc | Select Columns & Rows by Index Position

DataFrame provides the position-based indexer iloc for accessing rows and columns by index position, written as dfObj.iloc[<ROWS INDEX RANGE>, <COLUMNS INDEX RANGE>].

It selects rows and columns from the DataFrame by the index positions specified in each range. If ‘:’ is given as the row or column index range, all entries are included for the corresponding rows or columns.
Let’s see how to use it.

The following selections use the same DataFrame object dfObj.

Select a single column by Index position

Select the column at index position 2. It returns a Series object.
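A sketch of this selection; the sample values are hypothetical, and with the assumed column order position 2 is the City column:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# All rows, the column at position 2 (positions start at 0)
third_col = dfObj.iloc[:, 2]
print(third_col)
```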

Select multiple columns by Index range

Select the columns in the column index range [0, 2), i.e. positions 0 and 1. It returns a DataFrame object.
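For illustration, with hypothetical sample values, the range 0:2 stops before position 2:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# All rows, columns at positions 0 and 1 (end position 2 is excluded)
first_two_cols = dfObj.iloc[:, 0:2]
print(first_two_cols)
```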

Select multiple columns by Indexes in a list

Select the columns at column index positions 0 and 2. It returns a DataFrame object.
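A possible sketch, again on hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# All rows, columns at the listed positions only
cols_0_and_2 = dfObj.iloc[:, [0, 2]]
print(cols_0_and_2)
```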

Select single row by Index Position

Select the row at index position 2. It returns a Series object.
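With the hypothetical sample data used here, position 2 is the row labelled ‘c’:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# The row at position 2, all columns: a Series indexed by column name
third_row = dfObj.iloc[2, :]
print(third_row)
```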

Select multiple rows by Index range

Select the rows in the row index range [0, 2), i.e. positions 0 and 1. It returns a DataFrame object.
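A brief sketch of a positional row range, assuming hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Rows at positions 0 and 1 (end position 2 is excluded), all columns
first_two_rows = dfObj.iloc[0:2, :]
print(first_two_rows)
```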

Select multiple rows by Index positions in a list

Select the rows at row index positions 0 and 2. It returns a DataFrame object.
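Selecting rows by a list of positions could look like this (sample values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Rows at positions 0 and 2 only, all columns
rows_0_and_2 = dfObj.iloc[[0, 2], :]
print(rows_0_and_2)
```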

Select multiple rows & columns by Index positions

Select the rows at index positions 0 and 2 together with the columns at index positions 1 and 2. It returns a DataFrame object containing only those rows and columns.
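A sketch with two position lists, one for rows and one for columns, on hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Rows at positions 0 and 2, columns at positions 1 and 2
picked = dfObj.iloc[[0, 2], [1, 2]]
print(picked)
```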

Select multiple rows & columns by Indexes in a range

Select the rows at index positions 0 to 2 (2 not included) together with the columns at index positions 0 to 2 (2 not included). It returns a DataFrame object containing only those rows and columns.
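The same idea with ranges on both axes, sketched on hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Rows 0..1 and columns 0..1; position 2 is excluded on both axes
top_left = dfObj.iloc[0:2, 0:2]
print(top_left)
```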

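If we try to select a single index position, or a list of positions, that is out of range, iloc raises an IndexError. An out-of-range slice does not raise; it is simply truncated to the available positions.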
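The behaviour can be checked with a short sketch. The sample values are hypothetical, and position 10 is chosen only because it lies outside a three-row DataFrame.

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# A single out-of-range position raises IndexError
try:
    dfObj.iloc[10, :]
    raised = False
except IndexError as err:
    raised = True
    print("IndexError:", err)

# An out-of-range slice is truncated instead of raising
truncated = dfObj.iloc[0:10, :]
print(len(truncated))
```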

Selecting Columns in DataFrame using [] operator

To access one or more columns of a DataFrame by name, we can use dictionary-like notation on the DataFrame.

Select a Column by Name

Passing a single column name in [] returns a Series object with the same index as the DataFrame.
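A minimal sketch, assuming hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Dictionary-like access with a single column name gives a Series
age_column = dfObj["Age"]
print(age_column)
```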

Select multiple columns by Name

Instead of passing a single name in [], we can pass a list of column names. It returns a DataFrame object containing only the specified columns of the given DataFrame.
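Note the double brackets in this sketch: the outer pair is the [] operator and the inner pair is the list of names. Sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# A list of column names gives a DataFrame with just those columns
age_and_name = dfObj[["Age", "Name"]]
print(age_and_name)
```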

Accessing a column name that doesn’t exist raises a KeyError.
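A sketch of catching the error; ‘Salary’ is a hypothetical name picked only because it is not a column of the sample data:

```python
import pandas as pd

# Hypothetical sample data (values are illustrative assumptions)
dfObj = pd.DataFrame(
    [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")],
    columns=["Name", "Age", "City"], index=["a", "b", "c"])

# 'Salary' is not a column of dfObj, so the lookup raises KeyError
try:
    dfObj["Salary"]
    key_error_raised = False
except KeyError as err:
    key_error_raised = True
    print("KeyError:", err)
```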

The complete example is as follows:
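A compact sketch that brings together loc, iloc and the [] operator. The sample values are hypothetical, so treat it as an illustration under those assumptions.

```python
import pandas as pd

# Hypothetical sample values; row labels and column names follow the article
students = [("Jack", 34, "Sydney"), ("Riti", 30, "Delhi"), ("Aadi", 16, "New York")]
dfObj = pd.DataFrame(students, columns=["Name", "Age", "City"], index=["a", "b", "c"])

# Label-based selection with loc
by_label = dfObj.loc[["b", "c"], ["Age", "Name"]]
label_range = dfObj.loc["a":"c", "Age":"City"]    # both endpoints included

# Position-based selection with iloc
by_position = dfObj.iloc[[0, 2], [1, 2]]
position_range = dfObj.iloc[0:2, 0:2]             # end position excluded

# Column selection with the [] operator
one_column = dfObj["Age"]                         # Series
two_columns = dfObj[["Age", "Name"]]              # DataFrame

# Print every result under a short title
results = [("loc with lists", by_label), ("loc with ranges", label_range),
           ("iloc with lists", by_position), ("iloc with ranges", position_range),
           ("[] with one name", one_column), ("[] with a list", two_columns)]
for title, result in results:
    print(title, result, sep="\n", end="\n\n")
```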

