In this tutorial, we will discuss how to use the iloc property of the DataFrame to select rows, columns, or a subset of a DataFrame based on index positions or ranges of index positions. We will also discuss how to change the selected values.
DataFrame.iloc[]
In Pandas, the DataFrame provides a property, iloc[], to select a subset of the DataFrame by position-based indexing. The extent of this subset is determined by the index positions given for rows and columns. We can select single or multiple rows and columns with it. Let's learn more about it.
Syntax:
Dataframe.iloc[row_segment, column_segment]
Dataframe.iloc[row_segment]
The column_segment argument is optional. If column_segment is not provided, iloc[] selects the subset of the DataFrame based on the row_segment argument only.
Arguments:
- row_segment:
  - It contains information about the index positions of the rows to be selected. Its value can be:
    - An integer like N.
      - In this case, it selects the single row at index position N.
      - For example, if only 2 is given, then only the 3rd row of the DataFrame will be selected, because indexing starts from 0.
    - A list/array of integers like [a, b, c].
      - In this case, multiple rows will be selected based on the index positions in the given list.
      - For example, if [2, 4, 0] is given as the argument in the row segment, then the 3rd, 5th and 1st rows of the DataFrame will be selected, in that order.
    - A slice object with ints like a:e.
      - This case will select multiple rows from index position a to e-1.
      - For example, if 2:5 is provided in the row segment of iloc[], it will select the range of rows from index positions 2 to 4.
      - To select all rows, provide a bare colon (:).
    - A boolean sequence of the same size as the number of rows.
      - In this case, it will select only those rows for which the corresponding value in the boolean array/list is True.
    - A callable function:
      - It can be a lambda function or a regular function that accepts the calling DataFrame as an argument and returns valid output for indexing. The returned output should match one of the indexing arguments mentioned above.
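The row_segment forms listed above can be tried side by side. The following is a minimal sketch on a small made-up DataFrame (the column name x and the labels a to e are assumptions chosen only for illustration):

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the row_segment forms
df = pd.DataFrame({'x': [10, 20, 30, 40, 50]},
                  index=['a', 'b', 'c', 'd', 'e'])

single = df.iloc[2]                                   # integer: one row, as a Series
by_list = df.iloc[[2, 4, 0]]                          # list: rows in the listed order
by_slice = df.iloc[2:5]                               # slice: positions 2, 3 and 4
by_mask = df.iloc[[True, False, True, False, True]]   # boolean list, one flag per row
by_callable = df.iloc[lambda d: [0, 1]]               # callable returning a list of positions

print(by_list.index.tolist())    # ['c', 'e', 'a']
print(by_slice.index.tolist())   # ['c', 'd', 'e']
print(by_mask.index.tolist())    # ['a', 'c', 'e']
```

Note that the list form keeps the order given in the list, while the boolean form always keeps the original row order.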
- column_segment:
  - It is optional.
  - It contains information about the index positions of the columns to be selected. Its value can be:
    - An integer like N.
      - In this case, the single column at index position N will be selected.
      - For example, if 3 is given, only the 4th column of the DataFrame will be selected, because indexing starts from 0.
    - A list/array of integers like [a, b, c].
      - In this case, multiple columns will be selected, i.e. the columns at the index positions given in the list.
      - For example, if [2, 4, 0] is given as the argument in the column segment, then the 3rd, 5th and 1st columns of the DataFrame will be selected, in that order.
    - A slice object with ints like a:e.
      - In this case, it will select multiple columns from index position a to e-1.
      - For example, if 2:5 is given in the column segment of iloc[], it will select the range of columns from index positions 2 to 4.
      - To select all columns, provide a bare colon (:).
    - A boolean sequence of the same size as the number of columns.
      - This case will select only those columns for which the corresponding value in the boolean array/list is True.
    - A callable function:
      - It can be a lambda function or a regular function that accepts the calling DataFrame as an argument and returns valid output for indexing. The returned output should match one of the indexing arguments mentioned above.
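The same forms apply to column_segment. Here is a short sketch, again on made-up data (the column labels A to E are assumptions for illustration), with a bare colon in the row segment so that all rows are kept:

```python
import pandas as pd

# Hypothetical toy data with five columns
df = pd.DataFrame([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]],
                  columns=list('ABCDE'))

one_col = df.iloc[:, 3]                                   # integer: 4th column, as a Series
by_list = df.iloc[:, [2, 4, 0]]                           # list: columns in the listed order
by_slice = df.iloc[:, 2:5]                                # slice: positions 2, 3 and 4
by_mask = df.iloc[:, [True, False, False, False, True]]   # boolean list, one flag per column
by_callable = df.iloc[:, lambda d: [0, 1]]                # callable returning positions

print(one_col.name)                # D
print(by_list.columns.tolist())    # ['C', 'E', 'A']
print(by_slice.columns.tolist())   # ['C', 'D', 'E']
```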
Returns :
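It returns the subset of the DataFrame selected by the index positions specified in the row and column segments. The result is a single value when one row and one column are given as integers, a Series when a single row or a single column is selected, and a DataFrame otherwise.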
Also, if column_segment is not provided, it returns the subset of the Dataframe containing only selected rows based on the row_segment argument.
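The type of the returned object depends on how the segments are written. A small sketch, using a made-up two-row DataFrame (the data is an assumption for illustration), shows the cases:

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({'Name': ['jack', 'Riti'], 'Age': [34, 30]},
                  index=['a', 'b'])

cell = df.iloc[0, 1]            # two integers: a single value
row = df.iloc[0]                # one integer: the row as a Series
col = df.iloc[:, 1]             # all rows, one integer column: a Series
one_row_frame = df.iloc[[0]]    # one-element list: a DataFrame with one row
block = df.iloc[0:2, 0:1]       # slices: a DataFrame, even if only one column wide

print(type(row).__name__)            # Series
print(type(one_row_frame).__name__)  # DataFrame
print(block.shape)                   # (2, 1)
```

Wrapping a single position in a list, as in df.iloc[[0]], is a handy way to keep the result two-dimensional.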
Error scenarios:
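DataFrame.iloc[row_segment, column_segment] raises an IndexError if a requested integer position (given on its own or inside a list) is out of bounds. Out-of-bounds slices do not raise; they are truncated to the valid range, as with Python lists.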
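The out-of-bounds behaviour can be checked with a short sketch. The three-row DataFrame and the helper raises_index_error below are hypothetical names introduced only for this illustration:

```python
import pandas as pd

# Hypothetical toy data with three rows (valid positions are 0, 1 and 2)
df = pd.DataFrame({'x': [1, 2, 3]})

def raises_index_error(indexer):
    """Return True if df.iloc[indexer] raises IndexError."""
    try:
        df.iloc[indexer]
    except IndexError:
        return True
    return False

single_oob = raises_index_error(5)        # single integer past the end
list_oob = raises_index_error([0, 5])     # list containing one bad position
clipped = df.iloc[1:10]                   # slice past the end: truncated, no error

print(single_oob, list_oob)   # True True
print(len(clipped))           # 2
```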
Let's understand more about it with some examples.
Pandas Dataframe.iloc[] – Examples
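We have divided the examples into three parts:
- Select a few rows from the DataFrame
- Select a few columns from the DataFrame
- Select a subset of the DataFrame
Let's look at these examples one by one. First, we will create a DataFrame from a list of tuples: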
    import pandas as pd

    # List of Tuples
    students = [('jack', 34, 'Sydeny', 'Australia'),
                ('Riti', 30, 'Delhi', 'India'),
                ('Vikas', 31, 'Mumbai', 'India'),
                ('Neelu', 32, 'Bangalore', 'India'),
                ('John', 16, 'New York', 'US'),
                ('Mike', 17, 'las vegas', 'US')]

    # Create a DataFrame from list of tuples
    df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])

    print(df)

Output

        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US
Select a few rows from Dataframe
Here we will provide only the row segment argument to DataFrame.iloc[]. It will therefore select rows based on the given index positions and include all columns.
Select a single row of Dataframe
To select a row from the DataFrame, pass the row index position to iloc[]. For example,

    # Select row at index position 2 i.e. the 3rd row of Dataframe
    row = df.iloc[2]

    print(row)

Output:

    Name        Vikas
    Age            31
    City       Mumbai
    Country     India
    Name: c, dtype: object

It returned the 3rd row of the DataFrame as a Series object. Because indexing starts from 0, the row at index position 2 is the 3rd row of the DataFrame.
Select multiple rows from Dataframe based on a list of indices
Pass a list of row index positions to the row_segment of iloc[]. It will return a subset of the Dataframe containing only the rows mentioned at given indexes. For example,
    # Select rows of Dataframe based on row indices in list
    subsetDf = df.iloc[[2, 4, 1]]

    print(subsetDf)

Output:

        Name  Age      City  Country
    c  Vikas   31    Mumbai    India
    e   John   16  New York       US
    b   Riti   30     Delhi    India
It returned a subset of the Dataframe containing only three rows from the original dataframe i.e. rows at index positions 2, 4, and 1.
Select multiple rows from Dataframe based on index range
Pass an index range start:end in the row segment of iloc[]. It will return a subset of the DataFrame containing only the rows from index position start to end-1 of the original DataFrame. For example,

    # Select rows of Dataframe based on row index range
    subsetDf = df.iloc[1:4]

    print(subsetDf)

Output:

        Name  Age       City  Country
    b   Riti   30      Delhi    India
    c  Vikas   31     Mumbai    India
    d  Neelu   32  Bangalore    India
It returned a subset of the Dataframe containing only three rows from the original dataframe i.e. rows at index positions 1 to 3.
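Since iloc[] follows Python's positional conventions, negative positions, open-ended slices, and slice steps can also be used in the row segment. A minimal sketch on made-up data (the column x and labels a to e are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({'x': [10, 20, 30, 40, 50]},
                  index=['a', 'b', 'c', 'd', 'e'])

last_row = df.iloc[-1]       # negative position counts from the end
last_two = df.iloc[-2:]      # open-ended slice: the last two rows
from_third = df.iloc[2:]     # from position 2 to the end
every_other = df.iloc[::2]   # step of 2: positions 0, 2 and 4

print(last_row['x'])                 # 50
print(last_two.index.tolist())       # ['d', 'e']
print(every_other.index.tolist())    # ['a', 'c', 'e']
```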
Select rows of Dataframe based on bool array
Pass a boolean array/list in the row segment of iloc[]. It will return a subset of the Dataframe containing only the rows for which the corresponding value in the boolean array/list is True. For example,
    # Select rows of Dataframe based on bool array
    subsetDf = df.iloc[[True, False, True, False, True, False]]

    print(subsetDf)

Output:

        Name  Age      City    Country
    a   jack   34    Sydeny  Australia
    c  Vikas   31    Mumbai      India
    e   John   16  New York         US
Select rows of Dataframe based on Callable function
Create a lambda function that accepts a DataFrame as an argument, applies a condition to a column, and returns a bool list. This bool list will contain True only for those rows where the condition holds. Pass this lambda function to iloc[], and only the rows for which the list contains True will be selected.
For example, select only those rows where column ‘Age’ has a value of more than 25,
    # Select rows of Dataframe based on callable function
    subsetDf = df.iloc[lambda x: (x['Age'] > 25).tolist()]

    print(subsetDf)

Output:

        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
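The condition is converted with tolist() because, as far as I am aware, iloc[] does not accept a boolean Series as a mask; it expects a plain list or array of booleans. The sketch below, on a reduced made-up version of the data, shows to_numpy() as an alternative conversion and records whether the unconverted Series is accepted rather than assuming the outcome:

```python
import pandas as pd

# Hypothetical reduced version of the tutorial data
df = pd.DataFrame({'Name': ['jack', 'Riti', 'John'],
                   'Age': [34, 30, 16]},
                  index=['a', 'b', 'e'])

# Convert the boolean Series to a NumPy array before handing it to iloc[]
mask = (df['Age'] > 25).to_numpy()
adults = df.iloc[mask]

# The same conversion works inside a callable
adults_callable = df.iloc[lambda x: (x['Age'] > 25).to_numpy()]

# Try the unconverted boolean Series and record what happens
try:
    df.iloc[df['Age'] > 25]
    series_mask_accepted = True
except (ValueError, NotImplementedError):
    series_mask_accepted = False

print(adults.index.tolist())     # ['a', 'b']
print(series_mask_accepted)
```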
Select a few Columns from Dataframe
Here we will provide a bare colon (:) in the row segment argument of DataFrame.iloc[]. It will therefore select all rows, but only the columns at the index positions provided in column_segment.
Select a single column of Dataframe
To select a column from the dataframe, pass the column index number to the iloc[]. For example,
    # Select single column by index position
    column = df.iloc[:, 2]

    print(column)

Output:

    a       Sydeny
    b        Delhi
    c       Mumbai
    d    Bangalore
    e     New York
    f    las vegas
    Name: City, dtype: object

It returned the 3rd column of the DataFrame as a Series object. Because indexing starts from 0, the column at index position 2 is the 3rd column of the DataFrame.
Select multiple columns from Dataframe based on a list of indices
Pass a list of column index numbers to the column_segment of iloc[]. It will return a subset of the Dataframe containing only the columns mentioned at given indexes. For example,
    # Select multiple columns by indices
    subsetDf = df.iloc[:, [2, 3, 1]]

    print(subsetDf)

Output:

            City    Country  Age
    a     Sydeny  Australia   34
    b      Delhi      India   30
    c     Mumbai      India   31
    d  Bangalore      India   32
    e   New York         US   16
    f  las vegas         US   17
It returned a subset of the Dataframe containing only three columns from the original dataframe i.e. columns at index numbers 2, 3, and 1.
Select multiple columns from Dataframe based on index range
Pass an index range start:end in the column segment of iloc[]. It will return a subset of the DataFrame containing only the columns from index position start to end-1 of the original DataFrame. For example,

    # Select multiple columns by index range
    subsetDf = df.iloc[:, 1:4]

    print(subsetDf)

Output:

       Age       City    Country
    a   34     Sydeny  Australia
    b   30      Delhi      India
    c   31     Mumbai      India
    d   32  Bangalore      India
    e   16   New York         US
    f   17  las vegas         US
It returned a subset of the Dataframe containing only three columns from the original dataframe i.e. columns at index numbers 1 to 3.
Select columns of Dataframe based on bool array
Pass a boolean array/list in the column segment of iloc[]. It will return a subset of the Dataframe containing only the columns for which the corresponding value in the boolean array/list is True. For example,
    # Select columns of Dataframe based on bool array
    subsetDf = df.iloc[:, [True, True, False, False]]

    print(subsetDf)

Output:

        Name  Age
    a   jack   34
    b   Riti   30
    c  Vikas   31
    d  Neelu   32
    e   John   16
    f   Mike   17
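In practice, the boolean list for the column segment is usually computed rather than typed by hand. As one possible illustration, the sketch below builds the mask from the column dtypes with pandas.api.types.is_numeric_dtype and keeps only the numeric columns. The data, including the Score column, is made up for this example:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical toy data with two text columns and two numeric columns
df = pd.DataFrame({'Name': ['jack', 'Riti'],
                   'Age': [34, 30],
                   'City': ['Sydeny', 'Delhi'],
                   'Score': [1.5, 2.5]})

# One boolean per column: True where the column holds numbers
numeric_mask = [is_numeric_dtype(dtype) for dtype in df.dtypes]

# Keep all rows, but only the columns flagged True
numeric_cols = df.iloc[:, numeric_mask]

print(numeric_mask)                     # [False, True, False, True]
print(numeric_cols.columns.tolist())    # ['Age', 'Score']
```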
Select a subset of Dataframe
Here we will provide the row and column segment arguments of the Dataframe.iloc[]. It will return a subset of Dataframe based on the row and column indices provided in row and column segments of iloc[].
Select a Cell value from Dataframe
To select a single cell value from the dataframe, just pass the row and column number in the row and column segment of iloc[]. For example,
    # Select a Cell value from Dataframe
    cellValue = df.iloc[3, 2]

    print(cellValue)
Output:
Bangalore
It returned the cell value at position (3,2) i.e. in the 4th row and 3rd column, because indexing starts from 0.
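When only the labels are known, the positions needed by iloc[] can be looked up first. The sketch below assumes unique row and column labels, so that Index.get_loc returns a single integer, and uses a reduced copy of the tutorial data:

```python
import pandas as pd

# Reduced copy of the tutorial data (an assumption for this sketch)
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Vikas', 'Neelu'],
                   'Age': [34, 30, 31, 32],
                   'City': ['Sydeny', 'Delhi', 'Mumbai', 'Bangalore']},
                  index=['a', 'b', 'c', 'd'])

# Translate labels into integer positions (labels must be unique for this)
row_pos = df.index.get_loc('d')
col_pos = df.columns.get_loc('City')

# Use the positions with iloc[]
value = df.iloc[row_pos, col_pos]

print(row_pos, col_pos)   # 3 2
print(value)              # Bangalore
```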
Select subset of Dataframe based on row/column indices in list
Select a subset of the dataframe. This subset should include the following rows and columns,
- Rows at index positions 1 and 3.
- Columns at index positions 2 and 1.
    # Select subset of Dataframe based on row/column indices in list
    subsetDf = df.iloc[[1, 3], [2, 1]]

    print(subsetDf)

Output:

            City  Age
    b      Delhi   30
    d  Bangalore   32
It returned a DataFrame containing the rows at index positions 1 and 3 and the columns at index positions 2 and 1, in the order given.
Select subset of Dataframe based on row/column index range
Select a subset of the dataframe. This subset should include the following rows and columns,
- Rows from index position 1 to 3
- Columns from index position 1 to 3
    # Select subset of Dataframe based on row and column index range.
    subsetDf = df.iloc[1:4, 1:4]

    print(subsetDf)

Output:

       Age       City  Country
    b   30      Delhi    India
    c   31     Mumbai    India
    d   32  Bangalore    India
It returned a DataFrame containing the rows and the columns at index positions 1 to 3.
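The row and column segments do not have to use the same form; a list in one segment can be combined with a slice or a single integer in the other. A brief sketch on a reduced copy of the tutorial data (an assumption for this illustration):

```python
import pandas as pd

# Reduced copy of the tutorial data
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Vikas', 'Neelu'],
                   'Age': [34, 30, 31, 32],
                   'City': ['Sydeny', 'Delhi', 'Mumbai', 'Bangalore'],
                   'Country': ['Australia', 'India', 'India', 'India']},
                  index=['a', 'b', 'c', 'd'])

rows_list_cols_slice = df.iloc[[0, 2], 1:3]    # rows 0 and 2, columns 1 and 2
rows_slice_cols_list = df.iloc[1:3, [0, 3]]    # rows 1 and 2, columns 0 and 3
all_rows_last_col = df.iloc[:, -1]             # every row, last column (a Series)

print(rows_list_cols_slice.columns.tolist())   # ['Age', 'City']
print(rows_slice_cols_list.index.tolist())     # ['b', 'c']
print(all_rows_last_col.name)                  # Country
```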
Pro Tip: Changing the values of Dataframe using iloc[]
Assigning to a selection made with iloc[] changes the original DataFrame in place. For example, let's select the 3rd row of the DataFrame using iloc[] and change its content:

    print(df)

    # Change the values in the 3rd row of the Dataframe
    df.iloc[2] = 0

    print(df)

Output:

        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US

        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c      0    0          0          0
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US

The assignment df.iloc[2] = 0 wrote directly into the original DataFrame. Note that a subset stored in a separate variable (for example subsetDf = df.iloc[1:4]) may be a copy, so changing that variable is not guaranteed to change the original. To modify the original, assign through df.iloc[...] directly.
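Assignment through iloc[] also works for a single cell or a slice of a column. The sketch below uses made-up data and an explicit copy() to show, in a way that does not depend on pandas' internal view or copy behaviour, that edits to an independent copy leave the original untouched:

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Vikas'],
                   'Age': [34, 30, 31]},
                  index=['a', 'b', 'c'])

# Assign to a single cell: row position 1, column position 1
df.iloc[1, 1] = 99

# Assign a list of values to a slice of one column
df.iloc[0:2, 1] = [35, 31]

# An explicit copy is independent of the original
snapshot = df.iloc[[2]].copy()
snapshot.iloc[0, 1] = -1

print(df['Age'].tolist())         # [35, 31, 31]
print(snapshot['Age'].tolist())   # [-1]
```

Writing df.iloc[...] = value on the original object keeps the intent clear and avoids relying on whether an intermediate selection happens to share memory with the original.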
Summary:
We learned how to use DataFrame.iloc[] to select rows, columns, and subsets of a DataFrame by position, and how to change the selected values, with several examples.