Select first N columns of pandas dataframe

In this article, we will discuss different ways to select the first N columns of a dataframe in pandas.

The sections below cover three approaches: iloc[], the subscript operator [], and head() applied to the transposed dataframe.

Use iloc[] to select first N columns of pandas dataframe

In pandas, the DataFrame provides the iloc[] attribute for selecting a portion of the dataframe by integer position. The selected portion can be a few rows, a few columns, or both. We can use it to select the first N columns of the dataframe. For example,

N = 5
# Select first N columns
first_n_columns = df.iloc[:, :N]

We selected a portion of the dataframe that includes all rows but only the first N columns.

How did it work?

When both rows and columns are given as slices, the syntax of dataframe.iloc[] is,

df.iloc[row_start:row_end , col_start:col_end]

Arguments:

  • row_start: The row position at which the selection starts. Default is 0.
  • row_end: The row position at which the selection ends; this position is excluded, so rows up to row_end-1 are selected. Default is the end of the dataframe, i.e. through the last row.
  • col_start: The column position at which the selection starts. Default is 0.
  • col_end: The column position at which the selection ends; this position is excluded, so columns up to col_end-1 are selected. Default is the end of the dataframe, i.e. through the last column.

It returns a portion of the dataframe that includes rows from row_start to row_end-1 and columns from col_start to col_end-1.
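As a small illustration of the slice form described above, the sketch below applies row and column bounds to a toy dataframe. The dataframe `demo` and its column labels are made up for this illustration and are not part of the article's data.

```python
import pandas as pd

# Hypothetical 4x4 dataframe, used only for illustration
demo = pd.DataFrame([[1, 2, 3, 4],
                     [5, 6, 7, 8],
                     [9, 10, 11, 12],
                     [13, 14, 15, 16]],
                    columns=['a', 'b', 'c', 'd'])

# Rows at positions 1 and 2, columns at positions 0 and 1
# (the end positions 3 and 2 are excluded)
block = demo.iloc[1:3, 0:2]
print(block)

# Omitted bounds fall back to the defaults: all rows, columns 0 to 2
first_three_columns = demo.iloc[:, :3]
print(first_three_columns)
```

Leaving out a bound on either side of a colon is what makes `df.iloc[:, :N]` read as "all rows, first N columns".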

To select the first N columns of the dataframe, select the columns from position 0 up to, but not including, N, i.e. (:N), and select all rows using the default values (:),

N = 5
# Select first N columns
first_n_columns  = df.iloc[: , :N]

We gave a column range from position 0 up to, but not including, N, so iloc returned the first N columns as a dataframe. Here is a complete example that selects the first N columns of a dataframe using iloc,

import pandas as pd

# List of Tuples
employees = [('Jack',  34, 11, 51, 33, 34, 77, 88) ,
             ('Riti',  31, 12, 71, 56, 55, 99, 11) ,
             ('Aadi',  16, 13, 11, 44, 55, 33, 54) ,
             ('Mark',  41, 14, 12, 78, 89, 46, 56)]

# Create a DataFrame object (columns get the default integer labels 0..7)
df = pd.DataFrame(employees)

print("Contents of the Dataframe : ")
print(df)


N = 5
# Select first N columns
first_n_columns  = df.iloc[: , :N]

print("First 5 Columns Of Dataframe : ")
print(first_n_columns)

print('Type:')
print(type(first_n_columns))

Output:

Contents of the Dataframe :
      0   1   2   3   4   5   6   7
0  Jack  34  11  51  33  34  77  88
1  Riti  31  12  71  56  55  99  11
2  Aadi  16  13  11  44  55  33  54
3  Mark  41  14  12  78  89  46  56
First 5 Columns Of Dataframe :
      0   1   2   3   4
0  Jack  34  11  51  33
1  Riti  31  12  71  56
2  Aadi  16  13  11  44
3  Mark  41  14  12  78
Type:
<class 'pandas.core.frame.DataFrame'>

We selected the first N columns of the dataframe.
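One case worth checking is N being larger than the number of columns. The sketch below, using a made-up three-column dataframe named `small`, suggests that the slice is simply clipped to the columns that exist, in the same way Python list slicing behaves.

```python
import pandas as pd

# Hypothetical dataframe with only 3 columns
small = pd.DataFrame([[1, 2, 3],
                      [4, 5, 6]])

N = 5
# The slice :N is clipped to the available columns, so no IndexError is raised
result = small.iloc[:, :N]
print(result.shape)

# N = 0 gives a dataframe with all rows but no columns
empty = small.iloc[:, :0]
print(empty.shape)
```

So a caller that needs exactly N columns would have to compare N against `len(df.columns)` itself.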

Select first N columns of pandas dataframe using []

We can fetch the column names of the dataframe as a sequence with df.columns and slice out the first N names. Passing those names to the subscript operator, i.e. [], then selects the first N columns of the dataframe. The example below reuses the dataframe df created earlier,

print("Contents of the Dataframe : ")
print(df)

N = 5
# Select first 5 columns
first_n_columns = df[df.columns[:N]]

print("First 5 Columns Of Dataframe : ")
print(first_n_columns)

print('Type:')
print(type(first_n_columns))

Output:

Contents of the Dataframe :
      0   1   2   3   4   5   6   7
0  Jack  34  11  51  33  34  77  88
1  Riti  31  12  71  56  55  99  11
2  Aadi  16  13  11  44  55  33  54
3  Mark  41  14  12  78  89  46  56
First 5 Columns Of Dataframe :
      0   1   2   3   4
0  Jack  34  11  51  33
1  Riti  31  12  71  56
2  Aadi  16  13  11  44
3  Mark  41  14  12  78
Type:
<class 'pandas.core.frame.DataFrame'>
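Because this approach selects by label rather than by position, it can behave differently from iloc when column labels repeat. The following sketch uses a hypothetical dataframe `dup` with a duplicated label 'a' to compare the two; it is an edge case, not something the example data above runs into.

```python
import pandas as pd

# Hypothetical dataframe whose first and third columns share the label 'a'
dup = pd.DataFrame([[1, 2, 3],
                    [4, 5, 6]],
                   columns=['a', 'b', 'a'])

N = 2
# Label-based: every column labelled 'a' or 'b' is returned
by_label = dup[dup.columns[:N]]
# Position-based: exactly the columns at positions 0 and 1
by_position = dup.iloc[:, :N]

print(by_label.shape[1])
print(by_position.shape[1])
```

With unique column labels the two selections agree, so the difference only matters when duplicates are possible.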

Use head() to select the first N columns of pandas dataframe

We can use the dataframe.T attribute to get the transpose of the dataframe, in which the original columns become rows. Calling head(N) on the transpose selects its first N rows, i.e. the first N columns of the original dataframe. Transposing the result again turns those rows back into columns. For example,

print("Contents of the Dataframe : ")
print(df)

N = 5

# Select first 5 columns
first_n_columns = df.T.head(N).T

print("First 5 Columns Of Dataframe : ")
print(first_n_columns)

print('Type:')
print(type(first_n_columns))

Output:

Contents of the Dataframe :
      0   1   2   3   4   5   6   7
0  Jack  34  11  51  33  34  77  88
1  Riti  31  12  71  56  55  99  11
2  Aadi  16  13  11  44  55  33  54
3  Mark  41  14  12  78  89  46  56
First 5 Columns Of Dataframe :
      0   1   2   3   4
0  Jack  34  11  51  33
1  Riti  31  12  71  56
2  Aadi  16  13  11  44
3  Mark  41  14  12  78
Type:
<class 'pandas.core.frame.DataFrame'>

It returned the first N columns of dataframe as a dataframe object.
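One side effect of the double transpose is worth checking when the columns hold different types. The sketch below, on a small made-up dataframe `mixed`, compares the column dtypes produced by the transpose approach with those produced by iloc. The values come out the same, but the transpose route tends to leave every column with the generic object dtype.

```python
import pandas as pd

# Hypothetical dataframe mixing a text column with integer columns
mixed = pd.DataFrame([('Jack', 34, 11),
                      ('Riti', 31, 12)])

N = 2
via_head = mixed.T.head(N).T      # transpose, take first N rows, transpose back
via_iloc = mixed.iloc[:, :N]      # position-based selection

# Transposing mixed-type columns forces a common dtype (object)
print(via_head.dtypes)
# iloc keeps each column's original dtype
print(via_iloc.dtypes)
```

If numeric dtypes matter for later computation, iloc or [] avoids the need to convert the columns back.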

Summary

We learned different ways to get the first N columns of a dataframe in pandas.
