Read a specific column from CSV file in Python

In this article, we will learn how to read specific columns from a CSV file in Python.

What is a CSV file?

A CSV (comma-separated values) file is a text file that uses commas to separate values, which lets data be stored in tabular form. Each line of the file is a row, and the first line often holds the column names. There are several ways to read specific columns from a CSV file. Let's go through them one by one, each with its approach and a working code example.

The following CSV file and its contents are used in all the code examples.

File name: sample.csv

Contents of the file

Name,Age,Gender
A,10,M
B,14,F
C,20,M
D,17,F
E,18,F
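
To follow along, the sample file can be created from Python. The sketch below is one way to do it, assuming the script is allowed to write sample.csv into the current working directory. The names SAMPLE and csv_path are illustrative only.

```python
from pathlib import Path

# Contents of sample.csv exactly as listed above.
SAMPLE = """Name,Age,Gender
A,10,M
B,14,F
C,20,M
D,17,F
E,18,F
"""

# Write the file into the current working directory so that the
# later examples can open it by its relative name.
csv_path = Path("sample.csv")
csv_path.write_text(SAMPLE, encoding="utf-8")

# Read it back to confirm what was written.
print(csv_path.read_text(encoding="utf-8"))
```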

Reading specific columns by name from a CSV file using read_csv() and the usecols parameter

The pandas module has a read_csv() function that reads a CSV file into a DataFrame. It takes a file path as input and returns a DataFrame. To read only specific columns, pass their names as a list to the usecols parameter of read_csv().

Syntax of read_csv() function

pandas.read_csv(filepath, usecols)
  • Parameters:
    • filepath: Path of CSV file.
    • usecols: List of names of columns to be read.
  • Returns:
    • A DataFrame.

Approach:

  1. Import the pandas library.
  2. Pass the file path of the CSV to read_csv() along with the list of column names in usecols.
  3. read_csv() returns a DataFrame containing only the specified columns.

Source Code

import pandas as pd

# Reading specific columns from the CSV (By Column Names)
df = pd.read_csv("sample.csv", usecols = ['Name','Gender'])

print(df)

Output:

  Name Gender
0    A      M
1    B      F
2    C      M
3    D      F
4    E      F
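
As a small variation on the example above, usecols also accepts a callable that is evaluated against each column name, and a single column can be pulled out of the resulting DataFrame as a Series. This is a minimal sketch that reads the same data from an in-memory string (csv_text is an illustrative name) so it runs without the file on disk.

```python
import io

import pandas as pd

# Same contents as sample.csv, kept in memory for a self-contained example.
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# A callable in usecols keeps every column for which it returns True.
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name != "Age")
print(df.columns.tolist())

# Reading one column and indexing by its name gives a pandas Series.
ages = pd.read_csv(io.StringIO(csv_text), usecols=["Age"])["Age"]
print(ages.tolist())
```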

Reading specific columns by index from a CSV file using read_csv() and the usecols parameter

The usecols parameter of read_csv() also accepts column positions. To read only specific columns, pass their zero-based indexes as a list, where 0 is the first column of the file.

Syntax of read_csv() function

pandas.read_csv(filepath, usecols)
  • Parameters:
    • filepath: Path of CSV file.
    • usecols: List of indexes of columns to be read.
  • Returns:
    • A DataFrame.

Approach:

  1. Import the pandas library.
  2. Pass the file path of the CSV to read_csv() along with the list of column indexes in usecols.
  3. read_csv() returns a DataFrame containing only the specified columns.

Source Code

import pandas as pd

# Reading specific columns from the CSV (By Column Number)
df = pd.read_csv("sample.csv", usecols = [0,1])

print(df)

Output:

  Name  Age
0    A   10
1    B   14
2    C   20
3    D   17
4    E   18
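
One detail worth knowing is that read_csv() ignores the order of the entries in usecols, so the columns come back in file order. The sketch below, which again reads the sample data from an in-memory string (csv_text is an illustrative name), shows this and reorders the columns afterwards by indexing the DataFrame.

```python
import io

import pandas as pd

# Same contents as sample.csv, kept in memory for a self-contained example.
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# The indexes are given as [1, 0], but the columns keep their file order.
df = pd.read_csv(io.StringIO(csv_text), usecols=[1, 0])
print(df.columns.tolist())

# To get a different order, select the columns explicitly after reading.
reordered = df[["Age", "Name"]]
print(reordered.columns.tolist())
```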

Reading specific columns by index from a CSV file using loadtxt() and the usecols parameter

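The NumPy module has a loadtxt() function that reads data from text files. To read specific columns of a CSV file, pass a comma as the delimiter and the indexes of the columns to be read in usecols. loadtxt() does not treat the first line as a header, so with dtype = str the header row is returned as the first row of the array.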

Syntax of loadtxt() function

numpy.loadtxt(filepath, dtype, delimiter, usecols)
  • Parameters:
    • filepath: The path of the CSV file.
    • dtype: Data type of resulting array.
    • delimiter: The string used to separate values.
    • usecols: sequence of indexes of columns to be read.
  • Returns:
    • An ndarray.

Approach:

  1. Import the NumPy library.
  2. Pass the file path of the CSV to loadtxt() along with the delimiter and the sequence of column indexes.
  3. loadtxt() returns an ndarray with the data of the specified columns.

Source Code

import numpy as np 

# Reading specific columns from
# the CSV (By Column Numbers)
arr = np.loadtxt(
        'sample.csv',
        dtype = str,
        delimiter = ',',
        usecols = (1,2) )

print(arr)

Output:

[['Age' 'Gender']
 ['10' 'M']
 ['14' 'F']
 ['20' 'M']
 ['17' 'F']
 ['18' 'F']]
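
If the header row is not wanted in the result, loadtxt() can skip it with its skiprows parameter, which also makes it possible to load a numeric column with a numeric dtype. This is a minimal sketch that reads the sample data from an in-memory string (csv_text is an illustrative name).

```python
import io

import numpy as np

# Same contents as sample.csv, kept in memory for a self-contained example.
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# skiprows=1 drops the header line, so the Age column can be parsed as int.
ages = np.loadtxt(
    io.StringIO(csv_text),
    dtype=int,
    delimiter=",",
    skiprows=1,
    usecols=(1,),
)

print(ages)
```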

Reading specific columns by name from a CSV file using genfromtxt() and the usecols parameter

The NumPy module has a genfromtxt() function that reads data from text files. To read specific columns of a CSV file by name, pass a comma as the delimiter, set names = True so that the column names are taken from the first line, and pass the list of column names in usecols to genfromtxt().

Syntax of genfromtxt() function

numpy.genfromtxt(filepath, delimiter, names, usecols)
  • Parameters:
    • filepath: The path of the CSV file.
    • delimiter: The string used to separate values.
    • names: If True, the column names are read from the first line.
    • usecols: sequence of names of columns to be read.
  • Returns:
    • An ndarray.

Approach:

  1. Import the NumPy library.
  2. Pass the file path of the CSV to genfromtxt() along with the delimiter, names = True, and the list of column names.
  3. genfromtxt() returns an ndarray with the data of the specified columns.

Source Code

import numpy as np 

# Reading specific columns from
# the CSV (By Column Names)
arr = np.genfromtxt(
            'sample.csv',
            delimiter = ',',
            names = True,
            dtype = None,
            encoding = None,
            usecols = ['Name','Age'])

print(arr)

Output:

[('A', 10)
 ('B', 14)
 ('C', 20)
 ('D', 17)
 ('E', 18)]
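
With names = True, the array returned by genfromtxt() is a structured array, so each column can be accessed by its field name. The sketch below illustrates this on the sample data read from an in-memory string (csv_text is an illustrative name).

```python
import io

import numpy as np

# Same contents as sample.csv, kept in memory for a self-contained example.
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

arr = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,      # take the field names from the first line
    dtype=None,      # let NumPy infer a type for each column
    encoding=None,
    usecols=["Name", "Age"],
)

# The field names of the structured array match the selected columns.
print(arr.dtype.names)

# Each field can be pulled out as its own one-dimensional array.
names = arr["Name"]
ages = arr["Age"]
print(names)
print(ages)
```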

Summary

Great, you made it! We have discussed several ways to read specific columns from a CSV file in Python, using pandas read_csv() and NumPy loadtxt() and genfromtxt(). Happy learning.
