Read a specific column from CSV file in Python

In this article, we will learn how to read a specific column from a CSV file in Python.

What is a CSV file?

A CSV (comma-separated values) file is a text file that uses commas to separate values, which lets data be stored in a tabular format. Each line of the file is a row. There are multiple ways to read a specific column from a CSV file. Let's go through them one by one, each with an approach and a working code example.

The following CSV file and its contents are used in all the code examples.

File name: sample.csv

Contents of the file

Name,Age,Gender
A,10,M
B,14,F
C,20,M
D,17,F
E,18,F
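
To follow along, the file can be created from Python first. The snippet below is a minimal sketch using the standard-library csv module; the variable names rows and lines are only for illustration, and the file is written to the current working directory.

```python
import csv

# Rows matching the sample contents shown above
rows = [
    ["Name", "Age", "Gender"],
    ["A", 10, "M"],
    ["B", 14, "F"],
    ["C", 20, "M"],
    ["D", 17, "F"],
    ["E", 18, "F"],
]

# newline="" lets the csv module control the line endings itself
with open("sample.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back to confirm what was written
with open("sample.csv", newline="") as f:
    lines = f.read().splitlines()

print(lines[0])
```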

Reading specific columns by name from a CSV file using read_csv() and the usecols parameter

The pandas module has a read_csv() function that reads a CSV file into a DataFrame. It takes a file path as input and returns a DataFrame. To read only specific columns of the CSV, pass the column names as a list to the usecols parameter of read_csv().

Syntax of read_csv() function

pandas.read_csv(filepath, usecols)
  • Parameters:
    • filepath: Path of CSV file.
    • usecols: List of names of columns to be read.
  • Returns:
    • A DataFrame.

Approach:

  1. Import pandas library.
  2. Pass the file path of the CSV to the read_csv() along with the list of column names.
  3. It returns a DataFrame with the specified columns.

Source Code

import pandas as pd

# Reading specific columns from the CSV (By Column Names)
df = pd.read_csv("sample.csv", usecols = ['Name','Gender'])

print(df)

Output:

  Name Gender
0    A      M
1    B      F
2    C      M
3    D      F
4    E      F
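
Besides a list of names, usecols also accepts a callable that is evaluated against each column name, which is handy when it is easier to say which columns to leave out. The sketch below assumes the same data as sample.csv but keeps it in an in-memory string (csv_text, a name chosen for this example) so that it runs on its own.

```python
import io

import pandas as pd

# Same contents as sample.csv, held in memory so the snippet is self-contained
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# A callable passed to usecols keeps the columns for which it returns True
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda col: col != "Age")
print(list(df.columns))

# Selecting a single column from the DataFrame gives a Series
names = pd.read_csv(io.StringIO(csv_text), usecols=["Name"])["Name"]
print(names.tolist())
```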

Reading specific columns by index from a CSV file using read_csv() and the usecols parameter

The usecols parameter of read_csv() also accepts column positions. To read only specific columns of the CSV, pass the zero-based indexes of the columns as a list to read_csv().

Syntax of read_csv() function

pandas.read_csv(filepath, usecols)
  • Parameters:
    • filepath: Path of CSV file.
    • usecols: List of indexes of columns to be read.
  • Returns:
    • A DataFrame.

Approach:

  1. Import pandas library.
  2. Pass the file path of the CSV to the read_csv() along with the list of column indexes.
  3. It returns a DataFrame with the specified columns.

Source Code

import pandas as pd

# Reading specific columns from the CSV (By Column Number)
df = pd.read_csv("sample.csv", usecols = [0,1])

print(df)

Output:

  Name  Age
0    A   10
1    B   14
2    C   20
3    D   17
4    E   18
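
One detail worth knowing: read_csv() ignores the order of the entries in usecols, so the resulting columns always follow the order in the file. If a different order is needed, the DataFrame can be reindexed after reading. A minimal sketch, again using an in-memory copy of the sample data (csv_text is a name chosen for this example):

```python
import io

import pandas as pd

# Same contents as sample.csv, held in memory so the snippet is self-contained
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# The order inside usecols is ignored; columns come back in file order
df = pd.read_csv(io.StringIO(csv_text), usecols=[2, 0])
print(list(df.columns))

# Reorder explicitly by selecting the columns in the desired order
reordered = df[["Gender", "Name"]]
print(list(reordered.columns))
```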

Reading specific columns by index from a CSV file using loadtxt() and the usecols parameter

The NumPy module has a loadtxt() function that reads text files into an array. To read specific columns of a CSV file, pass a comma (,) as the delimiter and the indexes of the columns to be read to loadtxt().

Syntax of loadtxt() function

numpy.loadtxt(filepath, dtype, delimiter, usecols)
  • Parameters:
    • filepath: The path of the CSV file.
    • dtype: Data type of resulting array.
    • delimiter: The string used to separate values.
    • usecols: sequence of indexes of columns to be read.
  • Returns:
    • An ndarray.

Approach:

  1. Import the NumPy library.
  2. Pass the file path of the CSV to loadtxt() along with the delimiter and the sequence of column indexes.
  3. It returns an ndarray with the data of the specified columns from the CSV.

Source Code

import numpy as np 

# Reading specific columns from
# the CSV (By Column Numbers)
arr = np.loadtxt(
        'sample.csv',
        dtype = str,
        delimiter = ',',
        usecols = (1,2) )

print(arr)

Output:

[['Age' 'Gender']
 ['10' 'M']
 ['14' 'F']
 ['20' 'M']
 ['17' 'F']
 ['18' 'F']]
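
Note that the header row appears in the output above because loadtxt() treats every line as data. When only the values are wanted, the skiprows parameter can skip the header, which also makes it possible to load a numeric column with a numeric dtype. A minimal sketch, assuming the same data as sample.csv held in an in-memory string (csv_text is a name chosen for this example):

```python
import io

import numpy as np

# Same contents as sample.csv, held in memory so the snippet is self-contained
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# Skip the header line and read only the Age column (index 1) as integers
ages = np.loadtxt(
    io.StringIO(csv_text),
    dtype=int,
    delimiter=",",
    usecols=1,
    skiprows=1,
)
print(ages)

# Two columns as strings, this time without the header row
data = np.loadtxt(
    io.StringIO(csv_text),
    dtype=str,
    delimiter=",",
    usecols=(1, 2),
    skiprows=1,
)
print(data.shape)
```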

Reading specific columns by name from a CSV file using genfromtxt() and the usecols parameter

The NumPy module has a genfromtxt() function that reads text files into an array. To read specific columns of a CSV file by name, pass a comma (,) as the delimiter, set names to True so that the first line is read as column names, and pass the list of column names to the usecols parameter of genfromtxt().

Syntax of genfromtxt() function

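numpy.genfromtxt(filepath, delimiter, names, dtype, usecols)
  • Parameters:
    • filepath: The path of the CSV file.
    • delimiter: The string used to separate values.
    • names: If True, the column names are read from the first line of the file.
    • dtype: Data type of the resulting array. If None, the type of each column is inferred.
    • usecols: sequence of names (or indexes) of columns to be read.
  • Returns:
    • An ndarray.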

Approach:

  1. Import the NumPy library.
  2. Pass the file path of the CSV to genfromtxt() along with the delimiter, names = True, and the list of column names.
  3. It returns an ndarray with the data of the specified columns from the CSV.

Source Code

import numpy as np 

# Reading specific columns from
# the CSV (By Column Names)
arr = np.genfromtxt(
            'sample.csv',
            delimiter = ',',
            names = True,
            dtype = None,
            encoding = None,
            usecols = ['Name','Age'])

print(arr)

Output:

[('A', 10)
 ('B', 14)
 ('C', 20)
 ('D', 17)
 ('E', 18)]
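
Because names = True and dtype = None are used, the result is a structured array, so each column can be pulled out by its field name. The sketch below illustrates this with an in-memory copy of the sample data (csv_text is a name chosen for this example); it passes encoding = 'utf-8' explicitly, which is an assumption that the file is UTF-8 text.

```python
import io

import numpy as np

# Same contents as sample.csv, held in memory so the snippet is self-contained
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

arr = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,        # take field names from the header line
    dtype=None,        # infer a type for each column
    encoding="utf-8",  # assumption: the data is UTF-8 text
    usecols=["Name", "Age"],
)

# The selected columns become named fields of the structured array
print(arr.dtype.names)

# Index by field name to get one column as a plain 1-D array
print(arr["Age"])
print(arr["Name"])
```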

Summary

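Great, you made it! We have discussed several ways to read a specific column from a CSV file in Python: by name or by index with pandas read_csv(), by index with NumPy loadtxt(), and by name with NumPy genfromtxt(). Happy learning.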
