In this article, we will learn how to read a specific column from a CSV file in Python.
Table Of Contents
- What is csv file?
- Reading specific columns by name from CSV file using read_csv() and usecols attribute.
- Reading specific columns by index from CSV file using read_csv() and usecols attribute.
- Reading specific columns by index from CSV file using loadtxt() and usecols attribute.
- Reading specific columns by Name from CSV file using genfromtxt() and usecols attribute.
- Summary
What is csv file?
A CSV (comma-separated values) file is a text file that uses commas to separate values. CSV stores data in a tabular format: each line of the file is a row. There are several ways to read a specific column from a CSV file. Let's discuss them one by one, each with an approach and a working code example.
The following are the CSV file name and its contents which I will be using in the code.
File name: sample.csv
Contents of the file
Name,Age,Gender
A,10,M
B,14,F
C,20,M
D,17,F
E,18,F
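To follow along, the file can be created from Python itself. This is a minimal sketch, assuming the script runs in the directory where sample.csv should live; the variable names csv_text and csv_path are illustrative and not part of the original examples.

```python
from pathlib import Path

# Contents of sample.csv as listed above: one header line plus five rows.
csv_text = (
    "Name,Age,Gender\n"
    "A,10,M\n"
    "B,14,F\n"
    "C,20,M\n"
    "D,17,F\n"
    "E,18,F\n"
)

# Assumption: the file is written to the current working directory.
csv_path = Path("sample.csv")
csv_path.write_text(csv_text, encoding="utf-8")

# Print the file back to confirm what was written.
print(csv_path.read_text(encoding="utf-8"))
```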
Reading specific columns by name from CSV file using read_csv() and usecols attribute.
The pandas module has a read_csv() function that reads a CSV file into a DataFrame. It takes a file path as input and returns a DataFrame. To read only specific columns, pass their names as a list to the usecols parameter of read_csv().
Syntax of read_csv() function
pandas.read_csv(filepath, usecols)
- Parameters:
- filepath: Path of CSV file.
- usecols: List of names of columns to be read.
- Returns:
- A DataFrame.
Approach:
- Import pandas library.
- Pass the file path of the CSV to the read_csv() along with the list of column names.
- It returns a DataFrame with the specified columns.
Source Code
import pandas as pd

# Reading specific columns from the CSV (By Column Names)
df = pd.read_csv("sample.csv", usecols=['Name', 'Gender'])
print(df)
Output:
  Name Gender
0    A      M
1    B      F
2    C      M
3    D      F
4    E      F
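One detail worth knowing when selecting by name: usecols only chooses which columns are kept, and the resulting DataFrame follows the column order of the file, not the order of the list. The sketch below illustrates this, assuming an in-memory copy of sample.csv (via io.StringIO) so that it runs without the file on disk; the names wanted and df_ordered are illustrative.

```python
import io

import pandas as pd

# In-memory copy of sample.csv (an assumption, to keep the snippet self-contained).
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# Ask for Gender first, then Name.
wanted = ["Gender", "Name"]
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

# The columns come back in file order, regardless of the order in usecols.
print(list(df.columns))  # ['Name', 'Gender']

# Index with the same list to get the columns in the requested order.
df_ordered = df[wanted]
print(list(df_ordered.columns))  # ['Gender', 'Name']
```

If the order matters, indexing the result with the same list, as in the last step, is a simple way to enforce it.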
Reading specific columns by index from CSV file using read_csv() and usecols attribute.
The same read_csv() function can also select columns by position. To read only specific columns, pass their zero-based indexes as a list to the usecols parameter of read_csv().
Syntax of read_csv() function
pandas.read_csv(filepath, usecols)
- Parameters:
- filepath: Path of CSV file.
- usecols: List of indexes of columns to be read.
- Returns:
- A DataFrame.
Approach:
- Import pandas library.
- Pass the file path of the CSV to the read_csv() along with the list of column indexes.
- It returns a DataFrame with the specified columns.
Source Code
import pandas as pd

# Reading specific columns from the CSV (By Column Number)
df = pd.read_csv("sample.csv", usecols=[0, 1])
print(df)
Output:
  Name  Age
0    A   10
1    B   14
2    C   20
3    D   17
4    E   18
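When only one column is needed, read_csv() still returns a DataFrame with a single column. The following sketch shows one way to turn it into a Series and then a plain list. It assumes an in-memory copy of sample.csv, and the variable name ages is illustrative.

```python
import io

import pandas as pd

# In-memory copy of sample.csv (an assumption, to keep the snippet self-contained).
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# usecols=[1] keeps only the second column (Age); the result is a one-column DataFrame.
df = pd.read_csv(io.StringIO(csv_text), usecols=[1])
print(df.shape)  # (5, 1)

# squeeze("columns") converts the one-column DataFrame into a Series.
ages = df.squeeze("columns")
print(ages.tolist())  # [10, 14, 20, 17, 18]
```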
Reading specific columns by index from CSV file using loadtxt() and usecols attribute.
The NumPy module has a loadtxt() function that reads data from text files. To read specific columns of a CSV file, pass a comma as the delimiter and the indexes of the columns to be read to loadtxt().
Syntax of loadtxt() function
numpy.loadtxt(filepath, dtype, delimiter, usecols)
- Parameters:
- filepath: The path of the CSV file.
- dtype: Data type of resulting array.
- delimiter: The string used to separate values.
- usecols: sequence of indexes of columns to be read.
- Returns:
- An ndarray.
Approach:
- Import NumPy library.
- Pass the file path of the CSV to loadtxt() along with the delimiter and the sequence of column indexes.
- It returns an ndarray with the specified columns' data from the CSV.
Source Code
import numpy as np

# Reading specific columns from
# the CSV (By Column Numbers)
arr = np.loadtxt(
    'sample.csv',
    dtype=str,
    delimiter=',',
    usecols=(1, 2)
)
print(arr)
Output:
[['Age' 'Gender']
 ['10' 'M']
 ['14' 'F']
 ['20' 'M']
 ['17' 'F']
 ['18' 'F']]
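In the output above, the header row is part of the array because every value is read as a string and no lines are skipped. If a numeric column is wanted instead, one option is to skip the header with the skiprows parameter of loadtxt(). The sketch below assumes an in-memory copy of sample.csv; the variable name ages is illustrative.

```python
import io

import numpy as np

# In-memory copy of sample.csv (an assumption, to keep the snippet self-contained).
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

# skiprows=1 drops the header line, so the Age column can be parsed as integers.
ages = np.loadtxt(
    io.StringIO(csv_text),
    dtype=int,
    delimiter=",",
    skiprows=1,
    usecols=1,
)
print(ages)  # [10 14 20 17 18]
```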
Reading specific columns by Name from CSV file using genfromtxt() and usecols attribute.
The NumPy module has a genfromtxt() function that reads data from text files. To read specific columns of a CSV file by name, pass a comma as the delimiter, set names=True so that the header line supplies the column names, and pass the list of column names to genfromtxt().
Syntax of genfromtxt() function
numpy.genfromtxt(filepath, delimiter, usecols)
- Parameters:
- filepath: The path of the CSV file.
- delimiter: The string used to separate values.
- usecols: sequence of names of columns to be read.
- Returns:
- An ndarray.
Approach:
- Import NumPy library.
- Pass the file path of the CSV to genfromtxt() along with the delimiter and the list of column names.
- It returns an ndarray with the specified columns' data from the CSV.
Source Code
import numpy as np

# Reading specific columns from
# the CSV (By Column Names)
arr = np.genfromtxt(
    'sample.csv',
    delimiter=',',
    names=True,
    dtype=None,
    encoding=None,
    usecols=['Name', 'Age']
)
print(arr)
Output:
[('A', 10) ('B', 14) ('C', 20) ('D', 17) ('E', 18)]
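The result printed above is a structured array, where each element is a record and each column is a named field. As a rough sketch, an individual column can be pulled out by indexing with the field name. The example assumes an in-memory copy of sample.csv and passes encoding="utf-8" explicitly; the variable names names_col and ages_col are illustrative.

```python
import io

import numpy as np

# In-memory copy of sample.csv (an assumption, to keep the snippet self-contained).
csv_text = "Name,Age,Gender\nA,10,M\nB,14,F\nC,20,M\nD,17,F\nE,18,F\n"

arr = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,        # take field names from the header line
    dtype=None,        # let NumPy infer a type per column
    encoding="utf-8",
    usecols=["Name", "Age"],
)

# The field names of the structured array match the selected columns.
print(arr.dtype.names)  # ('Name', 'Age')

# Index by field name to get one column as a regular 1-D array.
names_col = arr["Name"]
ages_col = arr["Age"]
print(names_col.tolist())  # ['A', 'B', 'C', 'D', 'E']
print(ages_col.tolist())   # [10, 14, 20, 17, 18]
```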
Summary
Great, you made it! We have discussed several methods to read a specific column from a CSV file in Python. Happy learning.