In this article, we will learn how to remove columns that contain NaN values from a 2D NumPy array.
What is a NaN value?
NaN stands for Not a Number. It is a special floating-point value (not a data type of its own) that represents an undefined or unrepresentable result. NaN values are commonly used to mark missing data in a DataFrame or a NumPy array.
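A short sketch of how NaN behaves in practice may help here. The variable names are chosen only for this illustration.

```python
import numpy as np

value = np.nan

is_float = isinstance(value, float)   # np.nan is an ordinary Python float
equals_itself = (value == value)      # NaN never compares equal, even to itself
detected = bool(np.isnan(value))      # np.isnan() is the reliable way to test for NaN

# An array that holds NaN is stored with a floating-point dtype
dtype_name = np.array([1, np.nan]).dtype.name

print(is_float, equals_itself, detected, dtype_name)
```

Because NaN is a float, every array in the examples that follow prints its values as floats (for example 2. instead of 2).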
Given a 2D NumPy array, we need to remove the columns that contain NaN values.
Example:
Given array :
[[ 1.  2.  3.  4.  5.]
 [nan  4. nan  2.  1.]
 [nan  2.  4.  1.  5.]
 [ 3.  4.  3.  2.  1.]]

After removing columns with nan values :
[[2. 4. 5.]
 [4. 2. 1.]
 [2. 1. 5.]
 [4. 2. 1.]]
There are multiple ways to remove columns with NaN values from a NumPy array. Let's discuss them one by one, each with an explanation and a working code example.
Delete columns containing at least one NaN value using delete(), isnan() and any()
The delete() function is part of the NumPy library. It removes elements from an array. It takes an array and an index, an array of indices, or (since NumPy 1.19) a boolean mask, together with an axis, and returns a copy of the array without the selected elements.
Syntax of delete()
numpy.delete(arr, obj, axis=None)
 Parameters:
 arr = The array from which we need to delete elements.
 obj = Index, array of indices, or boolean mask of the columns to be deleted.
 axis = Axis along which elements need to be deleted. For columns, axis = 1. If axis is omitted, the array is flattened first.
 Returns:
 Returns a copy of the array with the columns removed.
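As a minimal sketch of the syntax above, the following deletes columns by integer index from a small array made up for this illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Drop a single column by its index
without_first = np.delete(arr, 0, axis=1)

# Drop several columns by passing a list of indices
without_outer = np.delete(arr, [0, 2], axis=1)

print(without_first)
print(without_outer)
print(arr.shape)  # the original array is unchanged
```

Since delete() returns a copy, the result has to be assigned to a variable to be kept.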
To delete the columns containing at least one NaN value, we combine the isnan() and any() functions. First, we pass the given 2D NumPy array to isnan(). It returns a 2D boolean array of the same shape, in which each True value indicates that the corresponding value in the original array is NaN.
Then we call any() with axis=0 on this boolean array. It returns a 1D boolean array whose length equals the number of columns in the original array. Each True value indicates that the corresponding column contains at least one NaN value. Finally, we pass this boolean array to delete() along with the given array. Every column whose entry in the boolean array is True is deleted.
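The two intermediate arrays described above can be inspected directly. This is only an illustrative sketch; the names nan_mask and column_has_nan are chosen here for readability.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [np.nan, 4, np.nan, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [3, 4, 3, 2, 1]])

# Element-wise test: same shape as arr, True where the value is NaN
nan_mask = np.isnan(arr)

# Reduce over the rows (axis=0): one boolean per column
column_has_nan = nan_mask.any(axis=0)

print(nan_mask.shape)
print(column_has_nan)
```

Columns 0 and 2 are flagged, which matches the positions of the NaN values in the input.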
Source Code
import numpy as np

# Create a 2D NumPy array
arr = np.array([[1, 2, 3, 4, 5],
                [np.nan, 4, np.nan, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [3, 4, 3, 2, 1]])

# Boolean mask that is True for every column containing any NaN value
index = np.isnan(arr).any(axis=0)

# Delete the columns with any NaN value from the 2D NumPy array
arr = np.delete(arr, index, axis=1)

print(arr)
Output:
[[2. 4. 5.]
 [4. 2. 1.]
 [2. 1. 5.]
 [4. 2. 1.]]
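One caveat: NumPy versions before 1.19 cast a boolean array passed to delete() to the integers 0 and 1 instead of treating it as a mask. A possible workaround, sketched here and not part of the original code, is to convert the mask to integer column indices with flatnonzero() first.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [np.nan, 4, np.nan, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [3, 4, 3, 2, 1]])

# Integer positions of the columns that contain at least one NaN
nan_columns = np.flatnonzero(np.isnan(arr).any(axis=0))

# delete() accepts integer indices on every NumPy version
result = np.delete(arr, nan_columns, axis=1)

print(nan_columns)
print(result)
```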
Delete columns containing all NaN values using delete(), isnan() and all()
This is very similar to the approach above, except that we use the all() method instead of any().
To delete the columns in which every value is NaN, we combine the isnan() and all() functions. First, we pass the given 2D NumPy array to isnan(). It returns a 2D boolean array of the same shape, in which each True value indicates that the corresponding value in the original array is NaN.
Then we call all() with axis=0 on this boolean array. It returns a 1D boolean array with one element per column of the original array. Each True value indicates that the corresponding column consists entirely of NaN values. Finally, we pass this boolean array to delete() along with the given array. Every column whose entry in the boolean array is True is deleted.
Source Code
import numpy as np

# Create a 2D NumPy array
arr = np.array([[np.nan, 2, 3, 4, 5],
                [np.nan, 4, 3, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [np.nan, 4, 3, 2, 1]])

# Boolean mask that is True for every column containing only NaN values
index = np.isnan(arr).all(axis=0)

# Delete the columns with all NaN values from the 2D NumPy array
arr = np.delete(arr, index, axis=1)

print(arr)
Output:
[[2. 3. 4. 5.]
 [4. 3. 2. 1.]
 [2. 4. 1. 5.]
 [4. 3. 2. 1.]]
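The difference between any() and all() is easiest to see on an array with one fully NaN column and one partially NaN column. The small array below is made up for this comparison.

```python
import numpy as np

arr = np.array([[np.nan, 2, 3],
                [np.nan, 4, np.nan],
                [np.nan, 6, 7]])

nan_mask = np.isnan(arr)

# True where a column has at least one NaN (columns 0 and 2)
any_nan = nan_mask.any(axis=0)

# True only where a column is entirely NaN (column 0)
all_nan = nan_mask.all(axis=0)

print(any_nan)
print(all_nan)
```

With any(), the partially filled column 2 would be dropped as well; with all(), it is kept.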
Using boolean index to delete columns with any NaN value
This approach is very similar to the delete() approaches above. Instead of calling delete(), we index the array with a boolean array. Columns of a NumPy array can be selected by passing a boolean array as the column index.
Example
Given array :
[[1 2 3 4 5]
 [5 4 3 2 1]
 [1 2 4 1 5]
 [3 4 3 2 1]]

boolArray = [False, True, False, True, True]

arr[:, boolArray] will be:
[[2 4 5]
 [4 2 1]
 [2 1 5]
 [4 2 1]]
This selects all the columns for which the boolean index is True.
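One detail worth knowing is that boolean indexing returns a copy rather than a view. The sketch below uses the first two rows of the example array to show that changing the selection leaves the original untouched.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [5, 4, 3, 2, 1]])
bool_array = np.array([False, True, False, True, True])

# Keep only the columns marked True (columns 1, 3 and 4)
selected = arr[:, bool_array]

# Modify the selection; the original array is not affected
selected[0, 0] = 99

print(selected)
print(arr[0, 1])
```

The boolean array must have exactly one entry per column, otherwise NumPy raises an IndexError.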
Steps to remove columns with any NaN value:
 Import the numpy library and create a NumPy array.
 Create a boolean array using isnan() and any(), then negate it. A True value indicates that the corresponding column has no NaN value.
 Pass the boolean array as the column index to the array.
 This returns a new array without the columns that contain NaN values.
 Print the array.
Source Code
import numpy as np

# Create a 2D NumPy array
arr = np.array([[1, 2, 3, 4, 5],
                [np.nan, 4, np.nan, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [3, 4, 3, 2, 1]])

# Boolean mask that is True for every column with no NaN value
booleanIndex = ~np.isnan(arr).any(axis=0)

# Select the columns which have no NaN value
arr = arr[:, booleanIndex]

print(arr)
Output:
[[2. 4. 5.]
 [4. 2. 1.]
 [2. 1. 5.]
 [4. 2. 1.]]
Using boolean index to delete columns with all NaN values
This is very similar to the previous approach, except that we use the all() method instead of any(). As before, columns of a NumPy array can be selected by passing a boolean array as the column index.
Example:
Given array :
[[1 2 3 4 5]
 [5 4 3 2 1]
 [1 2 4 1 5]
 [3 4 3 2 1]]

boolArray = [False, True, False, True, True]

arr[:, boolArray] :
[[2 4 5]
 [4 2 1]
 [2 1 5]
 [4 2 1]]
This selects all the columns for which the boolean index is True.
Steps to remove columns with all NaN values:
 Import the numpy library and create a NumPy array.
 Create a boolean array using isnan() and all(), then negate it. A False value indicates that the corresponding column consists entirely of NaN values.
 Pass the boolean array as the column index to the array.
 This returns a new array without the columns that contain only NaN values.
 Print the array.
Source Code
import numpy as np

# Create a 2D NumPy array
arr = np.array([[np.nan, 2, 3, 4, 5],
                [np.nan, 4, np.nan, 2, 1],
                [np.nan, 2, 4, 1, 5],
                [np.nan, 4, 3, 2, 1]])

# Boolean mask that is True for every column that is not entirely NaN
booleanIndex = ~np.isnan(arr).all(axis=0)

# Select the columns that are not entirely NaN
arr = arr[:, booleanIndex]

print(arr)
Output:
[[ 2.  3.  4.  5.]
 [ 4. nan  2.  1.]
 [ 2.  4.  1.  5.]
 [ 4.  3.  2.  1.]]
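The any() and all() variants differ only in how the mask is reduced, so they can be wrapped in one small helper. The function drop_nan_columns and its how parameter are hypothetical names invented for this sketch; they are not part of NumPy.

```python
import numpy as np

def drop_nan_columns(arr, how="any"):
    """Return a copy of a 2D array without its NaN columns (hypothetical helper).

    how="any" drops columns with at least one NaN value.
    how="all" drops only columns that are entirely NaN.
    """
    arr = np.asarray(arr, dtype=float)
    if arr.ndim != 2:
        raise ValueError("expected a 2D array")

    nan_mask = np.isnan(arr)
    if how == "any":
        drop = nan_mask.any(axis=0)
    elif how == "all":
        drop = nan_mask.all(axis=0)
    else:
        raise ValueError("how must be 'any' or 'all'")

    # Keep the columns that are not marked for removal
    return arr[:, ~drop]

data = np.array([[np.nan, 2, 3, 4, 5],
                 [np.nan, 4, np.nan, 2, 1],
                 [np.nan, 2, 4, 1, 5],
                 [np.nan, 4, 3, 2, 1]])

print(drop_nan_columns(data, how="any").shape)
print(drop_nan_columns(data, how="all").shape)
```

With how="any", columns 0 and 2 are removed and three columns remain; with how="all", only column 0 is removed and four columns remain.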
Summary
Great, you made it! We have discussed several ways to remove columns with NaN values from a NumPy array. Happy learning!