In this article we will discuss how to find unique elements in a single column, in multiple columns, or in each column of a dataframe.

Series.unique()

It returns a NumPy array of the unique elements in the series object, in order of first appearance.

Series.nunique()

It returns the count of unique elements in the series object.

DataFrame.nunique(self, axis=0, dropna=True)

It returns the count of unique elements along the given axis.

  • If axis = 0 : It returns a series object containing the count of unique elements in each column.
  • If axis = 1 : It returns a series object containing the count of unique elements in each row.
  • Default value of axis is 0.
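As a quick illustration of the three functions described above, here is a minimal sketch on a tiny frame. The name `toy` and its values are made up for this example and are not part of the article's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the three calls.
toy = pd.DataFrame({"x": [1, 1, 2], "y": ["a", "b", np.nan]})

unique_x = toy["x"].unique()      # NumPy array of distinct values in column 'x'
count_x = toy["x"].nunique()      # number of distinct values in column 'x'
per_column = toy.nunique(axis=0)  # one count per column (NaN ignored by default)
per_row = toy.nunique(axis=1)     # one count per row (NaN ignored by default)

print(unique_x, count_x)
print(per_column)
print(per_row)
```

With axis=0 the resulting series is indexed by column name, while with axis=1 it is indexed by row label.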

Now let’s use these functions to get information about the unique elements in a dataframe.

First of all, create a dataframe,
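The original listing is not reproduced here, so the following is a stand-in sketch. The records are assumed values, chosen so that columns ‘Age’ and ‘City’ each hold four distinct values plus one NaN, which matches the counts discussed later in the article. Only the name empDfObj comes from the text.

```python
import numpy as np
import pandas as pd

# Assumed employee records (illustrative, not the article's original data).
employees = [
    ("jack", 34, "Sydney", 5),
    ("Riti", 31, "Delhi", 7),
    ("Aadi", 16, np.nan, 11),
    ("Mohit", 31, "Delhi", 7),
    ("Veena", np.nan, "Delhi", 4),
    ("Shaunak", 35, "Mumbai", 5),
    ("Shaun", 35, "Colombo", 11),
]

# Build the dataframe with named columns and letter row labels.
empDfObj = pd.DataFrame(
    employees,
    columns=["Name", "Age", "City", "Experience"],
    index=list("abcdefg"),
)
print(empDfObj)
```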

Contents of this dataframe are,

Now let’s see how to find the unique values in single or multiple columns of this dataframe.

Find unique values in a single column

To fetch the unique values in column ‘Age’ of the dataframe created above, we call the unique() function on that column, i.e.
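A minimal sketch of that call, assuming a cut-down empDfObj that holds only an ‘Age’ column with illustrative values:

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the article's dataframe; only 'Age' is needed here.
empDfObj = pd.DataFrame({"Age": [34, 31, 16, 31, np.nan, 35, 35]})

# Select the column as a Series, then ask for its distinct values.
unique_ages = empDfObj["Age"].unique()
print(unique_ages)  # [34. 31. 16. nan 35.]
```

Note that unique() keeps NaN as one of the returned values.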

Output:

empDfObj['Age'] returns a series object representing column ‘Age’ of the dataframe. Calling unique() on that series object returns its unique elements, i.e. the unique elements in column ‘Age’ of the dataframe.

Count unique values in a single column

Suppose that, instead of the unique values themselves, we are interested in the count of unique elements in a column. Then we can use the Series.nunique() function, i.e.
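Counting the distinct ages can be sketched with Series.nunique(), again assuming an illustrative ‘Age’ column rather than the article's original data:

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the article's dataframe; only 'Age' is needed here.
empDfObj = pd.DataFrame({"Age": [34, 31, 16, 31, np.nan, 35, 35]})

# nunique() counts distinct values and skips NaN by default.
age_count = empDfObj["Age"].nunique()
print(age_count)  # 4
```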

Output:

It returns the count of unique elements in column ‘Age’ of the dataframe.

Include NaN while counting the unique elements in a column

By default, nunique() does not include NaN when counting unique elements. To include NaN as well, pass dropna=False, i.e.
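The effect of the dropna argument can be sketched as follows, with the same assumed ‘Age’ values as a stand-in for the article's column:

```python
import numpy as np
import pandas as pd

# Assumed stand-in for the article's dataframe; only 'Age' is needed here.
empDfObj = pd.DataFrame({"Age": [34, 31, 16, 31, np.nan, 35, 35]})

# dropna=False makes NaN count as one additional distinct value.
age_count_with_nan = empDfObj["Age"].nunique(dropna=False)
print(age_count_with_nan)  # 5
```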

Output:

It returns the count of unique elements in column ‘Age’ of the dataframe including NaN.

Count unique values in each column of the dataframe

In DataFrame.nunique() the default value of axis is 0, so by default it returns the count of unique elements in each column, i.e.
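A sketch of the per-column count, assuming illustrative employee data in place of the article's original listing:

```python
import numpy as np
import pandas as pd

# Assumed stand-in data, not the article's original records.
empDfObj = pd.DataFrame({
    "Name": ["jack", "Riti", "Aadi", "Mohit", "Veena", "Shaunak", "Shaun"],
    "Age": [34, 31, 16, 31, np.nan, 35, 35],
    "City": ["Sydney", "Delhi", np.nan, "Delhi", "Delhi", "Mumbai", "Colombo"],
    "Experience": [5, 7, 11, 7, 4, 5, 11],
})

# axis=0 is the default, so this yields one count per column (NaN excluded).
unique_counts = empDfObj.nunique()
print(unique_counts)  # Name 7, Age 4, City 4, Experience 4
```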

Output:

It didn’t include NaN while counting because the default value of the dropna argument is True. To include NaN, pass dropna as False, i.e.
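Including NaN in the per-column counts might look like this, again on assumed stand-in data:

```python
import numpy as np
import pandas as pd

# Assumed stand-in data, not the article's original records.
empDfObj = pd.DataFrame({
    "Name": ["jack", "Riti", "Aadi", "Mohit", "Veena", "Shaunak", "Shaun"],
    "Age": [34, 31, 16, 31, np.nan, 35, 35],
    "City": ["Sydney", "Delhi", np.nan, "Delhi", "Delhi", "Mumbai", "Colombo"],
    "Experience": [5, 7, 11, 7, 4, 5, 11],
})

# dropna=False counts NaN as a distinct value in every column that contains it.
unique_counts_with_nan = empDfObj.nunique(dropna=False)
print(unique_counts_with_nan)  # Name 7, Age 5, City 5, Experience 4
```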

Output:

It returns the count of unique elements in each column including NaN. Columns ‘Age’ and ‘City’ contain NaN, so their counts of unique elements increased from 4 to 5.

Get unique values in multiple columns

To get the unique values in multiple columns of a dataframe, we can combine the contents of those columns into a single series object and then call the unique() function on that series object, i.e.
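One way to sketch this is with pd.concat(), which stacks the selected columns into a single series. The choice of columns ‘Name’ and ‘Age’ and the data values are assumptions for illustration. pd.concat() is used here because Series.append() was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

# Assumed stand-in data, not the article's original records.
empDfObj = pd.DataFrame({
    "Name": ["jack", "Riti", "Aadi", "Mohit", "Veena", "Shaunak", "Shaun"],
    "Age": [34, 31, 16, 31, np.nan, 35, 35],
})

# Stack the two columns end to end into one Series, then de-duplicate.
combined = pd.concat([empDfObj["Name"], empDfObj["Age"]], ignore_index=True)
unique_values = combined.unique()
print(unique_values)
```

Because the two columns hold different types, the combined series and the resulting array have object dtype.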

Output:

It returns the unique elements found across the selected columns.

Complete example is as follows,

Output
