Select Rows where Two Columns are equal in Pandas

This tutorial will discuss about different ways to select DataFrame rows where two columns are equal in pandas.

Table Of Contents

Preparing DataSet

Let’s create a DataFrame with some hardcoded data.

import pandas as pd

data = {'Col_A': [33, 12, 33, 14, 35, 36, 17],
        'Col_B': [21, 22, 23, 24, 25, 26, 27],
        'Col_C': [33, 32, 33, 35, 35, 36, 35]}

index = ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7']

# Create a DataFrame from a dictionary
df = pd.DataFrame.from_dict(data)

# Set the index list as Index of DataFrame
df.set_index(pd.Index(index), inplace=True)

print (df)

Output

    Col_A  Col_B  Col_C
X1     33     21     33
X2     12     22     32
X3     33     23     33
X4     14     24     35
X5     35     25     35
X6     36     26     36
X7     17     27     35

We will now select rows from this DataFrame where values in two specified columns are equal.

Method 1: Using DataFrame.query()

We can use the query() method of DataFrame to select only those rows from DataFrame where values in two specific columns are equal. For example, we want to select only those rows from the DataFrame where column Col_A and column Col_C has equal values.

For that, we will apply a condition on both the columns, and pass that to the query() method. It will return a DataFrame containing only those rows where values in column Col_A and column Col_C are equal.

# Select rows where column Col_A and 'Col_C'
# has equal values
subDf = df.query('Col_A == Col_C')

print (subDf)

Output

    Col_A  Col_B  Col_C
X1     33     21     33
X3     33     23     33
X5     35     25     35
X6     36     26     36

Method 2: Using DataFrame.loc[] Attribute

You can also use the loc[] attribute of DataFrame, to select rows from a DataFrame where two given columns has equal values.

Like in the below example, we are going to select rows that has equal values in Columns ‘Col_A’ and ‘Col_C’. For that, we will apply a condition on both the columns, and it will return as a boolean Series, where each True value represent that that corresponding row has equal values in both the columns i.e. ‘Col_A’ and ‘Col_C’.

Then we will pass that boolean series in the loc[] attribute and it will return a DataFrame containing the rows where both the given columns has equal values in original DataFrame.

Like in the below example, we are going to select only those rows from DataFrame where column Col_A and column Col_C has equal values.

# Select rows where column 'Col_A' and 'Col_C'
# has equal values
subDf = df.loc[df['Col_A'] == df['Col_C']]

print (subDf)

Output

    Col_A  Col_B  Col_C
X1     33     21     33
X3     33     23     33
X5     35     25     35
X6     36     26     36

Summary

We learned to select DataFrame rows where tow columns has unequal values. Thanks.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top