Find duplicate rows in MySQL

This article will look into how to find duplicate rows in MySQL based on one column and multiple columns.

Let us start by making the data that we will be using in our examples. We are creating a table salesTeam_Address and inserting some rows, including duplicate values in one or more columns.

#CREATE THE TABLE
CREATE TABLE salesTeam_Address (
    sales_person_id INT AUTO_INCREMENT,
    sales_person_name VARCHAR(255),
    sales_person_email VARCHAR(255),
    sales_person_postalAddress  VARCHAR(255),	
    PRIMARY KEY (sales_person_id)
);
# INSERT THE DATA
INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Aditi","[email protected]","23 Peter Road Mikson");
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Furan T","[email protected]","29 Peter Road West"); 
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Veronica Hedge","[email protected]","33 Highway Steet");
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Atharv","[email protected]","206 Prateek Wisteria"); 
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Erick","[email protected]","45 Parking West London"); 
  INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Rasmus","[email protected]","67 FollowMart Arhus"); 
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Aditi Sharma","[email protected]","899 Fellow Road London");
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Furan T","[email protected]","29 Peter Road West"); 
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Veronica Longman","[email protected]","89 Longman Road DK");
  INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Simon Rappid","[email protected]","78 Parking South"); 
 INSERT INTO salesTeam_Address (sales_person_name,sales_person_email,sales_person_postalAddress) 
 VALUES("Simon Rappid","[email protected]","78 Parking South");

If we run the query select * on salesTeam_Address, we will see duplicate values in columns sales_person_email, sales_person_postalAddress. 

Output:-

figure 1.1

As we can see, there are rows with the same values in the column sales_person_email, but a different value in sales_person_postalAddress (marked in red – figure 1.1) also shows some rows with the same values for both sales_person_email and sales_person_postalAddress (marked in yellow – figure 1.1).

MySQL find duplicate rows based on one column : Using GROUP BY

Let us get the duplicate rows from table sales_person_postalAddress based on the sales_person_email column. Observe the below query and output.

SELECT COUNT(*) occurance, sales_person_email FROM salesTeam_Address GROUP BY sales_person_email HAVING occurance > 1;

Output:-

figure 1.2

Explanation: The output shows the count of emails, which are present more than once. For example, [email protected]” is there two times in the table for column sales_person_email. Here we are selecting the sales_person_email and count of duplicate values using the GROUP BY clause.

MySQL find duplicate rows based on one column : Using JOINS

We can also find rows with duplicate values in one column using JOINS. Observe the below query and its output.

SELECT s1.sales_person_id, s1.sales_person_name, s1.sales_person_email , s1.sales_person_postalAddress
FROM salesTeam_Address AS s1, salesTeam_Address AS s2
WHERE s1.sales_person_email = s2.sales_person_email
AND s1.sales_person_id != s2.sales_person_id
ORDER BY sales_person_id;

Output :-

figure 1.3

Explanation: Here, we get only the rows with duplicate values in column sales_person_email.

  • We are using table aliases s1 and s2 for the same table salesTeam_Address
  • Then doing a JOIN on column sales_person_email.
  • Finally, ensuring the same record doesn’t show up multiple times in our result set (avoiding duplicate of duplicates).

Find duplicate rows in MySQL with multiple columns : Using GROUP BY

The GROUP BY clause can be used to get the duplicate rows based on multiple columns. In the below query, we will be getting the duplicate rows based on both sales_person_email and sales_person_postalAddress. 

SELECT 
    sales_person_email,  COUNT(sales_person_email),
    sales_person_postalAddress,  COUNT(sales_person_postalAddress)
FROM
    salesTeam_Address
GROUP BY 
    sales_person_email , 
    sales_person_postalAddress 
HAVING COUNT(sales_person_email) > 1
AND 
COUNT(sales_person_postalAddress) > 1; 

Output :-

figure 1.4

Explanation:- The output displays only the rows which have duplicates present for both the columns sales_person_email and sales_person_postalAddress. In the query, we are selecting the sales_person_email, sales_person_postalAddress, and the count of duplicate values in both the columns using the GROUP BY clause.

Find duplicate rows in MySQL with multiple columns : Using JOINS

We can also find rows with duplicate values in multiple columns using JOINS. Observe the below query and its output.

SELECT 
    s1.sales_person_id,
    s1.sales_person_name,
    s1.sales_person_email,
    s1.sales_person_postalAddress
FROM
    salesTeam_Address AS s1,
    salesTeam_Address AS s2
WHERE
    s1.sales_person_email = s2.sales_person_email
    AND s1.sales_person_postalAddress = s2.sales_person_postalAddress
    AND s1.sales_person_id != s2.sales_person_id;

Output:-

figure 1.5

Explanation: Here, we are getting the rows with duplicate values in both the columns sales_person_emailsales_person_postalAddress.

  • We are using table aliases s1 and s2 for the same table salesTeam_Address.
  • Then doing a JOIN on column sales_person_email and sales_person_postalAddress.
  • Finally, ensuring the same record doesn’t show up multiple times in our result set (avoiding duplicate of duplicates).
We hope this article helped you to get rows with duplicate values in one or multiple columns. Good Luck !!.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top