Remove Duplicate Rows in MySQL

Comments Off on Remove Duplicate Rows in MySQL

Remove Duplicate Rows in MySQL

Remove duplicate rows in MySQL is a common query for many initial MySQL user. I  have been working on MySQL  for more than six years and in my initial starting I had to search over the internet for  this easy thing.  Though there are a lot of articles on the same topics over the internet but I hope the following procedure will help to understand  “Remove Duplicate Rows in MySQL” more easily

Step 1 : Create a simple table

create table test
(id int,
name varchar
);

Step 2: Insert some rows with duplicate id

insert into test(id,name) values(1,'name1');
insert into test(id,name) values(2,'name2');
insert into test(id,name) values(3,'name3');
insert into test(id,name) values(1,'name1');
insert into test(id,name) values(1,'name1');
insert into test(id,name) values(2,'name2');
mysql> select * from test;
+------+-------+
| id   | name  |
+------+-------+
|    1 | name1 |
|    2 | name2 |
|    3 | name3 |
|    1 | name1 |
|    1 | name1 |
|    2 | name2 |
+------+-------+
6 rows in set (0.02 sec)

Id 1 has inserted  3 times and id 2 inserted 2 times.

Step 3:  Alter  the column to  unique

Create the column unique that  is your base column to remove duplicate rows. At our case if you  want to remove duplicate id then  alter the column id to unique. As the column id has duplicate values the general alter column will not work. You have to insert “ignore” clause after alter clause

mysql> alter ignore table test add unique(id);
Query OK, 6 rows affected (0.13 sec)
Records: 6  Duplicates: 3  Warnings: 0

You see the above output shows “Records: 6 Duplicates: 3” which means there were six records and 3 duplicate records were removed. Check again duplicate records were actually removed

mysql> select * from test;
+------+-------+
| id   | name  |
+------+-------+
|    1 | name1 |
|    2 | name2 |
|    3 | name3 |
+------+-------+
3 rows in set (0.00 sec)

If you don’t need the unique constraint for column id you can revoke it to its original attribute. Check what is the unique constraint name by the following command

mysql> show index from test;
+-------+------------+----------+--------------+-------------+-----------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation 
+-------+------------+----------+--------------+-------------+-----------
| test  |          0 | id       |            1 | id          | A         
+-------+------------+----------+--------------+-------------+-----------
1 row in set (0.00 sec)

Here Key_name is the unique constraint name. Now drop the unique constraint by the following command

mysql> alter table test drop index id;

The procedure is same to remove duplicate rows based on multiple column value. Here I show how to remove duplicate rows based on multiple columns

create table test2
(id1 int,
id2 int
);
insert into test2(id1,id2) values(1,2);
insert into test2(id1,id2) values(2,2);
insert into test2(id1,id2) values(3,2);
insert into test2(id1,id2) values(2,2);
insert into test2(id1,id2) values(3,3);
insert into test2(id1,id2) values(6,2);
mysql> select * from test2;
+------+------+
| id1  | id2  |
+------+------+
|    1 |    2 |
|    2 |    2 |
|    3 |    2 |
|    2 |    2 |
|    3 |    3 |
|    6 |    2 |
+------+------+
6 rows in set (0.00 sec)

Here record 2 and record 4 are duplicate. So run unique command as follows

mysql> alter ignore table test2 add unique(id1,id2);
Query OK, 6 rows affected (0.05 sec)
Records: 6  Duplicates: 1  Warnings: 0
mysql> select * from test2;
+------+------+
| id1  | id2  |
+------+------+
|    1 |    2 |
|    2 |    2 |
|    3 |    2 |
|    3 |    3 |
|    6 |    2 |
+------+------+
5 rows in set (0.00 sec)

There are other procedures to remove duplicate rows in MySQL but I think the above procedure is the quickest one.