Remove Duplicate Rows in MySQL
Remove duplicate rows in MySQL is a common query for many initial MySQL user. I have been working on MySQL for more than six years and in my initial starting I had to search over the internet for this easy thing. Though there are a lot of articles on the same topics over the internet but I hope the following procedure will help to understand “Remove Duplicate Rows in MySQL” more easily
Step 1 : Create a simple table
create table test (id int, name varchar );
Step 2: Insert some rows with duplicate id
insert into test(id,name) values(1,'name1'); insert into test(id,name) values(2,'name2'); insert into test(id,name) values(3,'name3'); insert into test(id,name) values(1,'name1'); insert into test(id,name) values(1,'name1'); insert into test(id,name) values(2,'name2');
mysql> select * from test; +------+-------+ | id | name | +------+-------+ | 1 | name1 | | 2 | name2 | | 3 | name3 | | 1 | name1 | | 1 | name1 | | 2 | name2 | +------+-------+ 6 rows in set (0.02 sec)
Id 1 has inserted 3 times and id 2 inserted 2 times.
Step 3: Alter the column to unique
Create the column unique that is your base column to remove duplicate rows. At our case if you want to remove duplicate id then alter the column id to unique. As the column id has duplicate values the general alter column will not work. You have to insert “ignore” clause after alter clause
mysql> alter ignore table test add unique(id); Query OK, 6 rows affected (0.13 sec) Records: 6 Duplicates: 3 Warnings: 0
You see the above output shows “Records: 6 Duplicates: 3” which means there were six records and 3 duplicate records were removed. Check again duplicate records were actually removed
mysql> select * from test; +------+-------+ | id | name | +------+-------+ | 1 | name1 | | 2 | name2 | | 3 | name3 | +------+-------+ 3 rows in set (0.00 sec)
If you don’t need the unique constraint for column id you can revoke it to its original attribute. Check what is the unique constraint name by the following command
mysql> show index from test; +-------+------------+----------+--------------+-------------+----------- | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation +-------+------------+----------+--------------+-------------+----------- | test | 0 | id | 1 | id | A +-------+------------+----------+--------------+-------------+----------- 1 row in set (0.00 sec)
Here Key_name is the unique constraint name. Now drop the unique constraint by the following command
mysql> alter table test drop index id;
The procedure is same to remove duplicate rows based on multiple column value. Here I show how to remove duplicate rows based on multiple columns
create table test2 (id1 int, id2 int );
insert into test2(id1,id2) values(1,2); insert into test2(id1,id2) values(2,2); insert into test2(id1,id2) values(3,2); insert into test2(id1,id2) values(2,2); insert into test2(id1,id2) values(3,3); insert into test2(id1,id2) values(6,2);
mysql> select * from test2; +------+------+ | id1 | id2 | +------+------+ | 1 | 2 | | 2 | 2 | | 3 | 2 | | 2 | 2 | | 3 | 3 | | 6 | 2 | +------+------+ 6 rows in set (0.00 sec)
Here record 2 and record 4 are duplicate. So run unique command as follows
mysql> alter ignore table test2 add unique(id1,id2); Query OK, 6 rows affected (0.05 sec) Records: 6 Duplicates: 1 Warnings: 0
mysql> select * from test2; +------+------+ | id1 | id2 | +------+------+ | 1 | 2 | | 2 | 2 | | 3 | 2 | | 3 | 3 | | 6 | 2 | +------+------+ 5 rows in set (0.00 sec)
There are other procedures to remove duplicate rows in MySQL but I think the above procedure is the quickest one.