今天朋友问了一个特别纠结的问题:
在数据库的一张表里面对根据表中的某一字段进行去重复并对剩下的数据根据另外一个字段排序,
创建表如下:
create table TEST_DISTINCT(
ID integer not null,
NAME varchar(20) not null
);
insert into TEST_DISTINCT values(0,'aa');
insert into TEST_DISTINCT values(1,'bb');
insert into TEST_DISTINCT values(2,'cc');
insert into TEST_DISTINCT values(3,'dd');
insert into TEST_DISTINCT values(4,'ee');
insert into TEST_DISTINCT values(5,'aa');
insert into TEST_DISTINCT values(6,'cc');
一看到去重复就想到了 DISTINCT 关键字:
select distinct name from test_distinct;
可惜却无法和字段id关联,于是度娘看见最火的一种写法:
select *, count(distinct name) from table group by name;
select *,count(distinct ename) from emp group by name;
可惜运行直接报错,于是改成了:
select job,ename,count(distinct ename) from emp group by job,ename;
报错是没了,但是还是没效果,后来研究了个把小时还是无法解决,不知道是不是因为版本原因,总之感觉度娘还是不靠谱!
后来还是找到了解决方法:
两行数据要去掉一个,是应该去掉第一行还是第二行这种情况像电脑这种啥东西肯定不会选择的,于是得用户自己选择
我的方法是:
select max(id) as mt,name from (select * from test_distinct where id > 2) group by name order by mt asc;
如果还有其他字段:
select * from emp where emp.mgr in (
select mm from (
select max(mgr) as mm,ename from (select * from emp where empno > 7521) group by ename order by mm asc
)
nn);
当解决后才发现是如此简单,看见去重走向了distinct的误区了,
另外发这种文章得有品,不能随便发点错的误人子弟啊!