Re: VERY chalanging question - Explanation
- From: "jsfromynr" <jatinder.1975@xxxxxxxxx>
- Date: 31 May 2006 04:35:24 -0700
Hi There,
IF there is no identity column then we may use.
Select identity(int,1,1) myid ,A,B,C,D into tmpTable from select * from
BASETABLE;
create index tmpA on tmpTable(A,myid);
create index tmpB on tmpTable(B,myid);
create index tmpC on tmpTable(C,myid);
create index tmpD on tmpTable(D,myid);
Assuming that there is a column rowid which is monotonically increasing
and there are as many covering indexes as there are columns the query
can become like this .
Delete from tmpTable where myId in
(
Select myID from tmpTable group by A having count(*)>1
Union All
Select myID from tmpTable group by B having count(*)>1
Union All
Select myID from tmpTable group by C having count(*)>1
Union All
Select myID from tmpTable group by D having count(*)>1
);
Hope this serve the purpose.
With Warm regards
Jatinder Singh
groupy wrote:
ok, let's take a look at a sample table representing the problem:
A | B | C | D
--------------------
a1 b1 c1 d1
a1 b2 c2 d2
a1 b1 c3 d3
a4 b4 c4 d3
a5 b5 c5 d5
a6 b6 c6 d3
The duplications are:
rows 1+2+3 on A
row 1+3 on B
rows 3+4+6 on D
the only unique (in all params) row is 5
note: finding first that row 1 similar to 2 on A & deleting it will
loose information because we WON'T know if row 1 similar to row 3 on B.
The same goes for the deletion of row 3 : it will cause lose of data
regarding it's similarity to row 4 on D
The Simple query for retriving all duplicated rows which consumes most
time is:
SELECT COUNT(*),A,B,C,D
FROM tbl
GROUP BY A,B,C,D
HAVING count(*)>1
It takes about 2 weaks on a 1.5 million rows, while all fields are
nvchars & the DB is in SQL-Server
THANK YOU ALL
.
- References:
- VERY chalanging question
- From: groupy
- Re: VERY chalanging question
- From: Hugo Kornelis
- VERY chalanging question - Explanation
- From: groupy
- VERY chalanging question
- Prev by Date: Re: VERY chalanging question - Explanation
- Next by Date: Re: VERY chalanging question - Explanation
- Previous by thread: Re: VERY chalanging question - Explanation
- Next by thread: Re: VERY chalanging question - Explanation
- Index(es):