delete query

Wim · Feb 20, 2004

Hi,

I inherited a huge database (Clipper) where the main
source table has not been normalised, so there are
sometimes multiple occurences of the same record.
I want to physically remove the doubles, or triples, ...
Say the key field is IdNum, text, 14 long.
I can use DISTINCTROW, but this does not remove
redundancies. I can use the query builder wizard that
looks up the multiple occurences. I could use the LAST
statement to just keep the last occurrence. But how can I
physically remove these double records that have not been
kept by the LAST key word ?

Thanks,

Wim

John Spencer (MVP) · Feb 22, 2004

IF this is a one time event, build a new table structure with your fields and
set the IDNum field as the primary key. Then Append all the records from the
old table to the new table. You will get errors saying you can't add x records
due to key conflicts. The key conflicts are all but ONE of the duplicated records.

Once you've confirmed that the new table contains the correct data, you can
delete (or rename for safety) the old table. Then rename the New table to the
old table's name.

Wim · Feb 23, 2004

John,

No, this is not a one time event. In fact I rely on data
from the outside world and not everybody seems to know the
rules of mrs. Codd and Boyce.

Thanks for your answer at any rate,

Wim

John Vinson · Feb 23, 2004

Hi,

I inherited a huge database (Clipper) where the main
source table has not been normalised, so there are
sometimes multiple occurences of the same record.
I want to physically remove the doubles, or triples, ...
Say the key field is IdNum, text, 14 long.
I can use DISTINCTROW, but this does not remove
redundancies. I can use the query builder wizard that
looks up the multiple occurences. I could use the LAST
statement to just keep the last occurrence. But how can I
physically remove these double records that have not been
kept by the LAST key word ?

LAST is actually rather less useful than its name suggests: it refers
to the last record *in disk storage order*, and disk storage order is
arbitrary and uncontrollable. If the records are in fact identical I'd
use FIRST instead.

What I'd suggest is to create a new table with a Primary Key on IDNUM
and run an Append query to move the data from this table into it. New
data coming from Clipper can be appended as well - you'll get warning
messages that "x records were not added due to key violations" in both
cases, but these are the duplicates which you want to get rid of in
any case.

If there are records which have duplicate IDNUM values but *different*
data, you have a different and more difficult problem!

delete query

Wim

John Spencer (MVP)

Wim

John Vinson