Duplicate query shows duplicates but I only want one to show

Scoop · Oct 30, 2003

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Ken Snell · Oct 30, 2003

Try changing DISTINCT to DISTINCTROW.

Scoop · Oct 30, 2003

Tried that...it didn't work.

Ken said:
Try changing DISTINCT to DISTINCTROW.

--
Ken Snell
<MS ACCESS MVP>

Scoop said:

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Tom Ellison · Oct 30, 2003

Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Scoop · Oct 30, 2003

Tom, thank you for the reply and insight. I see your point. I want to
delete all duplicates EXCEPT ONE. For example:

My query now returns this:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0987 Davis Wilson Active
0987 Davis Wilson Inactive
0786 Thompson John Active
0786 Thompson Jonh Active
0253 Smith Emmitt Inactive
0253 Smith Emmitt Inactive

This is what I want:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0786 Thompson John Active
0253 Smith Emmitt Inactive

I will take a look at your suggestion later today. Thanks again...

Tom said:
Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Tom Ellison · Oct 30, 2003

Dear Scoop:

Have you considered (and defined) which row to delete and which to
keep in this case:

Number last_name first_Name Status
1234 Johnson George Active
1234 Smith John Active

That is, in general, what if the last_name or first_Name are not
consistent for all the rows where Number is the same. This is even
true in your example (although it may be just a typo) where 0786
Thompson has both "John" and "Jonh" shown.

Your query should have a rule to deal with this. It will not be able
to deal with ambiguous results.

To start with, does your definition of "duplicate" only refer to two
rows with the same value for Number, or some other combination of
columns, such as Number, last_name, and first_Name? If the table were
as you show with both John and Jonh would these two rows duplicate one
another, or not? And, where the Status values do not duplicate (one
is Active the other Inactive) you want the result to be Active as long
as at least one of them is Active, right?

Sorry to have to bug you over such details, but a query to perform
what you want must have all these questions carefully answered. Only
when all the details are well defined can you proceed with writing the
code. A suspicious mind like mind is necessary to uncover all forms
of ambiguity in the specification before you proceed.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Tom said:
Tom, thank you for the reply and insight. I see your point. I want to
delete all duplicates EXCEPT ONE. For example:

My query now returns this:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0987 Davis Wilson Active
0987 Davis Wilson Inactive
0786 Thompson John Active
0786 Thompson Jonh Active
0253 Smith Emmitt Inactive
0253 Smith Emmitt Inactive

This is what I want:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0786 Thompson John Active
0253 Smith Emmitt Inactive

I will take a look at your suggestion later today. Thanks again...

Tom said:

Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Click to expand...

Scoop · Oct 31, 2003

It really makes no difference. The Number field will be the same for each
duplicate record. Number is a unique ID to each record, but it can show up
multiple times in a table. Therefore, the query can check for duplicates in
the Number field and delete every record (except one, of course) for each
unique Number.

As detailed in my previous example the field Number is unique to each
last_name and first_name. Therefore, if the table has 9 duplicate records
with Number 0278, I want to delete 8 records with Number = 0278.

Tom said:
Dear Scoop:

Have you considered (and defined) which row to delete and which to
keep in this case:

Number last_name first_Name Status
1234 Johnson George Active
1234 Smith John Active

That is, in general, what if the last_name or first_Name are not
consistent for all the rows where Number is the same. This is even
true in your example (although it may be just a typo) where 0786
Thompson has both "John" and "Jonh" shown.

Your query should have a rule to deal with this. It will not be able
to deal with ambiguous results.

To start with, does your definition of "duplicate" only refer to two
rows with the same value for Number, or some other combination of
columns, such as Number, last_name, and first_Name? If the table were
as you show with both John and Jonh would these two rows duplicate one
another, or not? And, where the Status values do not duplicate (one
is Active the other Inactive) you want the result to be Active as long
as at least one of them is Active, right?

Sorry to have to bug you over such details, but a query to perform
what you want must have all these questions carefully answered. Only
when all the details are well defined can you proceed with writing the
code. A suspicious mind like mind is necessary to uncover all forms
of ambiguity in the specification before you proceed.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Tom, thank you for the reply and insight. I see your point. I want to
delete all duplicates EXCEPT ONE. For example:

My query now returns this:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0987 Davis Wilson Active
0987 Davis Wilson Inactive
0786 Thompson John Active
0786 Thompson Jonh Active
0253 Smith Emmitt Inactive
0253 Smith Emmitt Inactive

This is what I want:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0786 Thompson John Active
0253 Smith Emmitt Inactive

I will take a look at your suggestion later today. Thanks again...

Tom said:

Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

On Wed, 29 Oct 2003 22:38:30 -0500 (EST), "Scoop"

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Click to expand...

Tom Ellison · Oct 31, 2003

Dear Scoop:

Well, then I'd create a query like this:

SELECT Number, MIN(last_name) AS last_name,
MIN(first_Name) AS first_Name, MIN(Status) AS Status
FROM YourTable
GROUP BY Number

Making the assumption that everything you said about last_name and
first_Name is true, that they are invariant for any value of Number,
then using the MIN() aggregate will simply return the one and only
value of last_name and first_Name. The MIN(Status) will return Active
if there are any rows for a Number value that are Active, and will
return Inactive only when there are no Active rows. I think that's
what you said you want there.

You can then INSERT these rows into a new table and go forward from
there.

You cannot DELETE rows in your existing table as there is no identity
to the rows. By your own descriptions, there exist instances of
complete duplication in this table. You could add an autonumber
column to this table as an identity, then delete "duplicates" based
on retaining only the minimum identity value of each set. But, unless
you create a way of uniquely separating one of the duplicated rows
from another, you can never specify to remove one and not another.

For simplicity, I recommend instead that you create a new table.
Check out the results of the SELECT version of the query first, then
go ahead and create the new table and INSERT. This also has the
advantage of leaving the original table untouched, so you don't have
to have a backup and perhaps need to restore this if you don't like
the results.

If you require the results back into the original table, you can later
delete everything from this table, then append all the rows from the
new table.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Tom said:
It really makes no difference. The Number field will be the same for each
duplicate record. Number is a unique ID to each record, but it can show up
multiple times in a table. Therefore, the query can check for duplicates in
the Number field and delete every record (except one, of course) for each
unique Number.

As detailed in my previous example the field Number is unique to each
last_name and first_name. Therefore, if the table has 9 duplicate records
with Number 0278, I want to delete 8 records with Number = 0278.

Tom said:

Dear Scoop:

Have you considered (and defined) which row to delete and which to
keep in this case:

Number last_name first_Name Status
1234 Johnson George Active
1234 Smith John Active

That is, in general, what if the last_name or first_Name are not
consistent for all the rows where Number is the same. This is even
true in your example (although it may be just a typo) where 0786
Thompson has both "John" and "Jonh" shown.

Your query should have a rule to deal with this. It will not be able
to deal with ambiguous results.

To start with, does your definition of "duplicate" only refer to two
rows with the same value for Number, or some other combination of
columns, such as Number, last_name, and first_Name? If the table were
as you show with both John and Jonh would these two rows duplicate one
another, or not? And, where the Status values do not duplicate (one
is Active the other Inactive) you want the result to be Active as long
as at least one of them is Active, right?

Sorry to have to bug you over such details, but a query to perform
what you want must have all these questions carefully answered. Only
when all the details are well defined can you proceed with writing the
code. A suspicious mind like mind is necessary to uncover all forms
of ambiguity in the specification before you proceed.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Tom, thank you for the reply and insight. I see your point. I want to
delete all duplicates EXCEPT ONE. For example:

My query now returns this:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0987 Davis Wilson Active
0987 Davis Wilson Inactive
0786 Thompson John Active
0786 Thompson Jonh Active
0253 Smith Emmitt Inactive
0253 Smith Emmitt Inactive

This is what I want:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0786 Thompson John Active
0253 Smith Emmitt Inactive

I will take a look at your suggestion later today. Thanks again...

Tom Ellison wrote:

Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

On Wed, 29 Oct 2003 22:38:30 -0500 (EST), "Scoop"

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Click to expand...

Scoop · Nov 2, 2003

TOMMMMM!!!! YOU ARE THE KING!!!!!!

Worked like a charm. Just a few minor little changes and voila, worked
beautifully. Thank you so much.

Tom said:
Dear Scoop:

Well, then I'd create a query like this:

SELECT Number, MIN(last_name) AS last_name,
MIN(first_Name) AS first_Name, MIN(Status) AS Status
FROM YourTable
GROUP BY Number

Making the assumption that everything you said about last_name and
first_Name is true, that they are invariant for any value of Number,
then using the MIN() aggregate will simply return the one and only
value of last_name and first_Name. The MIN(Status) will return Active
if there are any rows for a Number value that are Active, and will
return Inactive only when there are no Active rows. I think that's
what you said you want there.

You can then INSERT these rows into a new table and go forward from
there.

You cannot DELETE rows in your existing table as there is no identity
to the rows. By your own descriptions, there exist instances of
complete duplication in this table. You could add an autonumber
column to this table as an identity, then delete "duplicates" based
on retaining only the minimum identity value of each set. But, unless
you create a way of uniquely separating one of the duplicated rows
from another, you can never specify to remove one and not another.

For simplicity, I recommend instead that you create a new table.
Check out the results of the SELECT version of the query first, then
go ahead and create the new table and INSERT. This also has the
advantage of leaving the original table untouched, so you don't have
to have a backup and perhaps need to restore this if you don't like
the results.

If you require the results back into the original table, you can later
delete everything from this table, then append all the rows from the
new table.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

It really makes no difference. The Number field will be the same for each
duplicate record. Number is a unique ID to each record, but it can show up
multiple times in a table. Therefore, the query can check for duplicates in
the Number field and delete every record (except one, of course) for each
unique Number.

As detailed in my previous example the field Number is unique to each
last_name and first_name. Therefore, if the table has 9 duplicate records
with Number 0278, I want to delete 8 records with Number = 0278.

Tom said:

Dear Scoop:

Have you considered (and defined) which row to delete and which to
keep in this case:

Number last_name first_Name Status
1234 Johnson George Active
1234 Smith John Active

That is, in general, what if the last_name or first_Name are not
consistent for all the rows where Number is the same. This is even
true in your example (although it may be just a typo) where 0786
Thompson has both "John" and "Jonh" shown.

Your query should have a rule to deal with this. It will not be able
to deal with ambiguous results.

To start with, does your definition of "duplicate" only refer to two
rows with the same value for Number, or some other combination of
columns, such as Number, last_name, and first_Name? If the table were
as you show with both John and Jonh would these two rows duplicate one
another, or not? And, where the Status values do not duplicate (one
is Active the other Inactive) you want the result to be Active as long
as at least one of them is Active, right?

Sorry to have to bug you over such details, but a query to perform
what you want must have all these questions carefully answered. Only
when all the details are well defined can you proceed with writing the
code. A suspicious mind like mind is necessary to uncover all forms
of ambiguity in the specification before you proceed.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

On Thu, 30 Oct 2003 15:01:15 -0500 (EST), "Scoop"

Tom, thank you for the reply and insight. I see your point. I want to
delete all duplicates EXCEPT ONE. For example:

My query now returns this:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0987 Davis Wilson Active
0987 Davis Wilson Inactive
0786 Thompson John Active
0786 Thompson Jonh Active
0253 Smith Emmitt Inactive
0253 Smith Emmitt Inactive

This is what I want:

Number last_name First_Name Status .....
0987 Davis Wilson Active
0786 Thompson John Active
0253 Smith Emmitt Inactive

I will take a look at your suggestion later today. Thanks again...

Tom Ellison wrote:

Dear Scoop:

I am saving a simplified version of your query for study purposes:

SELECT DISTINCT Number, last_name, First_name, Status, Address,
City, State, ZIP, Race, Sex, Age, CompanyFinal, AddressFinal,
CityPreferred, StateFinal, ZipFinal, Leftovers,
FROM Sample_Table
WHERE Number In
(SELECT Number FROM Sample_Table GROUP BY [Number]
HAVING Count(*)>1)
ORDER BY Sample_Table.Number;

I find your filter most exotic. If you run the subquery by itself,
what do you get:

SELECT Number FROM Sample_Table GROUP BY Number
HAVING Count(*) > 1

I take it you are expecting this to eliminate duplicates by removing
all rows where Number occurs more than once. But this will remove all
rows that are duplicated, not leaving even one instance of any row
with a number which occurred more than once in Sample_Table. Is that
what you meant by eliminating duplicates? If two rows have the same
Number you eliminate both? Well, that's one way of eliminating
duplicates!

If you mean to retain one row out of each set of duplicates, another
approach will be needed. But please tell me this: If two rows have
the same Number but different last_name, which one do you wish to keep
and which do you wish to retain? You will have to specify this
exactly if you wish to "eliminate duplicates" because, if you define
duplicate as two rows with the same Number, then you are going to be
reducing the total information available, and you must specify how
this reduction is to be done.

I hope some of this has been illuminating for you.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

On Wed, 29 Oct 2003 22:38:30 -0500 (EST), "Scoop"

I have a table with duplicate entries. I, obviously, want to remove the
duplicates so that my report does not have multiple entries of the same
data. I have read, and read, about the SELECT DISTINCT but cannot get the
query to ignore duplicate entries. Here is the SQL for the query:

SELECT DISTINCT Sample_Table.Number, Sample_Table.last_name,
Sample_Table.First_name, Sample_Table.Status, Sample_Table.Address,
Sample_Table.City, Sample_Table.State, Sample_Table.ZIP, Sample_Table.Race,
Sample_Table.Sex, Sample_Table.Age, Sample_Table.CompanyFinal,
Sample_Table.AddressFinal, Sample_Table.CityPreferred,
Sample_Table.StateFinal, Sample_Table.ZipFinal, Sample_Table.Leftovers,
FROM Sample_Table
WHERE (((Sample_Table.Number) In (SELECT [Number] FROM [Sample_Table] As Tmp
GROUP BY [Number] HAVING Count(*)>1 )))
ORDER BY Sample_Table.Number;

Why do I still get duplicates in this query? How can I get rid of the
duplicates so that the record only shows up once? Thank you for any ideas
you may have

Click to expand...

Click to expand...