Sorting and using the first criteria

  • Thread starter Thread starter sdg8481
  • Start date Start date
S

sdg8481

hi,

I have a very annoying problem that i can't seem to solve, so i was
wondering if any one can help.

Basically, i have a table with 139,084 rows, this table is made contains 11
rows each per zipcode, each row contains a zipcode, a place name, a distance
value (from zipcode to place name) and a population for that zipcode. For
each zipcode I need to find the row associated with shortest distance and
return that row only, hence in the end only be left with 12,644 unique rows
(139,084/11).

The way i have approached this is by making a make table query based on the
original table which sorts the data by Zipcode (asc) and then by Distance
(asc).
Then using a second query, i query this table, setting the criteria to
'First' on all the fields except the zipcode (and obviously the distance
field). On the face of it this appears to work and by eye it seems to work,
and i get the first one back for each zipcode, which i then sum the
populations.

HOWEVER, for some reason i get different results each time i run the set of
queries (or macro) only very slight but different everytime, I've tried and
looked at everything but this one has me beat.

Can anyone suggest any reasons or potential new approaches i can use !!

thanks very much, i'm just about to pull the last hair out my head.

S
 
Thank Ken, this is brilliant and seems to be working (taking a while and
causes access to say not responding but is eventually working), and the text
about the 'First' criteria explains a lot on why i was getting slightly
different results.

I'm going to be very cheeky now though, and ask a couple more things.

1) is it possible to make this a Make-Table Query aswell, and
2) if i also need to add a criteria that excludes certain places name (ie...
what is the next closest place if a certain place is excluded). How would
this fit into the SQL statement.

Sorry for my stupidness.

Thanks

KenSheridan via AccessMonster.com said:
You can do it in a single query, restricting the result by means of a
subquery which returns the MIN distance per zipdode, e.g.

SELECT *
FROM [YourTable] AS T1
WHERE [Distance] =
(SELECT MIN([Distance])
FROM [YourTable] AS T2
WHERE T2.[ZipCode] = T1.[ZipCode]);

By giving the two instances of the table aliases T1 and T2 th subquery can be
correlated with the outer query on ZipCode so that it returns the MIN
distance per zipcode. The outer query is then restricted to those rows where
the distance equals the value returned by the subquery.

BTW the FIRST operator doesn't do what you might think. Its really only of
use where you want a random (in terms of values) row per grouping returned as
a table is a set, and sets have no intrinsic order, so concepts like first
and last are pretty meaningless. To return the highest or lowest values the
MIN and MAX operators should be used.

Ken Sheridan
Stafford, England
hi,

I have a very annoying problem that i can't seem to solve, so i was
wondering if any one can help.

Basically, i have a table with 139,084 rows, this table is made contains 11
rows each per zipcode, each row contains a zipcode, a place name, a distance
value (from zipcode to place name) and a population for that zipcode. For
each zipcode I need to find the row associated with shortest distance and
return that row only, hence in the end only be left with 12,644 unique rows
(139,084/11).

The way i have approached this is by making a make table query based on the
original table which sorts the data by Zipcode (asc) and then by Distance
(asc).
Then using a second query, i query this table, setting the criteria to
'First' on all the fields except the zipcode (and obviously the distance
field). On the face of it this appears to work and by eye it seems to work,
and i get the first one back for each zipcode, which i then sum the
populations.

HOWEVER, for some reason i get different results each time i run the set of
queries (or macro) only very slight but different everytime, I've tried and
looked at everything but this one has me beat.

Can anyone suggest any reasons or potential new approaches i can use !!

thanks very much, i'm just about to pull the last hair out my head.

S

--
Message posted via AccessMonster.com


.
 
sorry it might be useful if i post the current SQL statement;

SELECT *
FROM [T01a_Distance To Each Site With Population_sorted] AS T1
WHERE [Dist] =
(SELECT MIN([Dist])
FROM [T01a_Distance To Each Site With Population_sorted] AS T2
WHERE T2.[FromID] = T1.[FromID]);



sdg8481 said:
Thank Ken, this is brilliant and seems to be working (taking a while and
causes access to say not responding but is eventually working), and the text
about the 'First' criteria explains a lot on why i was getting slightly
different results.

I'm going to be very cheeky now though, and ask a couple more things.

1) is it possible to make this a Make-Table Query aswell, and
2) if i also need to add a criteria that excludes certain places name (ie...
what is the next closest place if a certain place is excluded). How would
this fit into the SQL statement.

Sorry for my stupidness.

Thanks

KenSheridan via AccessMonster.com said:
You can do it in a single query, restricting the result by means of a
subquery which returns the MIN distance per zipdode, e.g.

SELECT *
FROM [YourTable] AS T1
WHERE [Distance] =
(SELECT MIN([Distance])
FROM [YourTable] AS T2
WHERE T2.[ZipCode] = T1.[ZipCode]);

By giving the two instances of the table aliases T1 and T2 th subquery can be
correlated with the outer query on ZipCode so that it returns the MIN
distance per zipcode. The outer query is then restricted to those rows where
the distance equals the value returned by the subquery.

BTW the FIRST operator doesn't do what you might think. Its really only of
use where you want a random (in terms of values) row per grouping returned as
a table is a set, and sets have no intrinsic order, so concepts like first
and last are pretty meaningless. To return the highest or lowest values the
MIN and MAX operators should be used.

Ken Sheridan
Stafford, England
hi,

I have a very annoying problem that i can't seem to solve, so i was
wondering if any one can help.

Basically, i have a table with 139,084 rows, this table is made contains 11
rows each per zipcode, each row contains a zipcode, a place name, a distance
value (from zipcode to place name) and a population for that zipcode. For
each zipcode I need to find the row associated with shortest distance and
return that row only, hence in the end only be left with 12,644 unique rows
(139,084/11).

The way i have approached this is by making a make table query based on the
original table which sorts the data by Zipcode (asc) and then by Distance
(asc).
Then using a second query, i query this table, setting the criteria to
'First' on all the fields except the zipcode (and obviously the distance
field). On the face of it this appears to work and by eye it seems to work,
and i get the first one back for each zipcode, which i then sum the
populations.

HOWEVER, for some reason i get different results each time i run the set of
queries (or macro) only very slight but different everytime, I've tried and
looked at everything but this one has me beat.

Can anyone suggest any reasons or potential new approaches i can use !!

thanks very much, i'm just about to pull the last hair out my head.

S

--
Message posted via AccessMonster.com


.
 
You might be able to speed things up considerably if you use two queries.

Query One - saved as qMinDistance:
SELECT FromID, MIN([Dist]) as MinDist
FROM [T01a_Distance To Each Site With Population_sorted]
GROUP BY FromID

Query Two:
SELECT T1.*
FROM [T01a_Distance To Each Site With Population_sorted] AS T1
INNER JOIN qMinDistance
ON T1.FromID = qMinDistance.FromID
AND T1.Dist = qMinDistance.MinDist

You should be able to convert query two to a make table query. I would not in
most cases do so, but if the results are stable and you need the performance
increase it might be advisable. Of course, I would create the destination
table first and use an append query. That way I could have indexes on the
necessary fields for better performance.

As for question 2, you might be able to solve it using a where clause on the
qMinDistance query.
SELECT FromID, MIN([Dist]) as MinDist
FROM [T01a_Distance To Each Site With Population_sorted]
WHERE [PlaceName] <> 'Holy Cow'
GROUP BY FromID

Of course, if there are many place names to exclude you would need a table of
excluded place names and use a join to exclude the names. Something like the
following.

qMinDistance query.
SELECT FromID, MIN([Dist]) as MinDist
FROM [T01a_Distance To Each Site With Population_sorted] as T1
LEFT JOIN tblExclusions as T2
On T1.PlaceName = T2.PlaceName
WHERE T2.PlaceName is Null
GROUP BY FromID


John Spencer
Access MVP 2002-2005, 2007-2010
The Hilltop Institute
University of Maryland Baltimore County
sorry it might be useful if i post the current SQL statement;

SELECT *
FROM [T01a_Distance To Each Site With Population_sorted] AS T1
WHERE [Dist] =
(SELECT MIN([Dist])
FROM [T01a_Distance To Each Site With Population_sorted] AS T2
WHERE T2.[FromID] = T1.[FromID]);



sdg8481 said:
Thank Ken, this is brilliant and seems to be working (taking a while and
causes access to say not responding but is eventually working), and the text
about the 'First' criteria explains a lot on why i was getting slightly
different results.

I'm going to be very cheeky now though, and ask a couple more things.

1) is it possible to make this a Make-Table Query aswell, and
2) if i also need to add a criteria that excludes certain places name (ie...
what is the next closest place if a certain place is excluded). How would
this fit into the SQL statement.

Sorry for my stupidness.

Thanks
 
sdg8481 said:
Thank Ken, this is brilliant and seems to be working (taking a while and
causes access to say not responding but is eventually working), and the
text
about the 'First' criteria explains a lot on why i was getting slightly
different results.

I'm going to be very cheeky now though, and ask a couple more things.

1) is it possible to make this a Make-Table Query aswell, and
2) if i also need to add a criteria that excludes certain places name
(ie...
what is the next closest place if a certain place is excluded). How would
this fit into the SQL statement.

Sorry for my stupidness.

Thanks

KenSheridan via AccessMonster.com said:
You can do it in a single query, restricting the result by means of a
subquery which returns the MIN distance per zipdode, e.g.

SELECT *
FROM [YourTable] AS T1
WHERE [Distance] =
(SELECT MIN([Distance])
FROM [YourTable] AS T2
WHERE T2.[ZipCode] = T1.[ZipCode]);

By giving the two instances of the table aliases T1 and T2 th subquery
can be
correlated with the outer query on ZipCode so that it returns the MIN
distance per zipcode. The outer query is then restricted to those rows
where
the distance equals the value returned by the subquery.

BTW the FIRST operator doesn't do what you might think. Its really only
of
use where you want a random (in terms of values) row per grouping
returned as
a table is a set, and sets have no intrinsic order, so concepts like
first
and last are pretty meaningless. To return the highest or lowest values
the
MIN and MAX operators should be used.

Ken Sheridan
Stafford, England
hi,

I have a very annoying problem that i can't seem to solve, so i was
wondering if any one can help.

Basically, i have a table with 139,084 rows, this table is made contains
11
rows each per zipcode, each row contains a zipcode, a place name, a
distance
value (from zipcode to place name) and a population for that zipcode.
For
each zipcode I need to find the row associated with shortest distance
and
return that row only, hence in the end only be left with 12,644 unique
rows
(139,084/11).

The way i have approached this is by making a make table query based on
the
original table which sorts the data by Zipcode (asc) and then by
Distance
(asc).
Then using a second query, i query this table, setting the criteria to
'First' on all the fields except the zipcode (and obviously the distance
field). On the face of it this appears to work and by eye it seems to
work,
and i get the first one back for each zipcode, which i then sum the
populations.

HOWEVER, for some reason i get different results each time i run the set
of
queries (or macro) only very slight but different everytime, I've tried
and
looked at everything but this one has me beat.

Can anyone suggest any reasons or potential new approaches i can use !!

thanks very much, i'm just about to pull the last hair out my head.

S

--
Message posted via AccessMonster.com


.
 
Back
Top