Help with GROUP BY and COUNT

  • Thread starter Thread starter Nick HK
  • Start date Start date
N

Nick HK

I have a table of SequenceCodes and a table of Colours, both in a
many-to-many relation in the table SequenceCodes_Colours
</sample data
(ID's used in the table, but values displayed for clarity)

ID SequenceCodeID ColourID
3 00 Clear
4 06 Blue
5 06 Green
6 0B Blue
7 0B Green
8 0B Red

/sample data>

Trying to see if a particular colour combination is already used, returning
that SequenceCodeID, excluding combinations that also have more colours than
those requested.

<SQL

SELECT TheseColours.TheseCodes
FROM

((SELECT SequenceCodes.Code AS AllCodes, COUNT(SequenceCodes.Code) as
TotalAllColours
FROM SequenceCodes INNER JOIN (Colours INNER JOIN SequenceCodes_Colours ON
Colours.ID = SequenceCodes_Colours.ColourID) ON SequenceCodes.ID =
SequenceCodes_Colours.SequenceCodeID
GROUP BY SequenceCodes.Code)
AS AllColours),

((SELECT SequenceCodes.Code AS TheseCodes, COUNT(SequenceCodes.Code) as
TotalTheseColours
FROM SequenceCodes INNER JOIN (Colours INNER JOIN SequenceCodes_Colours ON
Colours.ID = SequenceCodes_Colours.ColourID) ON SequenceCodes.ID =
SequenceCodes_Colours.SequenceCodeID
WHERE ((Colours.EName="Blue") Or (Colours.EName="Green"))
GROUP BY SequenceCodes.Code)
AS TheseColours)

WHERE TheseColours.TotalTheseColours= AllColours.TotalAllColours

/SQL>

This still returns "06" and "OB", rather than excluding "OB" (where
AllColours.TotalAllColours=3).

I seem to remember seeing an example of this before, but the correct SQL
eludes me.

TIA

NickHK
 
Dear Nick:

For a simple looking problem, this is going to be interesting!

Before I dive in, two things stand out to me.

First, I'd really like to have a unique constraint (index) on the
composite key SequenceCodeID / ColourID. I am thinking we don't want
to allow a colour to occur twice in the same sequence. Is this good
thinking?

Second, I want to know about the rules a little more. It seems you
want to show sequence 06 in the result, apparently because sequence 0B
subsumes its colour sequence. Is this the case?

As to a solution, in general terms, I'm thinking of using a COUNT() of
the matching colours. When this count of matching colours equals the
count of colours in the target sequence, then that target sequence is
a subset of the other sequence and should be listed. This makes sense
only if the Sequence / Colour sets are unique, as stated before.

If this makes sense so far, please get back to me, or correct my
understanding so we can proceed to code it.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts
 
Nick-

If I understand what you're trying to do, you want the sequence code IDs
that have the selected colors (colours) and ONLY those colors.

Try this:

SELECT SequenceCodeID
FROM SequenceCodes_Colours
WHERE SequenceCode_Colours.SequenceCodeID IN
(SELECT SequenceCodes_Colours.SequenceCodeID
FROM SequenceCodes_Colours As SC1
INNER JOIN Colours
ON SC1.ColourID = Colours.ColourID
WHERE Colours.EName = "Blue" OR Colours.EName = "Green"
GROUP BY SequenceCodeID
HAVING Count(*) =
(SELECT Count(*)
FROM SequenceCodes_Colours As SC2
WHERE SC2.SequenceCodeID = SequenceCodes_Colours.SequenceCodeID))

Access might not like the outer-outer reference from the innermost subquery
(even tho this is valid ANSI-SQL).

Your query might work if you INNER JOIN AllColours with TheseColours on
SequenceCodeID.

--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
http://www.deanforamerica.com/site/TR?pg=personal&fr_id=1090&px=1434411
 
Dear Nick:

I see John has coded just what I was thinking. The only thing is,
this works only if Sequence / Colour is unique, as I pointed out.
Otherwise, you would need to use a SELECT DISTINCT subquery of the
table so as to not count duplicated colours (if and when they exist).
As I see it, duplicates will cause the colours to count twice when
they exist twice.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts
 
Greetings,

You can simplify problems like this and get
rid of a lot of convoluted sql by making extended
use of the Access crosstab query.(And Access does
not support Count(Distinct..):(:).

Assuming your posted data is in table nick.

Use the crosstab query to compare the total count
to just the colours your interested in.

Query Qnick1 (remove comments to run):

-- Use colour for both pivot columns and the cells of the crosstab.
TRANSFORM Max(ColourID) AS Colour
SELECT SequenceCodeID,
-- Count the Colours for each SequenceCodeID.Note that this
-- is a Distinct count so if you have duplicate colours you still
-- get the correct (distinct) number of colours per Sequence.
-- This is why Max(ColourID) was used as the Transform.Using the
-- Transform alias Colour in the Count aggregate is making use
-- of the extended functionality of the crosstab query.
Count(Colour) AS TotalAllColours,
-- Now count the occurrances of Blue and Green for each Sequence.
-- Similar to total count of colours (above) but we add the condition that
-- the colour (cell value for each pivot column) be either Blue or Green.
-- Any other colour will be treated as a Null in count and hence
-- not counted:).
Count(Switch(Colour In ('Blue','Green'),1)) AS TheseColours,
-- Now we created a column yes/no based on comparing the 2 above counts.
-- Note the constant comparison need not be wrapped in an aggregate
-- function (see below).
Switch(TotalAllColours=TheseColours,'YES',True,'NO') AS Colourtest
FROM nick
GROUP BY SequenceCodeID
-- If you want to see the pivot columns remove 'In (Null)'.
-- This may help by visually understanding what is happening here:).
-- By using 'In (Null)' the query can process any number of
-- colours.In other words, the 255 column limit (colours) is eliminated.
-- For this problem we aren't interested in a 'crosstab' per sa.
-- We setting up the data in crosstab to make use of this querys
-- magically properties:).This is just the tip of the iceberg.Not only
-- can the cell values (Transform/alias) be referenced but the pivot columns
-- can also be referenced!One need only use an aggregate function as
-- a wrapper (just like in a Select/Group By query) to make it legal.
PIVOT ColourID In (Null);

Result:

SequenceCodeID TotalAllColours TheseColours Colourtest <>
00 1 0
NO
06 2 2
YES
0B 3 2
NO

Get it?:)
Add more data including duplicates to the nick table and play with it.
[Compare this (if you can:) to the ansi sql mess you would
have to write in Access to duplicate it.]

Now get a list of SequenceCodeID and ColourID
using the nick table and the Qnick1 query making
sure that Colourtest is 'YES'.Use Distinct to
eliminate any duplicate Sequence/Colour combinations.
Now your back to simple/basic sql:).

Query Qnick2:

SELECT Distinct a.SequenceCodeID,a.ColourID
FROM nick as a inner join Qnick1 as b on a.SequenceCodeID=b.SequenceCodeID
Where Colourtest='YES';

Result:

SequenceCodeID ColourID
06 Blue
06 Green

For crosstabs that go way beyond Access and much more
on MS Sql Server (any version) check out the RAC utility.
For a free front end to using the MSDE engine check out QALite.
www.rac4sql.net
 
Thanks to all.
John pointed me in the right direction with the use of 'IN".
Sequence/Colour is unique, as are the combinations of Colours. As I'm
building the SQL string from within VB, I know what the COUNT value should
equal, which simplifies it somewhat.
I haven't had chance to look at your cross-tab approach Zeppo, but looks
interesting.
Anyway this query does what I need.

SELECT SC1.SequenceCodeID
FROM SequenceCodes_Colours AS SC1
WHERE SC1.SequenceCodeID IN

(SELECT SequenceCodes_Colours.SequenceCodeID
FROM SequenceCodes
INNER JOIN (Colours
INNER JOIN SequenceCodes_Colours
ON Colours.ID = SequenceCodes_Colours.ColourID)
ON SequenceCodes.ID =SequenceCodes_Colours.SequenceCodeID

WHERE (Colours.EName="Green") OR (Colours.EName="Blue")

GROUP BY SequenceCodes_Colours.SequenceCodeID
HAVING COUNT(SequenceCodes_Colours.SequenceCodeID)=2)

GROUP BY SC1.SequenceCodeID
HAVING COUNT(SC1.SequenceCodeID)=2


NickHK


| Nick-
|
| If I understand what you're trying to do, you want the sequence code IDs
| that have the selected colors (colours) and ONLY those colors.
|
| Try this:
|
| SELECT SequenceCodeID
| FROM SequenceCodes_Colours
| WHERE SequenceCode_Colours.SequenceCodeID IN
| (SELECT SequenceCodes_Colours.SequenceCodeID
| FROM SequenceCodes_Colours As SC1
| INNER JOIN Colours
| ON SC1.ColourID = Colours.ColourID
| WHERE Colours.EName = "Blue" OR Colours.EName = "Green"
| GROUP BY SequenceCodeID
| HAVING Count(*) =
| (SELECT Count(*)
| FROM SequenceCodes_Colours As SC2
| WHERE SC2.SequenceCodeID = SequenceCodes_Colours.SequenceCodeID))
|
| Access might not like the outer-outer reference from the innermost
subquery
| (even tho this is valid ANSI-SQL).
|
| Your query might work if you INNER JOIN AllColours with TheseColours on
| SequenceCodeID.
|
| --
| John Viescas, author
| "Microsoft Office Access 2003 Inside Out"
| "Running Microsoft Access 2000"
| "SQL Queries for Mere Mortals"
| http://www.viescas.com/
| (Microsoft Access MVP since 1993)
| http://www.deanforamerica.com/site/TR?pg=personal&fr_id=1090&px=1434411
| | > I have a table of SequenceCodes and a table of Colours, both in a
| > many-to-many relation in the table SequenceCodes_Colours
| > </sample data
| > (ID's used in the table, but values displayed for clarity)
| >
| > ID SequenceCodeID ColourID
| > 3 00 Clear
| > 4 06 Blue
| > 5 06 Green
| > 6 0B Blue
| > 7 0B Green
| > 8 0B Red
| >
| > /sample data>
| >
| > Trying to see if a particular colour combination is already used,
| returning
| > that SequenceCodeID, excluding combinations that also have more colours
| than
| > those requested.
| >
| > <SQL
| >
| > SELECT TheseColours.TheseCodes
| > FROM
| >
| > ((SELECT SequenceCodes.Code AS AllCodes, COUNT(SequenceCodes.Code) as
| > TotalAllColours
| > FROM SequenceCodes INNER JOIN (Colours INNER JOIN SequenceCodes_Colours
ON
| > Colours.ID = SequenceCodes_Colours.ColourID) ON SequenceCodes.ID =
| > SequenceCodes_Colours.SequenceCodeID
| > GROUP BY SequenceCodes.Code)
| > AS AllColours),
| >
| > ((SELECT SequenceCodes.Code AS TheseCodes, COUNT(SequenceCodes.Code) as
| > TotalTheseColours
| > FROM SequenceCodes INNER JOIN (Colours INNER JOIN SequenceCodes_Colours
ON
| > Colours.ID = SequenceCodes_Colours.ColourID) ON SequenceCodes.ID =
| > SequenceCodes_Colours.SequenceCodeID
| > WHERE ((Colours.EName="Blue") Or (Colours.EName="Green"))
| > GROUP BY SequenceCodes.Code)
| > AS TheseColours)
| >
| > WHERE TheseColours.TotalTheseColours= AllColours.TotalAllColours
| >
| > /SQL>
| >
| > This still returns "06" and "OB", rather than excluding "OB" (where
| > AllColours.TotalAllColours=3).
| >
| > I seem to remember seeing an example of this before, but the correct SQL
| > eludes me.
| >
| > TIA
| >
| > NickHK
| >
| >
| >
|
|
 
Back
Top