beginning and end of event sequences

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I am trying to figure out how to determine the beginning and end of sequences
of events. I have a field (Report_Date) in my table, and I want to identify
all of the sequences of dates; something like the following. I've tried a
variety of techniques, but just cannot seem to wrap my hands around this one.

Report_Date
5/1/04
5/2/04
5/3/04
5/5/04
5/7/04
5/8/04
5/9/04

Output would look like
Start_Date End_Date
5/1/04 5/3/04
5/5/04 5/5/04
5/7/04 5/9/04
 
Hi,


Rank your value and group by value minus rank.


Well, that nice to say, but that speaks to you only once you have the
solution...:-) So, unless mistake, here what should work:


SELECT MIN(report_date), MAX(report_date)
FROM myTable As a
GROUP BY Report_date - ( SELECT COUNT(*)
FROM myTable As b
WHERE b.report_date <=
a.report_date)




How it works? your date is, indeed, an integer (since you have no TIME
associated to it) and a rank is also an integer, but the rank is without
hole (I assume there is no duplicated values in the date), so, value minus
rank will make a "natural" group for each "sequence". It is then a matter
to find the min and the max in each sequence.

date value rank value - rank
5/1/04 38108 1 38107
5/2/04 38109 2 38107
5/3/04 38110 3 38107
5/5/04 38112 4 38108
5/7/04 38114 5 38109
5/8/04 38115 6 38109
5/9/04 38116 7 38109


so, we have three groups, 38107, 38108 and 38109. These numbers do not
represent dates anymore, but are useful as "sequence label" (kid of). Min()
and Max() of date supply the starting and ending of these sequences.




Hoping it may help,
Vanderghast, Access MVP
 
Hi Dale,

I think that the following will work, although you may need to adjust it a
little if some of your date entries also have a time component (or if they
may in the future). Obviously you will need to substitute your actual table
name as well.

SELECT YourTable.Report_Date AS Start_Date, (SELECT Min(VT.Report_Date)
FROM YourTable AS VT
WHERE (VT.[Report_Date]+1) Not In (SELECT VT2.Report_Date FROM YourTable AS
VT2) AND VT.Report_Date >= YourTable.Report_Date) AS End_Date
FROM YourTable
WHERE ([Report_Date]-1) Not In (SELECT VT3.Report_Date FROM YourTable AS VT3);

Post back if it doesn't work.

-Ted Allen
 
Hi Michel,

Wow, that is a much simpler approach than the sql that I had posted. I
liked it so much that I wanted to try it on a test and couldn't get it to
run. I kept getting a syntax error on the group by so I checked the
developer's handbook to see what I may be doing wrong and noticed a note that
Group By's can't use aggregate functions. So, out of curiosity I tried a
DCount() instead and it worked.

I'm not very experienced with Group By's that use this type of expression.
Do you think that I may have made an error in writing the sql, or is it
really true that the Group By will take a DCount() but not a Count()?

The following appears to work (unfortunately I did not copy the exact sql of
the one that didn't work, but it was pretty much the same but using the
subquery with Count(*), in fact you can see that the table alias was left
over):

SELECT Min(a.Report_Date) AS Start_Date, Max(a.Report_Date) AS End_Date
FROM YourTable AS a
GROUP BY a.[Report_Date]-DCount("Report_Date","YourTable","Report_Date <= #"
& a.[Report_Date] & "#");

Any info you have would be appreciated. I always love learning new things.

-Ted Allen
 
Hi,


Indeed, I short-circuited the syntax. The easiest way is to first build
a query that computes the rank, then, in a second query, use the first one
where we would GROUP BY value-rank :


SELECT f1, (SELECT COUNT(*) FROM table As b WHERE b.f1<= a.f1) As Rank
FROM table As a

--saved as Q1


then


SELECT min(f1), max(f1)
FROM q1
GROUP BY f1-rank




We can combine the two steps in one, with a virtual table:

SELECT Min(z.f1) AS Starting, Max(z.f1) AS Ending
FROM (SELECT f1, (SELECT COUNT(*)
FROM Sequences as b
WHERE b.f1 <= a.f1) As Rank
FROM Sequences As a ) AS z
GROUP BY f1-rank;





Hoping it may help,
Vanderghast, Access MVP



Ted Allen said:
Hi Michel,

Wow, that is a much simpler approach than the sql that I had posted. I
liked it so much that I wanted to try it on a test and couldn't get it to
run. I kept getting a syntax error on the group by so I checked the
developer's handbook to see what I may be doing wrong and noticed a note
that
Group By's can't use aggregate functions. So, out of curiosity I tried a
DCount() instead and it worked.

I'm not very experienced with Group By's that use this type of expression.
Do you think that I may have made an error in writing the sql, or is it
really true that the Group By will take a DCount() but not a Count()?

The following appears to work (unfortunately I did not copy the exact sql
of
the one that didn't work, but it was pretty much the same but using the
subquery with Count(*), in fact you can see that the table alias was left
over):

SELECT Min(a.Report_Date) AS Start_Date, Max(a.Report_Date) AS End_Date
FROM YourTable AS a
GROUP BY a.[Report_Date]-DCount("Report_Date","YourTable","Report_Date <=
#"
& a.[Report_Date] & "#");

Any info you have would be appreciated. I always love learning new
things.

-Ted Allen

Michel Walsh said:
Hi,


Rank your value and group by value minus rank.


Well, that nice to say, but that speaks to you only once you have the
solution...:-) So, unless mistake, here what should work:


SELECT MIN(report_date), MAX(report_date)
FROM myTable As a
GROUP BY Report_date - ( SELECT COUNT(*)
FROM myTable As b
WHERE b.report_date <=
a.report_date)




How it works? your date is, indeed, an integer (since you have no TIME
associated to it) and a rank is also an integer, but the rank is without
hole (I assume there is no duplicated values in the date), so, value
minus
rank will make a "natural" group for each "sequence". It is then a
matter
to find the min and the max in each sequence.

date value rank value - rank
5/1/04 38108 1 38107
5/2/04 38109 2 38107
5/3/04 38110 3 38107
5/5/04 38112 4 38108
5/7/04 38114 5 38109
5/8/04 38115 6 38109
5/9/04 38116 7 38109


so, we have three groups, 38107, 38108 and 38109. These numbers do
not
represent dates anymore, but are useful as "sequence label" (kid of).
Min()
and Max() of date supply the starting and ending of these sequences.




Hoping it may help,
Vanderghast, Access MVP
 
Thanks, Michel.

The concept of using the value minus its rank as a grouping critieria would
never have occurred to me. Works like a charm.
 
Thanks Michel. It was interesting though to see that the original one-query
syntax that you had would work with DCount() in place of Count. I'll have to
try to remember that in the future (unfortunately the odds may not be too
good with my memory though).

Michel Walsh said:
Hi,


Indeed, I short-circuited the syntax. The easiest way is to first build
a query that computes the rank, then, in a second query, use the first one
where we would GROUP BY value-rank :


SELECT f1, (SELECT COUNT(*) FROM table As b WHERE b.f1<= a.f1) As Rank
FROM table As a

--saved as Q1


then


SELECT min(f1), max(f1)
FROM q1
GROUP BY f1-rank




We can combine the two steps in one, with a virtual table:

SELECT Min(z.f1) AS Starting, Max(z.f1) AS Ending
FROM (SELECT f1, (SELECT COUNT(*)
FROM Sequences as b
WHERE b.f1 <= a.f1) As Rank
FROM Sequences As a ) AS z
GROUP BY f1-rank;





Hoping it may help,
Vanderghast, Access MVP



Ted Allen said:
Hi Michel,

Wow, that is a much simpler approach than the sql that I had posted. I
liked it so much that I wanted to try it on a test and couldn't get it to
run. I kept getting a syntax error on the group by so I checked the
developer's handbook to see what I may be doing wrong and noticed a note
that
Group By's can't use aggregate functions. So, out of curiosity I tried a
DCount() instead and it worked.

I'm not very experienced with Group By's that use this type of expression.
Do you think that I may have made an error in writing the sql, or is it
really true that the Group By will take a DCount() but not a Count()?

The following appears to work (unfortunately I did not copy the exact sql
of
the one that didn't work, but it was pretty much the same but using the
subquery with Count(*), in fact you can see that the table alias was left
over):

SELECT Min(a.Report_Date) AS Start_Date, Max(a.Report_Date) AS End_Date
FROM YourTable AS a
GROUP BY a.[Report_Date]-DCount("Report_Date","YourTable","Report_Date <=
#"
& a.[Report_Date] & "#");

Any info you have would be appreciated. I always love learning new
things.

-Ted Allen

Michel Walsh said:
Hi,


Rank your value and group by value minus rank.


Well, that nice to say, but that speaks to you only once you have the
solution...:-) So, unless mistake, here what should work:


SELECT MIN(report_date), MAX(report_date)
FROM myTable As a
GROUP BY Report_date - ( SELECT COUNT(*)
FROM myTable As b
WHERE b.report_date <=
a.report_date)




How it works? your date is, indeed, an integer (since you have no TIME
associated to it) and a rank is also an integer, but the rank is without
hole (I assume there is no duplicated values in the date), so, value
minus
rank will make a "natural" group for each "sequence". It is then a
matter
to find the min and the max in each sequence.

date value rank value - rank
5/1/04 38108 1 38107
5/2/04 38109 2 38107
5/3/04 38110 3 38107
5/5/04 38112 4 38108
5/7/04 38114 5 38109
5/8/04 38115 6 38109
5/9/04 38116 7 38109


so, we have three groups, 38107, 38108 and 38109. These numbers do
not
represent dates anymore, but are useful as "sequence label" (kid of).
Min()
and Max() of date supply the starting and ending of these sequences.




Hoping it may help,
Vanderghast, Access MVP




I am trying to figure out how to determine the beginning and end of
sequences
of events. I have a field (Report_Date) in my table, and I want to
identify
all of the sequences of dates; something like the following. I've
tried a
variety of techniques, but just cannot seem to wrap my hands around
this
one.

Report_Date
5/1/04
5/2/04
5/3/04
5/5/04
5/7/04
5/8/04
5/9/04

Output would look like
Start_Date End_Date
5/1/04 5/3/04
5/5/04 5/5/04
5/7/04 5/9/04
 
Back
Top