Count distinct number of days

  • Thread starter Thread starter Zed
  • Start date Start date
Z

Zed

I need to be able to count the total distinct number of days in a given year
that a person received a service. Dates are provided by range. Any time the
person receives a service at a different location but on the same day, it
should only count as a single day. Any suggestions on the best way to
accomplish this? As an example, if Mary received services from December 30,
2008 through January 5, 2009 at location A, and also received services from
January 3 through January 6 at location B, and also received services from
March 1 through March 3 at location C, the total number of days in 2009
should be 9. The underlying data table can contain any number of these
records for a given person, so the solution has to be able to step through
and evaluate all possible combinations of date ranges. I'm using Access
2003. --Thanks for the help!
 
A possible solution involves a driver table: Have a table, Iotas, one field,
iota, with values from 0 to 999 (you can fill it from an import from Excel,
rather than typing 1000 values).


SELECT DISTINCT patientID, fromDate+iotas.iota AS onDay
FROM yourTableNameHere AS a INNER JOIN Iotas
ON a.fromDate+Iotas.iota <= a.upToDate
WHERE YEAR(fromDate+iotas.iota) = 2009


as saved query, q1, then, a second query:


SELECT patientID, COUNT(*)
FROM q1
GROUP BY patientID


should give what you want.

Note that I assumed your table has, among other, these fields names:
patientID, fromDate, upToDate



Vanderghast, Access MVP
 
Wow! What a great trick! Thank you... Had to tweak it a bit to get it
working, but that did the job. Also had to break it up into chunks of about
50K records (total 350K) or it would just freeze up on me. Do you know of
something that might work better on a larger data set? Arrays? Something
else? Thanks again... Saved me hours of spinning my wheels....
 
Depends of the nature of the data. If there are multiple records which are
not about year 2009, as example, then you can replace the yourTableNameHere
with a query which will only keep the wanted records:


SELECT patientID,
iif(fromDate < #1/1/2009#, #1/1/2009#) as starting,
iif(upToDate > #12/31/2009#, #12/31/2009#) as ending
FROM yourTableNameHere
WHERE fromDate <= #12/31/2009#
AND upToDate >= #1/1/2009#



With this query, which prune out irrelevant records (and dates), the second
query (the one already previously supplied) now based on this previous query
rather than directly on the table should execute a little bit faster. Note
that this query now uses starting and ending, as limits for dates, rather
than the initial fromDate and upToDate. Indeed, if someone is fromDate from
middle of July 2008 to the third of January 2009, it is useless to
'generate' the dates in 2008 and that is why the query replaces middle of
July 2008 by the first of January 2009. Simple optimization, but if if may
help, or not, that depends on the nature of the data itself...



Vandergahst, Access MVP
 
another option would be to decrease the size of the Iotas table to the
maximum number of days someone might receive service in a given instance. In
your example, the number of days between start and end was only about 5 days.
If that number is not going to be any greater than 10 or so.

I'm a stickler for working with dates, so I tend to avoid just adding
numbers to dates (although this is acceptable). I do it this way so that
when I go back and review my queries, I can understand exactly what my intent
was.

SELECT DISTINCT PatientID,
DateAdd("d", Iotas.iota, [FromDate]) as ServiceDate
FROM yourTable, Iotas
WHERE DateAdd("d", Iotas.iota, [FromDate]) < [ThruDate]
AND Year(DateAdd("d", Iotas.iota, [FromDate]) = 2009

I also used the WHERE clause to restrict the dates rather than the Inner
Join that Michel Walsh (Vanderghast) used because most people don't know how
to create a non-equi join. To do that, you have to either create the join in
the SQL view, or create an equi Join in the design view and then modify it in
the SQL view.
 
Back
Top