Calculate Average Date

  • Thread starter Thread starter Steve
  • Start date Start date
S

Steve

Hi all

How would I find out the average date when given a bunch of dates? For
example, I want to find the average length in time from the following dates:
----------------------------------------------------
Start Date End Date
01/01/2004 12:50pm 02/01/2004 18:40pm
02/01/2004 13:40pm 02/01/2004 13:57pm
02/01/2004 14:30pm 02/01/2004 19:50pm

Min = 17 Minutes (2nd entry)
Max = 1 Day 5 Hours 10 Minutes (1st entry)
Average = ??
-----------------------------------------------------
Also, I have a function that calculates the time difference between 2 dates,
but it doesn't work all that well if you take int account months, does
anyone have a good function that can take 2 date parameters and return the
time difference formatted as '? Months ? Days ? Minutes ? Hours ? Seconds'

Thanks for your help,

Kind Regards,
Steve.
 
Hi Steve,

This is a fine question for one of the languages groups

microsoft.public.dotnet.languages.csharp
and
microsoft.public.dotnet.languages.vb

They eat this stuff in those groups.

Cor
 
Hi Cor

I thought that this question would be language independant, but if you think
I will get a better/quicker response from these groups I will post it again
there.

Thank you.
Steve.
 
Steve said:
How would I find out the average date when given a bunch of dates? For
example, I want to find the average length in time from the following dates:
----------------------------------------------------
Start Date End Date
01/01/2004 12:50pm 02/01/2004 18:40pm
02/01/2004 13:40pm 02/01/2004 13:57pm
02/01/2004 14:30pm 02/01/2004 19:50pm

Min = 17 Minutes (2nd entry)
Max = 1 Day 5 Hours 10 Minutes (1st entry)
Average = ??
-----------------------------------------------------
Also, I have a function that calculates the time difference between 2 dates,
but it doesn't work all that well if you take int account months, does
anyone have a good function that can take 2 date parameters and return the
time difference formatted as '? Months ? Days ? Minutes ? Hours ? Seconds'

It sounds like you're actually after the average of a bunch of
*timespans*. You create a TimeSpan by subtracting one DateTime from
another, either using the Subtract method or (in C#) using the "-"
operator. (eg TimeSpan diff = end-start; )

You could then (for instance) add the number of ticks in each timespan,
divide by the number of timespans, and then create a new timespan for
the new value.

You can find the min/max in the same way as for anything else, just
using comparisons (either using CompareTo or (in C#) the comparison
operators).
 
Hi Jon

Thanks for your quick response. Just to confirm, is the following what you
would recommend?
-----------------------------------------
Dim myTimeSpan As New TimeSpan
Dim myTicks As Long

For row = 0 to rowCount -1 '// Loop through the DataGrid rows and get the
Start/End Dates
myTimeSpan = EndDate.Subtract(StartDate)
myTicks += myTimeSpan.Ticks
Next

myTicks = myTicks/rowCount
Dim myNewTimeSpan As New TimeSpan(myTicks)
Days = myNewTimeSpan.Days
Hours = myNewTimeSpan.Hours
Minutes= myNewTimeSpan.Minutes
Seconds= myNewTimeSpan.Seconds
---------------------------------------

That returns a good figure, just want to check that I have the logic
correct.

Your help is much appreciated, thank you.

Steve.
 
Steve said:
Thanks for your quick response. Just to confirm, is the following what you
would recommend?
-----------------------------------------
Dim myTimeSpan As New TimeSpan
Dim myTicks As Long

For row = 0 to rowCount -1 '// Loop through the DataGrid rows and get the
Start/End Dates
myTimeSpan = EndDate.Subtract(StartDate)
myTicks += myTimeSpan.Ticks
Next

myTicks = myTicks/rowCount
Dim myNewTimeSpan As New TimeSpan(myTicks)
Days = myNewTimeSpan.Days
Hours = myNewTimeSpan.Hours
Minutes= myNewTimeSpan.Minutes
Seconds= myNewTimeSpan.Seconds

That looks fine to me, yes.
 
Steve said:
I thought that this question would be language independant, but if you think
I will get a better/quicker response from these groups I will post it again
there.

No, you were right. The only difference involved here is really whether
or not you can use the various overloaded operators.
 
Hi Jon,

In this newsgroups are less programmers active, I know that in the VB
language group are a lot of people who have fun to help with this questions.
You can say I was guessing that it was with the csharp group the same.

Cor
 
Hi Steve

The question is language independent but in the language groups are people
who love to do this just as a challenge.

In past was Fergus Cooney active in the vb.language group, he would stay up
all night to get the best solutions for things like this.

That was why I told it.

Cor
 
Cor said:
In this newsgroups are less programmers active, I know that in the VB
language group are a lot of people who have fun to help with this questions.
You can say I was guessing that it was with the csharp group the same.

It is - but I don't actually think either of those groups is
particularly suitable, as the question wasn't really about languages. I
would say that here and microsoft.public.dotnet.framework are the best
groups for the question, personally.
 
Hi Steve,

I think this is not the good approach

I would make it more like this (I did not real check it but I thinks it
reached it very far)

Module main
Public Sub main()
Dim startdates() As String = _
{"01/01/2004 12:50", "02/01/2004 13:40", "02/01/2004 14:30"}
Dim enddates() As String = _
{"02/01/2004 18:40", "02/01/2004 13:57", "02/01/2004 19:50"}
Dim span(2) As TimeSpan
Dim totalspan As TimeSpan
For i As Integer = 0 To 2
span(i) = CDate(enddates(i)).Subtract(CDate(startdates(i)))
totalspan = totalspan.Add(span(i))
Dim a As String = totalspan.ToString
Next
span.Sort(span)
MessageBox.Show("min = " & span(0).ToString & vbCrLf & _
"max = " & span(2).ToString & vbCrLf & _
"avarage = " & totalspan.FromSeconds _
(totalspan.TotalSeconds / 3).ToString())
End Sub
End Module

In this way you use the features from the Net framework.

But as I said, I think there are better solutions.

Cor
 
Cor said:
I think this is not the good approach

I would make it more like this (I did not real check it but I thinks it
reached it very far)

Module main
Public Sub main()
Dim startdates() As String = _
{"01/01/2004 12:50", "02/01/2004 13:40", "02/01/2004 14:30"}
Dim enddates() As String = _
{"02/01/2004 18:40", "02/01/2004 13:57", "02/01/2004 19:50"}
Dim span(2) As TimeSpan
Dim totalspan As TimeSpan
For i As Integer = 0 To 2
span(i) = CDate(enddates(i)).Subtract(CDate(startdates(i)))
totalspan = totalspan.Add(span(i))
Dim a As String = totalspan.ToString
Next
span.Sort(span)
MessageBox.Show("min = " & span(0).ToString & vbCrLf & _
"max = " & span(2).ToString & vbCrLf & _
"avarage = " & totalspan.FromSeconds _
(totalspan.TotalSeconds / 3).ToString())
End Sub
End Module

In this way you use the features from the Net framework.

Why is that better than the version the OP posted? It gives the min and
the max, but requires the whole list to be sorted (which is
unnecessary) in order to do so. It also does more conversions for no
reasons, as far as I can see.
 
Hi Jon,
Why is that better than the version the OP posted? It gives the min and
the max, but requires the whole list to be sorted (which is
unnecessary) in order to do so. It also does more conversions for no
reasons, as far as I can see.

Give a shorter sample in whatever language.

But complete running with the start as the OP told, than you are a man and
show that you know it better. Now I do not tell for what this sounds for me.

It are only 14 lines of code inlcluding the module name, (there is one test
line stayed in it).

If you do not show it I expect that you cannot.

Cor
 
Difference between two dates is easy, if they are DateTime objects:

TimeSpan t = dtFinish - dtStart;
Debug.WriteLine(t);

How's that grab you?

--Tim
 
Cor said:
Give a shorter sample in whatever language.

Steve's example already *was* shorter. However, shorter is not
necessarily better anyway.
But complete running with the start as the OP told, than you are a man and
show that you know it better. Now I do not tell for what this sounds for me.
It are only 14 lines of code inlcluding the module name, (there is one test
line stayed in it).

(Actually, the code you posted was 20 lines, but that's not important.)

Your code included parsing each date/time, when there was no indication
that that was actually needed. Your version also included a sort for no
reason, which made the complexity of the function the complexity of the
sort, which is likely to be O(n log n), instead of O(n) which is all
that's needed.

The one benefit your way has (which is indeed a benefit) is that
there's no need to actually take the ticks each time - just sum the
timespans and get the ticks at the end. (Using ticks is a good idea
though, rather than seconds, as the accuracy will be better. For
instance, your method would not get a good average when all the times
are in the sub-second range...)

Here is a more general version in C#:

static void CompareDifferences (IList starts, IList ends,
out TimeSpan avg, out TimeSpan min,
out TimeSpan max)
{
min = TimeSpan.MaxValue;
max = TimeSpan.MinValue;

TimeSpan sum = new TimeSpan();

for (int i=0; i < starts.Count; i++)
{
TimeSpan diff = (DateTime)ends - (DateTime)starts;
sum += diff;
if (diff < min)
min=diff;
if (diff > max)
max=diff;
}
avg = new TimeSpan (sum.Ticks/starts.Count);
}

To me, that's more readable code. It's general (taking lists of
starts/ends, returning the three timespans as out parameters - that
could be done in a different way if required, of course) and it also
performs well.

In production code I'd also do argument verification, of course.
 
Hi Jon,

That code you supplied did not show anything on my screen when I pasted it
in my C# IDE as a blank C# project. (To be true I did not because I knew
that before)

Your long part of code should gives the same as the little piece here (I
could yours not test because it is incomplete) And strange enough you use
the same amount of casting, while you said there was no need for it. (And I
could use milliseconds in the span, but I did not, because the date was
supplied in minutes, so seconds would me more than precise)

\\\
public function Avaragetime (byval startdate() as string, byval enddate() as
array) as string
dim totalspan as timespan
dim span(2) as timespan
For i As Integer = 0 To 2
span(i) = CDate(enddates(i)).Subtract(CDate(startdates(i)))
totalspan = totalspan.Add(span(i))
Next
return totalspan.FromSeconds (totalspan.TotalSeconds / 3).ToString())
end function
////
 
Cor said:
That code you supplied did not show anything on my screen when I pasted it
in my C# IDE as a blank C# project. (To be true I did not because I knew
that before)

Your long part of code should gives the same as the little piece here (I
could yours not test because it is incomplete)

The only *conversion* my version does is casting (which would be an
unboxing operation). Yours does parsing, even though the OP never
mentioned the originals being in a string form. No matter though.
And strange enough you use
the same amount of casting, while you said there was no need for it. (And I
could use milliseconds in the span, but I did not, because the date was
supplied in minutes, so seconds would me more than precise)

\\\
public function Avaragetime (byval startdate() as string, byval enddate() as
array) as string
dim totalspan as timespan
dim span(2) as timespan
For i As Integer = 0 To 2
span(i) = CDate(enddates(i)).Subtract(CDate(startdates(i)))
totalspan = totalspan.Add(span(i))
Next
return totalspan.FromSeconds (totalspan.TotalSeconds / 3).ToString())
end function
////

No, that then doesn't do what my version does - it doesn't work out the
maximum and minimum. It also still hasn't been fixed to actually use
the length of the arrays in question - it blindly assumes that there
are 3 elements.

I'm still wondering why on earth you thought Steve's earlier code "is
not the good approach"... what did you have against it, exactly? The
only real difference is that it builds up the number of ticks instead
of as a timespan.
 
Hi Jon,
I'm still wondering why on earth you thought Steve's earlier code "is
not the good approach"... what did you have against it, exactly? The
only real difference is that it builds up the number of ticks instead
of as a timespan.
Exactly and the timespan is made for it.
I do not use anymore a multiple add to multiply also.

I even forgot this, it can even be brought down till this beneath
(I also included your sentence about the fixed array, although I had the
idea, that you should know that that is not important for this sample)
\\\
public function Avaragetime (byval startdate() as string, byval enddate() as
array) as string
dim totalspan as timespan
For i As Integer = 0 To startdate.lenght - 1
totalspan =CDate(enddates(i)).Subtract(CDate(startdates(i)))
Next
return totalspan.FromSeconds (totalspan.TotalSeconds /
startdate.lenght).ToString())
end function
///
Although in a real situation I probably would use a datatable if it was a
databound operation or else an arraylist or a self build from Ilist derived
collection, that hold items, object, properties whatever in that situation:
startdate, enddate, timespan (the last to be able to sort).

And the string was given in the sample from Steve.

Cor
 
Exactly and the timespan is made for it.

Either way seems fine to me, to be honest.
I do not use anymore a multiple add to multiply also.

Not sure what you mean by this...
I even forgot this, it can even be brought down till this beneath
(I also included your sentence about the fixed array, although I had the
idea, that you should know that that is not important for this sample)
\\\
public function Avaragetime (byval startdate() as string, byval enddate() as
array) as string
dim totalspan as timespan
For i As Integer = 0 To startdate.lenght - 1
totalspan =CDate(enddates(i)).Subtract(CDate(startdates(i)))
Next
return totalspan.FromSeconds (totalspan.TotalSeconds /
startdate.lenght).ToString())
end function
///

Did you test this? It doesn't seem to make much sense to me, as you're
no longer adding anything - effectively you end up with the total span
being *exactly* the last span.

Of course, it's still less readable than my version as far as I'm
concerned. I'm worried that you seem to think that code which is
shorter is necessarily better. Here's a version of my method which does
everything including finding the min and max, to the finest accuracy
possible (with an integral number of ticks), efficiently, but in only 6
lines of code. Amazing! ... but totally unreadable:

static void CompareDifferences (IList starts, IList ends,
out TimeSpan avg, out TimeSpan min, out TimeSpan max) {
min=TimeSpan.MaxValue;max=TimeSpan.MinValue;TimeSpan s=new
TimeSpan(); for (int i=0;i<starts.Count;i++){TimeSpan
d=(DateTime)ends-(DateTime)starts;s+=d;if(d<min)min=d;
if(d>max)max=d;}avg=new TimeSpan (s.Ticks/starts.Count);}
Although in a real situation I probably would use a datatable if it was a
databound operation or else an arraylist or a self build from Ilist derived
collection, that hold items, object, properties whatever in that situation:
startdate, enddate, timespan (the last to be able to sort).

And the string was given in the sample from Steve.

Well of course Steve gave examples in text, as there's no other way of
posting to the newsgroup - but his example code showed that he'll be
getting his dates from a DataGrid, so there was no need to do any
string parsing.

The main objection I had to your original code was that it was very
inefficient *and* it changed the input data, both due to sorting the
array - when sorting the array is (IMO) a far less natural way of
finding the min and the max values than just checking on each
iteration.
 
Back
Top