Time Critical Process in .NET

  • Thread starter: Charles Law

Charles Law

Hi guys

I have a time critical process, running on a worker thread. By "time
critical", I mean that certain parts of the process must be completed in a
specific time frame. The time when the process starts is not especially
important, but it must be complete within a small number of seconds.

The operations I am performing do not take a long time (hundreds of
milliseconds), but as each part of the process is complete, my worker thread
raises an event that is handled by the UI thread, to update a rich text
control with details of the completed operation. I am using a rich text box
so that when a failure occurs I can colour the message red.

The problem I have is that sometimes, for no apparent reason, a step in my
process takes an inordinate amount of time, e.g. 2.5 seconds instead of
perhaps 300 ms. When this happens, the complete process overruns my time
frame, and it has to be repeated. When I repeat the process there is every
chance that it completes in a realistic time, and all is well. If I stop
outputting to the screen, I do not get the problem.

When updating the screen on the UI thread, I use BeginInvoke, to marshal the
operation to the correct thread, and so as not to hold up the worker thread,
but this does not seem to help.

I realise that Windows (XP in this case) is not the ideal OS for this type
of application, but does anyone have any ideas about how I could make my
application more deterministic? I am not certain what is going on in these
2.5 seconds, so it might be useful if I could find out, but I am not sure
how I would do that.

TIA

Charles
 
Charles,

You get my almost standard answer (but with a little bit more, so don't
stop reading straight away).

When one thread depends on another thread, you cannot use multithreading
(or you would have to use optimistic multiprocessing, but then we get into a
completely different area).

However, if you are able to move the dependent processing to the worker
thread, then you satisfy this condition again.

Cor
 
Hi Cor

I thought that I was removing the dependency by using BeginInvoke.

I need the critical process to be performed on a separate thread so that the
UI remains responsive, and of course I cannot update the screen on the
worker thread for the usual reason.

I was wondering if there was a way of making essential parts of my process
not interruptible, so that I know that they will always execute in one go,
without being switched out of context. Also, is there a way of stopping the
GC from doing its thing at key times, as I fear this may also be causing
arbitrary delays?

Charles
 
Cor Ligthert said:
> Charles,
>
> You get my almost standard answer (but with a little bit more, so don't
> stop reading straight away).
>
> When one thread depends on another thread, you cannot use multithreading
> (or you would have to use optimistic multiprocessing, but then we get into
> a completely different area).

No!!! Doing lengthy calculations in a background thread and displaying the
results in a GUI thread is an absolutely common multithreading situation.
Almost any serious multithreading task I can think of involves some kind of
inter-thread communication/synchronization.

To the OP: You'll never be able to guarantee an upper limit of your
execution time, but that's mainly due to HD swapping, buggy drivers or
non-cooperative realtime processes. If none of that happens, your best bet
is to find out *what* happens. (But you already knew that).

It's hard to give you any better advice without more details about your
app, but I'd try the following:
- Print out the current time (using some high-res-timer) at some strategic
points to find out exactly *what* takes those 2.5 sec. Is it always the same
step in your background-thread operation? Is it the communication between
the threads?
- Monitor the .NET performance counters. Is there maybe some correlation to
GC collection or some other runtime event?
- Which GC do you use?
- 2.5 s is quite long, maybe you can interrupt your process in a debugger
while it happens?
- Can you remove/simplify parts of your application (like using some fake
calculation instead of the real one, or using a simple text box), to see if
the problem still happens?
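
The first suggestion above (timestamps at strategic points) can be sketched roughly as follows. This is only an illustration in C#; the StepTimer class and its method names are made up for this example, and it assumes System.Diagnostics.Stopwatch (.NET 2.0 or later) is available. It keeps the samples in memory instead of writing them to the screen, so the measurement itself does not involve the UI.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: records how long each named step takes without
// touching the UI. Intended to be used from the worker thread only
// (it is not thread-safe).
public sealed class StepTimer
{
    private readonly List<KeyValuePair<string, double>> _samples =
        new List<KeyValuePair<string, double>>();

    // Runs one step and stores its elapsed time in milliseconds.
    public double Measure(string stepName, Action step)
    {
        Stopwatch watch = Stopwatch.StartNew();
        step();
        watch.Stop();
        double ms = watch.Elapsed.TotalMilliseconds;
        _samples.Add(new KeyValuePair<string, double>(stepName, ms));
        return ms;
    }

    // Returns the steps that took longer than the given limit,
    // in the order in which they ran.
    public List<KeyValuePair<string, double>> SlowerThan(double limitMs)
    {
        return _samples.FindAll(s => s.Value > limitMs);
    }
}
```

After a run, something like timer.SlowerThan(1000) would list only the steps that overran, which can then be written to a file in one go.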

Hope one of these helps...

Niki
 
Hi Charles,

Upping the priority of your worker thread to Time Critical may be
appropriate for what you want to accomplish, as it would theoretically stop
other threads from interrupting its progress. However, I suspect this would
leave you with an unresponsive GUI (which you want to avoid) as well as a
potentially unstable process.

Does this behaviour occur whilst running the process in its fully compiled
Release Mode, i.e. compiled without Debug symbols and run directly, not via
VS.NET? I've noticed that applications run within VS (even in Release Mode)
often hang for a few seconds if an exception is thrown, even if this
exception is handled by the CLR or your app. Bear in mind that some
routines, such as IsDate or IsNumeric, throw exceptions as a matter of
course: both try to cast a variable to their respective data types and
return false if an exception is caught, albeit internally.
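
As a rough illustration of avoiding such internally thrown exceptions, here is a minimal C# sketch that uses double.TryParse (available from .NET 2.0), which reports failure through its return value instead of throwing. The SafeParsing class and TryReadNumber method are hypothetical names, and whether parsing is involved in the original application at all is an assumption.

```csharp
using System;
using System.Globalization;

public static class SafeParsing
{
    // Returns true and sets 'value' when the text is a valid number;
    // returns false, without throwing, when it is not.
    public static bool TryReadNumber(string text, out double value)
    {
        return double.TryParse(text, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out value);
    }
}
```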

Cheers,
Alex Clark
 
Hi Alex

Thanks for the response. I have set the priority to Above Normal, but, as
you suggest, don't wish to raise it any further.
> Does this behaviour occur whilst running the process in its fully compiled
> Release Mode, ie compiled without Debug symbols and run directly, not via
> VS.NET?

Yes, it does. I am running it on a laptop (Celeron processor), and I was
also wondering if the fact that it has an LCD screen means that screen
updates take longer. That said, I can also run the whole thing in debug on
another laptop (3.0 GHz P4) and not get these problems.

Charles
 
Hi Niki

Thanks for the response. I currently display the absolute and elapsed times
using a high res timer, which is how I can see the difference between the
expected and actual timings. I guess I could add more time displays, but it
would generate a lot of output, and whilst it might tell me when and where
in my code the excess time is taken, I am not sure it will necessarily tell
me why. I don't think it is my code (they all say that), because it is so
random.
> - Monitor the .NET performance counters. Is there maybe some correlation
> to GC collection or some other runtime event?

How can I tell when garbage collection occurs?
> - Which GC do you use?

How many are there? I thought there was just the one, built-in to the
framework.

Charles
 
Hi,

As you imagine, there may be "other things" affecting your timing. Notebook
computer power saving modes are notorious for causing unpredictable
behavior(s) of this sort. Make sure that you have configured those to 'high
power" or whatever setting might be equivalent -- and see if the problem
persists.

I have a Toshiba notebook, configured to "never hibernate," and to stay in
full-power mode all the time, except when on battery However... It does
enter power saving mode, which screws things up. If I were to guess, I'd
say that you might be seeing a similar problem.

Is there any possibility that there is some other application or service on
your test system that may be causing some sort substantial delay?

Dick

--
Richard Grier (Microsoft Visual Basic MVP)

See www.hardandsoftware.net for contact information.

Author of Visual Basic Programmer's Guide to Serial Communications, 4th
Edition ISBN 1-890422-28-2 (391 pages) published July 2004. See
www.mabry.com/vbpgser4 to order.
 
Charles,
This sounds like the ideal use of multiple threads!

I'm not sure what Cor was smoking when he responded :-|

However, due to the nature of .NET (non-deterministic garbage collection) and
Windows preemptive scheduling, "real-time" critical processes are hard to
achieve.

Can you use CLR profiler or custom performance counters to confirm that a
garbage collection is not occurring during this "critical" process? Can you
redesign the critical process to minimize garbage collections if this is the
problem?

Are you certain that windows did not suspend your program in favor of
allowing another program to run?

Have you tried using Thread.Priority and/or Process.PriorityClass &
Process.PriorityBoostEnabled so Windows will favor your thread/process?

WARNING: I would use Process.PriorityClass & Process.PriorityBoostEnabled
with extreme caution!
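
For reference, the properties mentioned above could be used along these lines. This is a hedged C# sketch, not a recommendation of particular values; the PrioritySketch class and its method names are invented for the example. Raising the process priority class may fail or be ignored depending on the platform and the account's permissions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PrioritySketch
{
    // Starts 'work' on a dedicated thread with a modestly raised priority.
    public static Thread StartWorker(ThreadStart work)
    {
        Thread worker = new Thread(work);
        worker.IsBackground = true;                    // do not keep the process alive
        worker.Priority = ThreadPriority.AboveNormal;  // one step above the default
        worker.Start();
        return worker;
    }

    // Raises the priority class of the whole process; use sparingly.
    public static void RaiseProcessPriority()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            current.PriorityClass = ProcessPriorityClass.AboveNormal;
            // Controls the temporary boost the OS gives the process
            // when its main window has the focus.
            current.PriorityBoostEnabled = true;
        }
    }
}
```

Note that every thread in the process, including the UI thread, is affected by the process priority class, which is one reason for the caution above.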

Info on the CLR Profiler:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenethowto13.asp

http://msdn.microsoft.com/library/d...y/en-us/dndotnet/html/highperfmanagedapps.asp

Hope this helps
Jay
 
Jay

Something like this.
http://reports-archive.adm.cs.cmu.edu/anon/2005/CMU-CS-05-118.pdf

I just took it from Google; I saw that it covers a little of what I mean.
Although it is about a database, the situation is always the same when the
main thread *needs* its information earlier than the worker thread can
deliver it.

Whatever the reason for that may be.

This is a multiprocessor approach; however, in my opinion the multithreading
approach is no different.

Cor
 
Hi Dick

I had a look at the system tray, and disabled Norton Antivirus Auto-protect,
and initially thought that I had sorted the problem, but on a long run I saw
the problem again.

You make a good point, though, about hibernation and power saving, since,
eventually, this laptop will be left unattended for half an hour whilst it
does its thing, and it would muck everything up if it went to sleep in the
middle of the show.

The laptop in question is an out-of-the-box Dell, with nothing much
installed apart from the framework and my application. I will have a closer
look though, just in case.

Thanks.

Charles
 
1) Run the application in release mode. Applications with debug symbol
baggage tend to run slower.
2) Boost the priority of the thread.
3) I'm not sure if this is redundant with #2 but you can try increasing the
application's priority. Via Task Manager, select the process and select "Set
Priority." I'm sure there's a way to do this via the framework or at least a
Win32 API as well.
4) Make sure the slowdown isn't caused by exceptions (even if handled in
your worker thread). Exceptions will halt the thread while the framework
processes them.
5) Similarly to #4... be aware that some methods may throw and handle their
own exceptions... all transparent to the caller. This will still slow
everything down even though you never get the exception. You might be able to
find the offending method by going to Debug|Exceptions and setting ALL
exceptions to 'Break into Debugger' regardless of whether the exception is
handled.
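
Outside the debugger, points 4 and 5 can also be checked at run time by counting first-chance exceptions. The sketch below is in C# and relies on the AppDomain.FirstChanceException event, which only exists in .NET Framework 4.0 and later, so it would not have been available on the runtime versions current when this thread was written. FirstChanceCounter is a hypothetical name.

```csharp
using System;
using System.Threading;

public static class FirstChanceCounter
{
    private static int _count;

    // Number of exceptions thrown so far, including ones that were caught.
    public static int Count
    {
        get { return Volatile.Read(ref _count); }
    }

    // Call once at startup.
    public static void Install()
    {
        AppDomain.CurrentDomain.FirstChanceException += (sender, e) =>
        {
            // Keep this handler trivial: it runs before any catch block.
            Interlocked.Increment(ref _count);
        };
    }
}
```

Reading Count before and after a timed step shows whether any exception, handled or not, was thrown while the step ran.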

In the end, after all is said and done, it might be due to garbage
collection. Garbage collection will pause your app momentarily. There are
tricks to minimize this and effective object management and pooling helps
(are you instantiating and releasing a lot of non-value type objects?).
 
Cor,

That paper describes an effort to parallelize one specific database
transaction using thread level speculation (TLS) techniques. TLS is a
mechanism for getting the most out of dual-core processors. TLS would
most likely be implemented by a compiler or JIT engine. The paper has
little useful information for the OP. However, it is interesting.

Brian
 
Hi,

Screen savers can be a problem, too. Some notebooks have a "built-in" one
that acts like Windows -- a little.

--
Richard Grier (Microsoft Visual Basic MVP)

 
Charles Law said:
> Hi Niki
>
> Thanks for the response. I currently display the absolute and elapsed
> times using a high res timer, which is how I can see the difference
> between the expected and actual timings. I guess I could add more time
> displays, but it would generate a lot of output, and whilst it might tell
> me when and where in my code the excess time is taken, I am not sure it
> will necessarily tell me why. I don't think it is my code (they all say
> that), because it is so random.

Is it really random? How can you know without knowing where it happens? I
thought all you knew was that it doesn't happen every time?

Did you analyze the time deltas? How are they distributed? Is there a
normal distribution around 300 ms plus some peaks at 2.5 s? If so, did you
check where those peaks are? (I always dump output like that to a text file
and use Excel to analyze it later)
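
The same kind of analysis can be done in code instead of a spreadsheet. The following C# sketch flags samples that are far above the median; the DeltaAnalysis class, its method names and the choice of "a multiple of the median" as the outlier rule are all assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DeltaAnalysis
{
    // Median of the samples; throws if there are none.
    public static double Median(IList<double> values)
    {
        if (values.Count == 0) throw new ArgumentException("No samples.", "values");
        double[] sorted = new double[values.Count];
        values.CopyTo(sorted, 0);
        Array.Sort(sorted);
        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Indexes of the samples that exceed 'factor' times the median.
    public static List<int> OutlierIndexes(IList<double> values, double factor)
    {
        double limit = Median(values) * factor;
        List<int> result = new List<int>();
        for (int i = 0; i < values.Count; i++)
        {
            if (values[i] > limit) result.Add(i);
        }
        return result;
    }
}
```

The indexes returned can be matched against the task log to see whether the peaks cluster around particular tasks or particular moments in the run.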
> How can I tell when garbage collection occurs?

I have a German windows version, so I can't tell you the exact name of the
performance counter, should be something like "Number of GCs" in the ".NET
Memory" performance counters section. It increases by one with each GC
collection. Use perfmon to record it while you test your app.
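
On an English system the category is ".NET CLR Memory", with counters such as "# Gen 0 Collections", "# Gen 1 Collections", "# Gen 2 Collections" and "% Time in GC". The same information can be read from code with GC.CollectionCount (.NET 2.0 and later). The C# sketch below uses hypothetical names (GcWatch, GcDelta); it relies on the fact that a collection of a higher generation also counts as a generation 0 collection, so the generation 0 count serves as "any collection".

```csharp
using System;

// Number of collections observed while a step ran.
public struct GcDelta
{
    public int Any;   // collections of any generation
    public int Full;  // collections of the highest generation
}

public static class GcWatch
{
    // Runs 'step' and reports how many collections happened meanwhile.
    public static GcDelta CollectionsDuring(Action step)
    {
        int anyBefore = GC.CollectionCount(0);
        int fullBefore = GC.CollectionCount(GC.MaxGeneration);

        step();

        GcDelta delta;
        delta.Any = GC.CollectionCount(0) - anyBefore;
        delta.Full = GC.CollectionCount(GC.MaxGeneration) - fullBefore;
        return delta;
    }
}
```

If the slow steps consistently show a non-zero Full count and the fast ones do not, that would point towards the collector; if the counts are zero on slow steps, the delay is more likely elsewhere.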
> How many are there? I thought there was just the one, built-in to the
> framework.

MSDN Quote: "The CLR has two different GCs: Workstation (mscorwks.dll) and
Server (mscorsvr.dll). When running in Workstation mode, latency is more of
a concern than space or efficiency. A server with multiple processors and
clients connected over a network can afford some latency, but throughput is
now a top priority. Rather than shoehorn both of these scenarios into a
single GC scheme, Microsoft has included two garbage collectors that are
tailored to each situation."
[http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetperftechs.asp]

You'll have to host the runtime to choose between the two, but I think the
workstation version is the better choice for you anyway.

Niki
 
It's random in the sense that it doesn't always happen during the same
operation, and sometimes it doesn't happen at all.

The worker thread, repeatedly, does things like this:

DoTask1
DoTask2
DoTask7
DoTask2

All these tasks have to be performed within a couple of seconds. Nominally,
task1 takes 60 ms, task 2 takes 10 ms, and task 7 takes 550 ms, say.

The expectation, therefore, is that the whole process is over in less than 1
second. Sometimes, though, task 1 takes 2.5 seconds, for no reason I can
discern. Therefore, the entire sequence takes longer than the allowed time,
and the process fails. That said, it is not always task 1 that takes the
excess time; it could be task 2, or task 7, or task 43.

Each of these tasks will send and receive data on a serial port. I have
timed the serial comms carefully, and it consistently takes no more than 25
ms to send and receive data. Thus, I conclude that the time is spent
elsewhere.

Before each task is started, an event is raised to inform the UI that the
task is starting. The UI updates a rich text control using BeginInvoke. I am
wondering if these requests to update the UI pile up, for example, and
eventually get flushed, holding up the worker thread. The screen appears to
update steadily, but by definition the screen updates will not be
synchronised to the update requests, so some of them might result from the
clearing of a backlog.
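
One way to test the backlog theory is to stop issuing one BeginInvoke per event and let the UI thread pull messages in batches instead. The C# sketch below shows only the buffer; StatusBuffer is a hypothetical class, ConcurrentQueue requires .NET Framework 4.0 or later, and whether batching removes the delay in this application is an open question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

// Hypothetical buffer between the worker thread and the UI thread.
public sealed class StatusBuffer
{
    private readonly ConcurrentQueue<string> _pending =
        new ConcurrentQueue<string>();

    // Called by the worker: cheap, and never waits for the UI.
    public void Post(string message)
    {
        _pending.Enqueue(message);
    }

    // Called on the UI thread: returns everything queued so far as one
    // block of text, or an empty string if nothing is waiting.
    public string Drain()
    {
        StringBuilder batch = new StringBuilder();
        string message;
        while (_pending.TryDequeue(out message))
        {
            batch.AppendLine(message);
        }
        return batch.ToString();
    }
}
```

In a Windows Forms application, the Tick handler of a System.Windows.Forms.Timer runs on the UI thread, so it could call Drain() a few times per second and append the result to the rich text box in a single call. Colouring failures red would need the messages to carry a status flag, which is left out here.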

When the UI displays the task being performed, I display the absolute time
and the delta based on a high res counter.

I will have a look at the performance counter you mention.

Charles


 
Indeed. In this case, though, the screen saver has been set to none, and I
have observed the entire process and nothing obvious interrupts it.

Charles
 
> 1) Run the application in release mode. Applications with debug symbol
> baggage tend to run slower.

It is currently running that way.
> 2) Boost the priority of the thread.

Currently set to AboveNormal.
> 3) I'm not sure if this is redundant with #2 but you can try increasing
> the application's priority. Via Task Manager, select the process and
> select "Set Priority." I'm sure there's a way to do this via the framework
> or at least a Win32 API as well.

Not tried, but I am inclined to agree with your assessment.
> 4) Make sure the slowdown isn't caused by exceptions (even if handled in
> your worker thread). Exceptions will halt the thread as the framework
> processes it.

Each task in the process is in a Try .. Catch block, and any exception would
appear on screen. In this case I do not get any exception reported.
> 5) Similarly to #4... be aware that some methods may throw and handle
> their own exceptions... all transparent to the caller. This will still
> slow everything down even though you never get the exception. You might be
> able to find the offending method by going to Debug|Exceptions and setting
> ALL exceptions to 'Break into Debugger' regardless if the exception is
> handled.

The data in these tasks are pretty constant, so I would expect the
exceptions to be constant too, if there are any. The extra time does not
always appear in the same place (or at all), but the data do not change.
> In the end, after all is said and done, it might be due to garbage
> collection. Garbage collection will pause your app momentarily. There are
> tricks to minimize this and effective object management and pooling helps
> (are you instantiating and releasing a lot of non-value type objects?).

This is certainly one of my concerns, but I am unsure how I can restrict
garbage collection to non-time sensitive areas of my code. And yes, I am
creating a lot of non-value types; mostly associated with the raising of
events.
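
Later versions of the runtime do offer a way to ask for this. The C# sketch below uses GC.TryStartNoGCRegion and GC.EndNoGCRegion, which exist only in .NET Framework 4.6 and later (and .NET Core), so they were not an option on the runtime versions discussed in this thread. CriticalSection, RunWithoutGc and the idea of an "allocation budget" parameter are names chosen for this example; the runtime leaves the region on its own if the step allocates more than the budget.

```csharp
using System;
using System.Runtime;

public static class CriticalSection
{
    // Runs 'step' while asking the runtime not to collect. Returns true
    // if the no-GC region was still in effect when the step finished.
    public static bool RunWithoutGc(Action step, long allocationBudgetBytes)
    {
        bool started = GC.TryStartNoGCRegion(allocationBudgetBytes);
        try
        {
            step();
            // The region ends early if the step allocated more than the
            // budget, so check before claiming success.
            return started &&
                   GCSettings.LatencyMode == GCLatencyMode.NoGCRegion;
        }
        finally
        {
            // Only end the region if it is still active; otherwise
            // EndNoGCRegion would throw.
            if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            {
                GC.EndNoGCRegion();
            }
        }
    }
}
```

A return value of false means a collection may have run during the step, which is itself useful diagnostic information.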

Charles
 
Hi Jay

A lot of good information, as usual. I shall mull it over and try some of
this out.
> Can you use CLR profiler or custom performance counters to identify that a
> Garbage Collection is not occurring during this "critical" process? Can
> you redesign the critical process to minimize garbage collections if this
> is the problem?

Is it possible to suspend garbage collection for critical sections?
> Are you certain that windows did not suspend your program in favor of
> allowing another program to run?

There is nothing obviously getting control; very little else is running. The
machine is not networked, and apart from Norton Antivirus, which is disabled,
there is nothing much else.
> Have you tried using Thread.Priority and/or Process.PriorityClass &
> Process.PriorityBoostEnabled so Windows will favor your thread/process?

I haven't, but I will take a look.

Cheers.

Charles
 
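1) Force a garbage collection when it won't interfere with your time
sensitive loop... using GC.Collect().
2) Use structures (value types) where practical. They are usually stored on
the stack or inline in their container rather than as separate heap objects,
so they add no work for the garbage collector unless they are boxed.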
3) Reuse objects where possible rather than reinstantiating new ones.
Perhaps using some sort of object pool or something.
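
Points 1 and 3 might look something like the following in C#. This is a minimal sketch under stated assumptions: SimplePool and GcHousekeeping are hypothetical names, the pool is not thread-safe, and it does not reset objects when they are returned.

```csharp
using System;
using System.Collections.Generic;

// Minimal single-threaded object pool (not thread-safe).
public sealed class SimplePool<T> where T : class, new()
{
    private readonly Stack<T> _free = new Stack<T>();

    // Hands out a previously returned object if one is available,
    // otherwise creates a new one.
    public T Rent()
    {
        return _free.Count > 0 ? _free.Pop() : new T();
    }

    // Makes the object available for reuse. The caller is responsible
    // for clearing any state it holds.
    public void Return(T item)
    {
        _free.Push(item);
    }
}

public static class GcHousekeeping
{
    // Forces a full collection. Call between runs, never inside the
    // timed sequence.
    public static void CollectNow()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }
}
```

Forcing a collection only moves the pause to a convenient moment; it does not prevent a later collection if the timed sequence itself allocates heavily, which is why reuse and value types are suggested alongside it.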

Hopefully someone here might have some more robust "hardcore" ways to help.
The following article might help too...
- http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp
- http://msdn.microsoft.com/msdnmag/issues/1200/GCI2/default.aspx
 