throttling iocp threads in async server


Marc Sherman

Hello,

I'm designing an async socket server and I've read that the IOCP thread pool
can overwhelm the System.ThreadPool. I'm thinking about throttling the IOCP
threads by only allowing up to MAX connections to be accepted at any given
time where MAX is less than the number of threads in the System.ThreadPool.
So, for example, if the current number of accepted sockets is N (where N <
MAX), accepting a new connection increments N and closing an accepted socket
decrements N. If N == MAX, then no more new connections will be accepted
until N is decremented. While N == MAX, newly established TCP connections
simply queue up in the kernel waiting to be accepted.
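In code, the counting part of that scheme might look something like this rough sketch (ConnectionGate and its members are invented names, not from any library; the accept loop would call TryEnter before each BeginAccept and Exit when a socket closes):

```csharp
using System;
using System.Threading;

// Sketch of the N/MAX counting scheme described above: N is incremented
// when a connection is accepted, decremented when one closes, and no new
// accept is posted while N == MAX.
class ConnectionGate
{
    readonly int max;   // MAX: kept below the ThreadPool's thread count
    int n;              // N: current number of accepted sockets

    public ConnectionGate(int max) { this.max = max; }

    // Call before posting BeginAccept; true means a slot was reserved.
    public bool TryEnter()
    {
        if (Interlocked.Increment(ref n) <= max)
            return true;
        Interlocked.Decrement(ref n);   // full: leave new connections in
        return false;                   // the kernel's pending queue
    }

    // Call when an accepted socket is closed; frees a slot.
    public void Exit() { Interlocked.Decrement(ref n); }

    public int Count { get { return n; } }
}
```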

Does this sound reasonable?

Thanks for any help,
Marc
 
Hello,

I'm designing an async socket server and I've read that the IOCP thread pool
can overwhelm the System.ThreadPool. I'm thinking about throttling the IOCP
threads by only allowing up to MAX connections to be accepted at any given
time where MAX is less than the number of threads in the System.ThreadPool.
[...]

Does this sound reasonable?

No.

First of all, there is a separate IOCP thread pool. IOCP operations
aren't going to use the regular thread pool at all, never mind overwhelm
it. I don't know where you read what you read, but it's wrong.

Secondly, the implementation you're talking about is restricting the
number of connections, not the number of concurrent operations. While
it's true that doing so should also limit the number of concurrent
operations, it seems to me that if your goal is to limit the number of
concurrent operations, you should do _that_ instead of limiting the number
of connections arbitrarily.

Generally speaking, as long as you don't do much work in the same thread
where you handle a network event, the number of actual threads in use will
be relatively small. New threads will only dequeue a new IOCP
completion if a currently running thread isn't able to. The only reason
it wouldn't be able to is that it got preempted before it could get back
to dequeue a completion, generally because your own code took too long to
return from whatever processing you're doing on the data.

You should forget about trying to implement these kinds of optimizations
at the outset. As long as you minimize the amount of work you do in the
actual i/o callback, your server should be able to handle a high i/o
volume without any trouble and without breaking either thread pool (the
main one or the IOCP pool).

If you can eventually reproduce unacceptable performance issues, _then_
you can look at what's actually causing those issues and try to fix them.
But it's a waste to try to do that prior to having problems to look at,
and in fact you can easily _cause_ problems by trying to optimize
something that doesn't need optimizing.

Pete
 
Peter Duniho said:
Hello,

I'm designing an async socket server and I've read that the IOCP thread pool
can overwhelm the System.ThreadPool. I'm thinking about throttling the IOCP
threads by only allowing up to MAX connections to be accepted at any given
time where MAX is less than the number of threads in the System.ThreadPool.
[...]

Does this sound reasonable?

No.

First of all, there is a separate IOCP thread pool. IOCP operations
aren't going to use the regular thread pool at all, never mind overwhelm
it. I don't know where you read what you read, but it's wrong.

I should have mentioned that from the IOCP thread I'll be calling 3rd party
code that does use the System.ThreadPool to support its own async
operations. The article I read is
http://www.coversant.net/Coversant/Blogs/tabid/88/EntryID/8/Default.aspx.

Basically, our server will have the same architecture as "SoapBox Server,
Stage 1" as described in
http://www.coversant.net/Coversant/Blogs/tabid/88/EntryID/10/Default.aspx
with the difference being that we'll be using the System.ThreadPool, not a
custom threadpool as the author shows.

In the first article, the author mentions that their solution to
overwhelming the System.ThreadPool was to roll their own threadpool which is
something I don't want to do at this point.

Marc
 
The author of that blog article is very smart, and really knows his stuff.
I'm sure he's right! :) (Yes, that's a feeble attempt at humor...)

If you take lots and lots of operations off the IOCP TP and dump them into a
3rd party app that uses a naive async infrastructure (such as the System
ThreadPool), then your application is going to grind to a screeching
halt. The ThreadPool is pretty big these days (250 threads per processor
core, by default), but you really don't want that many threads running at
once.

If you start blocking IOCP threads, you're going to cause all sorts of
problems - you'll fill up your Windows Socket buffers, and the IOCP thread
pool will have stutter problems, etc. You really want to avoid this if you
can.

Your best bet is likely to queue things up as they come off the IOCP thread
pool. Then have a single dedicated thread for taking items from your queue
and handing them to the library. In this thread you can track a counter ("how
many items processing right now?"), and if you're at or near your limit,
don't dequeue the next item from your queue quite yet. The drawback to this
is that you'll have at least 2 context switches per operation - once onto
your "worker" thread, and another into the actual ThreadPool thread the
library uses. If this becomes a problem, you could (later, after it all
works - don't do premature optimization!) probably have your IOCP thread
check the limit counter and directly post the item to the 3rd party
library, thereby cutting out one of the context switches. This isn't a huge
penalty to pay, although minimizing it in the general case would be nice.
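A rough sketch of that dispatcher, under the stated assumptions (BoundedDispatcher is an invented name; the 3rd party library call is simulated here with ThreadPool.QueueUserWorkItem):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// IOCP callbacks call Enqueue (cheap, never blocks for long). One
// dedicated thread dequeues items and hands them off, but only while
// the number of items in flight is below the limit.
class BoundedDispatcher
{
    readonly Queue<object> queue = new Queue<object>();
    readonly object sync = new object();
    readonly int limit;
    int inFlight;
    readonly WaitCallback handler;

    public BoundedDispatcher(int limit, WaitCallback handler)
    {
        this.limit = limit;
        this.handler = handler;
        Thread t = new Thread(DispatchLoop);
        t.IsBackground = true;
        t.Start();
    }

    // Called from the i/o callback thread.
    public void Enqueue(object item)
    {
        lock (sync) { queue.Enqueue(item); Monitor.Pulse(sync); }
    }

    void DispatchLoop()
    {
        while (true)
        {
            object item;
            lock (sync)
            {
                // Wait until there is work AND a free slot.
                while (queue.Count == 0 || inFlight >= limit)
                    Monitor.Wait(sync);
                item = queue.Dequeue();
                inFlight++;
            }
            // Stand-in for handing the item to the library.
            ThreadPool.QueueUserWorkItem(delegate(object s)
            {
                try { handler(s); }
                finally { lock (sync) { inFlight--; Monitor.Pulse(sync); } }
            }, item);
        }
    }
}
```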

(Be sure to look at some of the Parallel Datastructures that Joe Duffy wrote
about in his MSDN article a few months back. Several of those were related
to multi-threaded queue management...)

Just try really hard to avoid stalling the IOCP threads on a regular basis
and/or for long periods of time, as this will cause poor general behavior.

--
Chris Mullins

 
I should have mentioned that from the IOCP thread I'll be calling 3rd party
code that does use the System.ThreadPool to support its own async
operations.

Yup. You might have mentioned that. :) Your original post wasn't what
I'd call a fair characterization of what was written in your referenced
article.

Yes, that makes a lot more sense. And you'll note that the article (and
Chris's follow-up) provides basically the same advice I offered earlier:
don't do blocking things in your async i/o callbacks (which are executing
on an IOCP thread).
[...]
In the first article, the author mentions that their solution to
overwhelming the System.ThreadPool was to roll their own threadpool which is
something I don't want to do at this point.

Well, to some extent you're going to have to do _something_ to organize
your processing. Chris has provided useful suggestions as to alternative
ways to accomplish this. I would say that given that the 3rd party
library is already using the ThreadPool, adding yet another thread
pool implementation as a wrapper around that probably doesn't make sense.
It's hard to say for sure without knowing specifics about the architecture
of the 3rd party library, but I'd say that the "queue it up" solution
Chris has suggested may be the cleanest, and possibly even the most
performant, solution you'd be able to achieve.

Pete
 
Chris,

Thanks for the design suggestion of tracking a counter from a dedicated
thread. Since the System.ThreadPool is used by more than just the 3rd party
lib, should the limit be set to some percentage of the max number of
System.ThreadPool threads? That way, the 3rd party lib I'm using won't hog
all of them. If so, does 50% sound reasonable?

A question about ThreadPool.GetMaxThreads: It returns two counts,
workerThreads and completionPortThreads. The second count,
completionPortThreads, that's not referring to the count of the IOCP thread
pool, is it?

If not, should the limit be based on the sum of workerThreads +
completionPortThreads or just workerThreads?

Thanks for your help and your blog articles.

Marc

 
After spending years trying to be fancy and calculating how many items I
should be doing in parallel, I take a simpler approach now.

I use a value that's defined in my app.config:
"SimultaniousProcessingLimit". I generally set this to a value of 10. I have
no good reason for this, and can't defend the number at all, but that's
where I start. If an installation needs more parallelism, then I bump the
number. If it's a small install, I leave it at 10.

Getting fancy (or worse, adaptive!), seems to cause more problems than it
solves. Just make the value easily configurable.

... besides, when you're debugging, it's really nice to be able to set the
value to "1" or "2". This makes everything easier, and also helps you write
edge case tests.
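Reading that value might look like this (a sketch; in a real app the raw string would come from ConfigurationManager.AppSettings["SimultaniousProcessingLimit"], and ProcessingLimit is an invented name):

```csharp
using System;

// Parse the configured limit, falling back to the default of 10 when
// the key is missing or malformed.
static class ProcessingLimit
{
    public static int Parse(string raw)
    {
        int value;
        if (raw == null || !int.TryParse(raw, out value) || value < 1)
            return 10;   // default starting point; bump per install
        return value;
    }
}
```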

--
Chris Mullins


 
Peter Duniho said:
Yup. You might have mentioned that. :) Your original post wasn't what
I'd call a fair characterization of what was written in your referenced
article.

I agree. Sorry about that.
Yes, that makes a lot more sense. And you'll note that the article (and
Chris's follow-up) provides basically the same advice I offered earlier:
don't do blocking things in your async i/o callbacks (which are executing
on an IOCP thread).

Just out of curiosity, what if I had a dedicated accept thread and only did
reads and writes asynchronously? The dedicated accept thread would limit the
number of accepted connections and it would block when that limit is reached
(the async i/o callbacks would *never* block). This would effectively
leverage the pending connections queue in the kernel. I know you said
connections != operations but I didn't quite follow. Could you elaborate on
that?
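Concretely, the idea might look like this sketch (AcceptLoop is an invented name; a semaphore stands in for the connection limit):

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// One dedicated thread blocks in AcceptTcpClient; a semaphore caps the
// accepted connections. At the cap, the accept thread blocks and new
// connections wait in the kernel's pending-connection queue.
class AcceptLoop
{
    readonly TcpListener listener;
    readonly Semaphore slots;

    public AcceptLoop(int port, int maxConnections)
    {
        listener = new TcpListener(IPAddress.Any, port);
        slots = new Semaphore(maxConnections, maxConnections);
    }

    public void Run()   // call this on the dedicated accept thread
    {
        listener.Start();
        while (true)
        {
            slots.WaitOne();   // blocks once the limit is reached
            TcpClient client = listener.AcceptTcpClient();
            // ... start async reads on client.GetStream(); the i/o
            // callbacks themselves never block ...
        }
    }

    public void OnConnectionClosed()
    {
        slots.Release();       // frees a slot; the accept thread wakes
    }
}
```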
[...]
In the first article, the author mentions that their solution to
overwhelming the System.ThreadPool was to roll their own threadpool
which is
something I don't want to do at this point.

Well, to some extent you're going to have to do _something_ to organize
your processing. Chris has provided useful suggestions as to alternative
ways to accomplish this. I would say that given that the 3rd party
library is already using the ThreadPool that adding yet another thread
pool implementation as a wrapper around that probably doesn't make sense.
It's hard to say for sure without knowing specifics about the architecture
of the 3rd party library, but I'd say that the "queue it up" solution that
Chris has suggested may be the cleanest, and possibly even the most
performant, solution you'd be able to achieve.

I think I'll be going that route.

BTW, I'm using the TcpListener and NetworkStream classes instead of the
Socket class. I'm assuming the callbacks for these classes also occur on
IOCP threads and *not* on System.ThreadPool threads. Is that correct?

Thanks for your help,
Marc
 
Sounds good to me, thanks.

Still curious what the 2nd count, completionPortThreads, returned by
ThreadPool.GetMaxThreads refers to. From what I've read I don't think it
refers to the IOCP thread pool that we've been discussing (the one that
handles async socket callbacks). But does System.ThreadPool have its own
internal IOCP pool, and is this what completionPortThreads refers to?

thanks,
Marc

 
That value is the count of threads in the IOCP thread pool.

These are the same threads that handle socket callbacks, file I/O callbacks,
Async SQL callbacks, etc.

I've always found the number to be pretty useless though - it reports "1000"
just about all the time. Now that the source code is available to poke
through, perhaps I'll go take a look...
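For anyone curious, the two counts are easy to inspect directly (the actual numbers vary by CLR version and machine, so no particular output is guaranteed):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        int worker, iocp;
        // worker = regular thread pool max; iocp = completion port max
        ThreadPool.GetMaxThreads(out worker, out iocp);
        Console.WriteLine("max worker threads: " + worker);
        Console.WriteLine("max IOCP threads:   " + iocp);
    }
}
```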

(BTW: On a different note, if you need a custom thread pool, Jeff Richter's
Power Threading library has a nice implementation of a thread pool. This
isn't something you would need to write yourself).

--
Chris Mullins


 
That value is the count of threads in the IOCP thread pool.

These are the same threads that handle socket callbacks, file I/O callbacks,
async SQL callbacks, etc.

I've always found the number to be pretty useless though - it reports "1000"
just about all the time. Now that the source code is available to poke
through, perhaps I'll go take a look...

Well, it's the "max" count, not the current count. And last I read, 1000
threads was actually the max # of IOCP threads.

So I would say that the number is valid, even if not a practical
indication of how one should be using IOCP. :)

Pete
 
I agree. Sorry about that.

No problem. The only person it hurts is the person trying to get
answers. :)
Just out of curiosity, what if I had a dedicated accept thread and only
did reads and writes asynchronously? The dedicated accept thread would
limit the number of accepted connections and it would block when that
limit is reached (the async i/o callbacks would *never* block). This
would effectively leverage the pending connections queue in the kernel.
I know you said connections != operations but I didn't quite follow.
Could you elaborate on that?
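
To make the idea concrete, here is a minimal sketch of that accept loop
(class and member names are mine, not from the thread). A Semaphore
stands in for the N/MAX counter; each closed connection must release its
slot, and while all slots are taken, new TCP connections wait in the
kernel's listen backlog:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Sketch of a dedicated accept thread that blocks at MAX connections.
class ThrottledAccepter
{
    private readonly TcpListener _listener;
    private readonly Semaphore _slots;   // MAX connection slots

    public ThrottledAccepter(int maxConnections)
    {
        _listener = new TcpListener(IPAddress.Loopback, 0);
        _slots = new Semaphore(maxConnections, maxConnections);
        _listener.Start();
    }

    public int Port
    {
        get { return ((IPEndPoint)_listener.LocalEndpoint).Port; }
    }

    // One iteration of the accept loop: take a slot, then accept.
    // Blocks here (not in the i/o callbacks) once N == MAX.
    public TcpClient AcceptNext()
    {
        _slots.WaitOne();
        return _listener.AcceptTcpClient();
    }

    // Call when an accepted connection closes: frees a slot (N decremented).
    public void ConnectionClosed()
    {
        _slots.Release();
    }
}
```

The accept thread would call AcceptNext in a loop and kick off the async
reads/writes on each returned TcpClient.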

The "connections != operations" idea has to do with the original
implementation you suggested. It also tends to apply to this recent
question too. Let's assume that for any given TCP connection, you're only
going to have one outstanding network operation at a time. IOCP actually
allows overlapped i/o, but this complicates the design of one's code and I
think in most cases the need for doing something like that is overrated.
So let's keep it simple.

If you assume that you only have one outstanding network operation at a
time, then limiting the number of connections will indeed limit the number
of operations. But it's overkill. You could have 10,000 connections (for
example) but still only have some smaller number of simultaneous
operations. And statistically, it's even likely that there's a number you
can put on this ratio. Let's say it's 50%.

So, assuming at any given moment, only 50% of your connections have
outstanding operations, then you are wasting 50% of your potential
bandwidth. You can't just double the number of connections, because at
_peak_ you _could_ have 100% utilization on your connections. So using a
connection-limited model you do need to keep the connection limit the same
as the operation limit. But most of the time this will unnecessarily
restrict your ability to service client connections.

If, on the other hand, you come up with a mechanism for queuing operations
generally, rather than the heavy-handed "let's restrict the number of
connections", then you can allow arbitrarily many connections to exist and
as long as they aren't all hammering your server, they all get maximum
throughput. Once you reach your maximum number of operations, obviously
things will be throttled at that point, but that would be true in any case.

Of course, ideally you'd configure things so that your bottleneck is the
network itself. But that's the ideal, not always practical in the real
world. (Again, this depends on factors about your server that I don't
know about, so I can't really comment specifically).

So, sure...you can limit connections in the way you're talking about.
This would be essentially the same as the first proposal you posted, just
using a different mechanism for dealing with it. And it would have the
same problem of artificially restricting your throughput based on client
utilization.
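
A hedged sketch of what operation-level throttling could look like (the
class and the Semaphore-based gating are my illustration, not code from
the thread): connections stay unlimited, and only in-flight operations
count against the cap.

```csharp
using System;
using System.Threading;

// Sketch: cap concurrent operations, not connections.
class OperationThrottle
{
    private readonly Semaphore _slots;

    public OperationThrottle(int maxConcurrentOps)
    {
        _slots = new Semaphore(maxConcurrentOps, maxConcurrentOps);
    }

    // Wrap any unit of work; connections can be unlimited because only
    // in-flight operations count against the limit.
    public void Run(Action work)
    {
        _slots.WaitOne();          // blocks only when the op limit is reached
        try { work(); }
        finally { _slots.Release(); }
    }
}

class Demo
{
    static void Main()
    {
        OperationThrottle throttle = new OperationThrottle(2);
        int completed = 0;
        Thread[] threads = new Thread[5];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
                throttle.Run(() => Interlocked.Increment(ref completed)));
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        Console.WriteLine(completed); // prints 5: all ran, at most 2 at a time
    }
}
```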
[...]
Chris has suggested may be the cleanest, and possibly even the most
performant, solution you'd be able to achieve.

I think I'll be going that route.

BTW, I'm using the TcpListener and NetworkStream classes instead of the
Socket class. I'm assuming the callbacks for these classes also occur on
IOCP threads and *not* on System.ThreadPool threads. Is that correct?

I have not double-checked that myself, but as far as I know all i/o
classes in .NET use IOCP when possible. And the TcpListener and
NetworkStream classes in particular are probably using the Socket
implementation behind the scenes anyway. So, yes...I think it's safe to
assume that IOCP is used even when you are using TcpListener and
NetworkStream.
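
For illustration, the Begin/End pattern being discussed looks roughly
like this. It is a self-contained loopback sketch (names and error
handling are simplified); both the accept and the read callbacks complete
on IOCP threads:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class AsyncAcceptDemo
{
    static TcpListener _listener;
    static ManualResetEvent _done = new ManualResetEvent(false);
    static string _received;

    // Accepts one loopback connection asynchronously and returns what it read.
    public static string Run()
    {
        _listener = new TcpListener(IPAddress.Loopback, 0);
        _listener.Start();
        // The accept callback fires on an IOCP thread, not a worker thread.
        _listener.BeginAcceptTcpClient(OnAccept, null);

        int port = ((IPEndPoint)_listener.LocalEndpoint).Port;
        using (TcpClient client = new TcpClient("127.0.0.1", port))
        {
            byte[] msg = Encoding.ASCII.GetBytes("ping");
            client.GetStream().Write(msg, 0, msg.Length);
            _done.WaitOne();
        }
        _listener.Stop();
        return _received;
    }

    static void OnAccept(IAsyncResult ar)
    {
        TcpClient client = _listener.EndAcceptTcpClient(ar);
        NetworkStream stream = client.GetStream();
        byte[] buffer = new byte[64];
        // BeginRead on a NetworkStream also completes on an IOCP thread.
        stream.BeginRead(buffer, 0, buffer.Length, delegate(IAsyncResult readAr)
        {
            int n = stream.EndRead(readAr);
            _received = Encoding.ASCII.GetString(buffer, 0, n);
            _done.Set();
        }, null);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // "ping" (single small loopback write)
    }
}
```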

Pete
 
Chris, Thanks for your help.

Marc

Chris Mullins said:
That value is the count of threads in the IOCP thread pool.

These are the same threads that handle socket callbacks, file I/O
callbacks, Async SQL callbacks, etc.

I've always found the number to be pretty useless though - it reports
"1000" just about all the time. Now that the source code is available to
poke through, perhaps I'll go take a look...

(BTW: On a different note, if you need a custom thread pool, Jeff
Richter's Power Threading library has a nice implementation of a thread
pool. This isn't something you would need to write yourself).

--
Chris Mullins


Marc Sherman said:
Sounds good to me, thanks.

Still curious what the 2nd count, completionPortThreads, returned by
ThreadPool.GetMaxThreads refers to. From what I've read I don't think it
refers to the IOCP thread pool that we've been discussing (the one that
handles async socket callbacks). But does System.ThreadPool have its own
internal IOCP and this is what completionPortThreads refers to?

thanks,
Marc

Chris Mullins said:
After spending years trying to be fancy and calculating how many items I
should be doing in parallel, I take a simpler approach now.

I use a value, that's defined in my app.config.
"SimultaniousProcessingLimit". I generally set this to a value of 10. I
have no good reason for this, and can't defend the number at all, but
that's where I start. If an installation needs more parallelism, then I
bump the number. If it's a small install, I leave it at 10.

Getting fancy (or worse, adaptive!) seems to cause more problems than
it solves. Just make the value easily configurable.

... besides, when you're debugging, it's really nice to be able to set
the value to "1" or "2". This makes everything easier, and also helps
you write edge case tests.

--
Chris Mullins


Chris,

Thanks for the design suggestion of tracking a counter from a dedicated
thread. Since the System.ThreadPool is used by more than just the 3rd
party lib, should the limit be set to some percentage of the max num of
System.ThreadPool threads? That way, the 3rd party lib I'm using won't
hog all of them. If so, does 50% sound reasonable?

A question about ThreadPool.GetMaxThreads: It returns two counts,
workerThreads and completionPortThreads. The second count,
completionPortThreads, that's not referring to the count of the IOCP
thread pool, is it?

If not, should the limit be based on the sum of workerThreads +
completionPortThreads or just workerThreads?

Thanks for your help and your blog articles.

Marc

The author of that blog article is very smart, and really knows his
stuff. I'm sure he's right! :) (Yes, that's a feeble attempt at
humor...)

If you take lots and lots of operations off the IOCP TP and dump them
into a 3rd party app that uses a naive async infrastructure (such as
the System ThreadPool), then your application is going to grind to a
screeching halt. The Threadpool is pretty big these days (250 threads
per processor core, by default), but you really don't want that many
threads running at once.

If you start blocking IOCP threads, you're going to cause all sorts of
problems - you'll fill up your Windows Socket buffers, and the IOCP
thread pool will have stutter problems, etc. You really want to avoid
this if you can.

Your best bet is likely to queue things up as they come off the IOCP
thread pool. Then have a single dedicated thread for taking items from
your queue and handing them to the library. In this thread you can
track a counter "how many items processing right now?", and if you're
at or near your limit, then don't dequeue the item from your
queue quite yet. The drawback to this is that you'll have at least 2
context switches per operation - once onto your "worker" thread, and
another into the actual Thread Pool thread the library uses. If this
becomes a problem, you could (later, after it all works - don't do
premature optimization!) probably have your IOCP thread check the
limit counter, and directly post the item to the 3rd party library,
thereby cutting out one of the context switches. This isn't a huge
penalty to pay, although minimizing this in the general case would be
nice.
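
A sketch of the queue-plus-dispatcher approach Chris describes, using
.NET 2.0-era primitives (class and member names are mine, and
ThreadPool.QueueUserWorkItem stands in for the call into the 3rd-party
library):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// IOCP callbacks enqueue work; one dedicated thread dequeues and hands
// items on, but only while in-flight items are below a configured limit.
class ThrottledDispatcher
{
    private readonly Queue<Action> _queue = new Queue<Action>();
    private readonly object _lock = new object();
    private readonly Semaphore _inFlight;   // "how many items processing right now?"
    private readonly Thread _worker;

    public ThrottledDispatcher(int simultaneousProcessingLimit)
    {
        _inFlight = new Semaphore(simultaneousProcessingLimit,
                                  simultaneousProcessingLimit);
        _worker = new Thread(Dispatch);
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Called from IOCP callbacks: cheap, never blocks the IOCP thread.
    public void Enqueue(Action item)
    {
        lock (_lock)
        {
            _queue.Enqueue(item);
            Monitor.Pulse(_lock);
        }
    }

    private void Dispatch()
    {
        while (true)
        {
            Action item;
            lock (_lock)
            {
                while (_queue.Count == 0) Monitor.Wait(_lock);
                item = _queue.Dequeue();
            }
            _inFlight.WaitOne();            // blocks only this dispatcher thread
            ThreadPool.QueueUserWorkItem(delegate
            {
                try { item(); }             // stand-in for the 3rd-party call
                finally { _inFlight.Release(); }
            });
        }
    }
}
```

The later optimization Chris mentions would amount to checking the
semaphore with WaitOne(0) in the IOCP callback itself and bypassing the
queue when a slot is free, saving one context switch.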

(Be sure to look at some of the Parallel Datastructures that Joe Duffy
wrote about in his MSDN article a few months back. Several of those
were related to multi-threaded queue management...)

Just try really hard to avoid stalling the IOCP threads on a regular
basis and/or for long periods of time, as this will cause poor general
behavior.

--
Chris Mullins

On Thu, 17 Jan 2008 13:50:44 -0800, Marc Sherman

Peter, thanks for the explanation and for your help.

Marc
