ASP requests and locking

  • Thread starter: polastine

polastine

Hi all -

I'm relatively new to ASP.NET, coming over from the PHP dark side =) So
far I'm enjoying working with all this very much, especially the
templating system, which is light years beyond anything available with
PHP or Perl.

Skip the next few paragraphs if you're not interested in the back story =)

I'm currently writing a web app that is designed to handle a very heavy
traffic load - about 3 million requests per day, possibly more. That
averages out to about 35 requests per second, and I'm planning for
roughly 85 at peak.

The app is designed to scale out (cluster) on Server 2008. My
characterization tests seem to indicate that I can handle about 40 reqs
per second on the servers we use, so I'm going to need two or three
boxes for this (obviously not 40 reqs per second every hour of every
day, that's just the potential peak load profile).

The requests are stateless, sessions are not turned on and what little
information I need about the user is stored in a cookie, so there are no
server affinity problems. The database is relatively simple, heavily
denormalized and performance is excellent. We also have a fairly
comprehensive caching system for repeated coherent reads that works a
bit like memcached. All this is working fine right now.

The new wrinkle is as follows. There's a need to have a set of
configuration values maintained in each of the webheads. Basically this
is just a large associative array, in the form Dictionary<string,
string> or whatever. The data this thing contains comes from the
database. I have the caching and everything else figured out across the
cluster, so that's not the problem.

But I'm finding the ASP.NET request model and how it uses threads a bit
confusing so I'm wondering if someone can go through the questions below
and answer them:

- ASP.NET uses one thread per HTTP request, correct?
- When using the lock(x) C# idiom, that lock is app domain-wide. So if
I'm locking to get at the value of a hashtable in the ASP cache, I'm
locking all other requests for this particular app pool until the lock
block exits. Correct?
- To follow the above, I assume I do have to lock anything that might be
accessed at the same time from two or more concurrent requests?
- When accessing properties of a class that contains only static
property accessors, do I need to lock? Assume the class is not accessing
anything that is shared internally, just other static variables.
- Assume the class is accessing a hashtable instance declared as a
private static variable, but still from a static method. Is the lock
necessary? i.e.:

public static class Foo {
    private static Hashtable _ht;
    ...
    ...
    public static int SomeProperty {
        get {
            // Would not locking here cause a contention problem?
            return (int) _ht["SomeProperty"];
        }
    }
}

It's probably clear by now that I'm trying to figure out how to best
store and access these key/value pairs, from a performance perspective.
It's also probably clear that I can't afford a lock at all. I also
looked at HttpContext.Items, however that does not help me because I'd
have to create (or clone) an instance of the config class for every
request, or keep a pool or something complicated like that.

A final question about locking and the ASP cache. If I have a cache item
which is a large string (maybe some prebuilt markup for a page that
rarely changes), and I access that string from an HTTP request to push
it out into the response stream, is the process of pulling that string
from the cache a locking situation as well? I.e., does the ASP cache
have to lock a primitive type in order to return it? And will this lock
affect concurrency at all in heavy load situations?

I realize it's a lot of questions, but even just answering one will help
me a lot and maybe set me off on the right path. I'm not having a lot of
luck with Google on this. Or if someone has solved this problem before
and is willing to share, that would be even better =)

Thanks a lot in advance!
 
polastine said:

- ASP.NET uses one thread per HTTP request, correct?

Yes. (Although code responding to the request may spin up some threads of its own.)

- When using the lock(x) C# idiom, that lock is app domain-wide. So if
I'm locking to get at the value of a hashtable in the ASP cache, I'm
locking all other requests for this particular app pool until the lock
block exits. Correct?

If you've placed a hashtable into the cache, you will need to lock on an object
(which can be the hashtable itself) before reading or modifying it. (BTW, what
framework version are you targeting?)

- To follow the above, I assume I do have to lock anything that might be
accessed at the same time from two or more concurrent requests?

Typically you would cache immutable content. There is no need to lock such
content.

- When accessing properties of a class that contains only static
property accessors, do I need to lock?

Again, it depends on whether the underlying fields may be modified. If so
then yes: the set of fields available to static properties and methods is
common across the app domain, so changes to them need to be synchronised.

Assume the class is not accessing
anything that is shared internally, just other static variables.
- Assume the class is accessing a hashtable instance declared as a
private static variable, but still from a static method. Is the lock
necessary? i.e.:

public static class Foo {
    private static Hashtable _ht;
    ...
    ...
    public static int SomeProperty {
        get {
            // Would not locking here cause a contention problem?
            return (int) _ht["SomeProperty"];
        }
    }
}

Yes, you need locking. However, you need to consider whether such an
operation is expensive enough or called often enough to be a real concern
from a concurrency POV.
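
A minimal sketch of what a locked version of that getter might look like, assuming the values can change after startup. The names _sync and _values are hypothetical, and a generic Dictionary stands in for the Hashtable. Note that lock(x) only makes a thread wait for other threads that lock on the same object x; code that locks on a different object, or takes no lock, is not blocked.

```csharp
using System.Collections.Generic;

public static class Foo
{
    // Hypothetical names for illustration; not from the original post.
    private static readonly object _sync = new object();
    private static readonly Dictionary<string, int> _values =
        new Dictionary<string, int>();

    public static int SomeProperty
    {
        get
        {
            // Only threads contending for _sync wait here.
            lock (_sync)
            {
                int value;
                // Returns 0 when the key has not been set yet (an assumption
                // made for this sketch; the original would throw instead).
                return _values.TryGetValue("SomeProperty", out value) ? value : 0;
            }
        }
        set
        {
            // Keep the locked region as small as possible: one assignment.
            lock (_sync)
            {
                _values["SomeProperty"] = value;
            }
        }
    }
}
```

The lock is held only for a single dictionary lookup or assignment, so the time any request spends waiting should be short, but measuring under realistic load is the only way to be sure.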

It's probably clear by now that I'm trying to figure out how to best
store and access these key/value pairs, from a performance perspective.
It's also probably clear that I can't afford a lock at all.

How do you know you can't afford to lock?

I also looked at HttpContext.Items, however that does not help me because I'd
have to create (or clone) an instance of the config class for every
request, or keep a pool or something complicated like that.

A final question about locking and the ASP cache. If I have a cache item
which is a large string (maybe some prebuilt markup for a page that
rarely changes), and I access that string from an HTTP request to push
it out into the response stream, is the process of pulling that string
from the cache a locking situation as well? I.e., does the ASP cache
have to lock a primitive type in order to return it? And will this lock
affect concurrency at all in heavy load situations?

Strings are immutable in .NET. 'Copying' a string is a matter of copying a
pointer to a string. How large the string is has no impact on how long that
takes.
 
Hi Anthony --

Anthony said:
Yes. (Although code responding to thread may spin up some of its own)

OK, figured as much. I'm not forking any threads myself.

If you've placed a hashtable into the cache, you will need to lock on an object
(which can be the hashtable itself) before reading or modifying it. (BTW, what
framework version are you targeting?)

2.0, currently. Why? Is there a difference/advantage? I can probably go
to 3.x if there's a good reason for it.

Typically you would cache immutable content. There is no need to lock such
content.

By immutable content do you mean a string (as in the example below)?

Again, it depends on whether the underlying fields may be modified. If so
then yes: the set of fields available to static properties and methods is
common across the app domain, so changes to them need to be synchronised.

OK, but wait. Does that mean that if the fields are not being modified
then there's no risk of a collision between two different request threads?

Yes, you need locking. However, you need to consider whether such an
operation is expensive enough or called often enough to be a real concern
from a concurrency POV.

Well, I'm not sure about it. That's the problem. In a situation where
I'm serving just a few requests per second, a sub-millisecond lock is
probably not a big deal. But what's the impact for an app pool that's
serving lots of concurrent requests?

How do you know you can't afford to lock?

I don't, that's why we're having this conversation =)

Strings are immutable in .NET. 'Copying' a string is a matter of copying a
pointer to a string. How large the string is has no impact on how long that
takes.

OK, that's clear enough.

Thanks a million for your response!
 
Hi Patrice --

In addition to Anthony's response, try:
http://msdn.microsoft.com/en-us/library/ms998549.aspx (Improving ASP.NET
Performance).

Thank you! I'll give that a try.

You could pick up some useful advice here and there. In particular, check
maxWorkerThreads if you want a higher peak value: I wonder whether it is by
chance that you are at 2*maxWorkerThreads (the default value is 20). How
many CPUs do you have?

Yes, I had to tweak the .config file a bit for this kind of load, but
everything seems to be working OK (crosses fingers).

The servers will probably be dual-core xeons with ~4GB of RAM. The DB is
a quad-core box with 16GB of RAM.
 
polastine said:
Hi Anthony --



OK, figured as much. I'm not forking any threads myself.


2.0, currently. Why? Is there a difference/advantage? I can probably go
to 3.x if there's a good reason for it.

Hashtable is old hat (.NET 1.1). Unless you've got some really good reason
not to, use collection types from System.Collections.Generic instead of
System.Collections. If you are able to go to 3.5 then you should.

By immutable content do you mean a string (as in the example below)?

The purpose of the cache is to store values that are costly to reconstruct (such
as data created from querying a DB).
The stored values are not expected to change. They may expire or be
invalidated, and are then dropped. Storing a hashtable whose contents
are to be modified over time would not be in keeping with the cache's
purpose. On top of that, you would not want it to be dropped from the cache,
so you would have to add it with the NotRemovable priority, in
which case you may as well use a static field to hold the hashtable and not
bother with the cache at all.

OK, but wait. Does that mean that if the fields are not being modified
then there's no risk of a collision between two different request threads?

Yes, nothing will break if all you are doing is reading content that never
changes.
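
One way to get that "read-only" property even when the configuration does change occasionally is to never modify a published dictionary, and instead build a fresh one and swap the reference. The sketch below assumes that pattern; ConfigStore, Get and Replace are hypothetical names, not something from the original posts. It relies on the documented behaviour that a Dictionary supports multiple concurrent readers as long as it is not modified.

```csharp
using System.Collections.Generic;

public static class ConfigStore
{
    // The dictionary behind this reference is never modified once published.
    // volatile makes readers pick up the most recently published reference.
    private static volatile Dictionary<string, string> _current =
        new Dictionary<string, string>();

    // Readers take no lock: they grab the reference once and read from it.
    public static string Get(string key)
    {
        Dictionary<string, string> snapshot = _current;
        string value;
        return snapshot.TryGetValue(key, out value) ? value : null;
    }

    // Called when the database values change: copy into a new dictionary,
    // then publish it with a single reference assignment.
    public static void Replace(IDictionary<string, string> fresh)
    {
        _current = new Dictionary<string, string>(fresh);
    }
}
```

Requests that are in flight during a Replace call simply finish with the old snapshot, which is usually acceptable for configuration data.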

Well, I'm not sure about it. That's the problem. In a situation where
I'm serving just a few requests per second, a sub-millisecond lock is
probably not a big deal. But what's the impact for an app pool that's
serving lots of concurrent requests?

Define lots? How many properties will you be reading per request, and will they
each need to be individually locked?
Are you sure the lock is only sub-millisecond? Might it not in fact be sub-microsecond?

To a large degree, what you actually need to do comes down to the detail of
your design. For example, could a bunch of properties be stored together in a
structure (a value type)? Once acquired (under lock), the values in the
structure can be used without a lock.
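
A rough sketch of that idea, with hypothetical names throughout (PageSettings, Settings, and the two fields are invented for illustration). The struct is copied out while the lock is held, and the caller then works with its private copy for the rest of the request.

```csharp
public struct PageSettings
{
    public int PageSize;
    public int TimeoutSeconds;
}

public static class Settings
{
    private static readonly object _sync = new object();
    private static PageSettings _current;

    // One lock per request: returning a struct copies all of its fields,
    // so the caller's copy cannot be changed by a later Update.
    public static PageSettings Read()
    {
        lock (_sync)
        {
            return _current;
        }
    }

    // Build the new value outside the lock; only the assignment is locked.
    public static void Update(int pageSize, int timeoutSeconds)
    {
        PageSettings next = new PageSettings();
        next.PageSize = pageSize;
        next.TimeoutSeconds = timeoutSeconds;
        lock (_sync)
        {
            _current = next;
        }
    }
}
```

A request would call Settings.Read() once near the start and then read the fields of the returned copy as often as it likes, with no further locking.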

I don't, that's why we're having this conversation =)

Unfortunately there isn't a known rule of thumb that can help; it really
does depend on many factors, such as how much code needs to run under a
lock.

There are two things you can do: design out the mutable shared values and
keep the amount of code running under a lock as small as possible. Usually
the latter can be just the assignment to the shared store that needs a lock.
I doubt you will need to do anything super clever because of contention
problems.

If you go to 3.5 you can use the generic SynchronizedKeyedCollection; if you
remain with 2.0 then I would say you need to create a synchronising wrapper
for the generic Dictionary. Don't bother with the Cache and hold this
object statically.
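
For the 2.0 route, a synchronising wrapper could look something like the sketch below. This is only one possible shape: SynchronizedDictionary and AppConfig are hypothetical names, and the wrapper exposes just the few operations a config store is likely to need rather than the full IDictionary interface.

```csharp
using System.Collections.Generic;

public sealed class SynchronizedDictionary<TKey, TValue>
{
    private readonly object _sync = new object();
    private readonly Dictionary<TKey, TValue> _inner =
        new Dictionary<TKey, TValue>();

    // Every operation takes the same private lock, and holds it only
    // for the duration of a single dictionary call.
    public bool TryGetValue(TKey key, out TValue value)
    {
        lock (_sync) { return _inner.TryGetValue(key, out value); }
    }

    public void Set(TKey key, TValue value)
    {
        lock (_sync) { _inner[key] = value; }
    }

    public bool Remove(TKey key)
    {
        lock (_sync) { return _inner.Remove(key); }
    }

    public int Count
    {
        get { lock (_sync) { return _inner.Count; } }
    }
}

// Held statically, as suggested above, instead of being put in the Cache.
public static class AppConfig
{
    public static readonly SynchronizedDictionary<string, string> Values =
        new SynchronizedDictionary<string, string>();
}
```

Locking on a private object rather than on the dictionary itself means no outside code can accidentally take the same lock.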

OTOH, if what you are really doing is holding HTML fragments, then perhaps you
should be using the cache directly (not storing a collection in the cache
but storing strings containing HTML). Learn how to set up expiration and/or
dependencies so that stale or defunct entries are dropped.
 