Data caching question

  • Thread starter Thread starter Gary W. Smith
  • Start date Start date
G

Gary W. Smith

I had a couple questions about data caching. We have a site that gets
a huge amount of traffic to a few specific pages that have a lot of
data on them (300k hits/hour during peak, about 6-10 data calls per
page OR 500 SQL calls per second). Most of the data doesn't change
more than once ever 10 minutes. Basically they are data islands with
specific content. We have already implemented a singleton that
manages a connection pool. This has sped up things a big but the
database still gets a large number of hits. This is affecting
performance of other applications as the web site is pounding away
with these hits.

I was thinking about making another caching singleton that would cache
SQL statements, their data results (which are usually no more than 10
rows with about 3k of information total -- not including overhead),
the time when the statement was last run and the timeout. Keeping
these items in a synchronized Hashtable (using the appropriate locks),
it should be manageable.

My intent was to make a call to the CacheDB object, which would see if
the item exists in the cache, if not, request the data, lock the
Hashtable and insert it. For each subsequent request, it would either
find it or it wouldn't. Then, the CacheDB object would have a timer
running on callback that will iterate through the cached objects some
find the ones that are beyond their allotted lifetime, and in the
background, re-execute the SQL statement, lock the Hashtable, and
update the object.

My primary concerns are the memory usages of in memory tables
(DataTable objects), their impacts of doing a DataTable.Clone()
(possible locking and synchronization issues) and also running
multiple timer callbacks in ASP.Net.

The first question/concern:

We already run one timer in another singleton object (the connection
pooling object) so I don't know what the impact of a second timer
would be. The connection pooling timer is set of 300 seconds, and I
was thinking about the same for the caching object.

The second question/concern:

Keeping these small DataTables around in memory shouldn't require too
much in resources but we will be doing a lot of DataTable.Clone()'s.
I don't know if this is thread safe and what the memory impacts will
be.

Anyone have any experience with this?
 
Gary,
From a quick "first read" it seems that you may have an awful lot of custom
"Stuff" going on here. What do you need a custom connection pool for when
there is already one built in to ADO.NET? Have you tried just using
SqlCacheDependencies for your data access? How about output page caching?
All this stuff is built in to ASP.NET
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com
 
Anyone have any experience with this?- Hide quoted text -
ADO.Net would be nice... Can't go too much into the application as
I'm bound by the usual terms, but we aren't using MS SQL.

Here is the short story, though we can use connection pooling, we
still have a large number of open connections at any one time and
since all transactions are stateless, we created a singleton so we can
ride the same connection object over and over again. It reduces the
connection prep time a little, and we have seen about a 20% increase
over native pooling. The pool is also dynamic so connections that are
more than 5 minutes old get removed and if there are no available
connections, new ones get created. You know, old school connection
pooling... :)

We have thought about output caching, but the problem is the content
is designed into the page. We are trying to break them into smaller
data islands, but this won't be for a while. So for the average user,
they might have 6 to 10 islands of data. Let's say there are about
1000 variants of the data at any point in time. Currently, after we
identify which data the user gets, we retrieve it from the database.
This is where we are getting the hit at. The business unit has
accepted 10 minute delays from the time the data changes to the
presentation. So my goal is to reduce the overall database load while
the other team starts to refactor this application (probably
incorporating AJAX, or some other technology). In the mean time, I'm
approaching from reducing the data calls.

The data request load is still the biggest problem. We could, in
theory, serialize the datatables, or possibly save it into application
scope, but this also runs into some locking issue and also the issue
of updating, but this will increase the load on the web farm. So the
idea was to keep the DataTable in memory and just DataTable.Copy() (I
used clone earlier by mistake) when an ASP page calls for it.

Now I have create a custom class since the original post, which does
the simple DataTable caching, which will see if a macthing SQL
statement exists, and if it does then send it to the caller, otherwise
retrieve it. The query itself takes about 300ms, but once it's cached
I can iterate it about 7k times per second on my workstation
(including locking, etc -- and only hitting the database the first
time). It's also set to expire (and auto refresh the content). I'm
just looking for more elegant solutions and variations. I'm still
looking into the memory implications though.
 
I forgot to mention that I really wanted something that would be self
updating, so I was staying away from cache[] in the web page. We've
played around with the native cache within asp.net and we had some odd
results (partial data), slowness of pages being loaded, etc. But as I
mentioned, this is more of a stop gap measure.
 
Back
Top