Is 'connection pooling' obsolete given MARs?

  • Thread starter Thread starter Lecture Snoddddgrass
  • Start date Start date
L

Lecture Snoddddgrass

Hi,

I was just reading about the new MARS (multiple active results set)
technology that will be included in the forthcoming ADO.NET 2.0. If I'm
understanding it correctly, with the new ADO.NET, you'll be able to execute
multiple queries simultaneously over the *same connection.* That's very
cool. I'm wondering if this means that there will no longer be need for a
connection pool within a given process since in theory, that process will
only have one connection object (per database) and every thread that needs
to talk to the database will just keep executing against that same physical
connection. Is this correct?

I leave you with a short poem....

Lecture
 
Connection pooling in the same process (as in client/server) is not nearly
as interesting (and useful) as in the middle tier or in ASP architectures.
Yes, MARS is an important step forward but I suspect that the pool will
still play a role. Client should still close connections when they aren't
being used to (eventually) free _server-side_ connections for others to
use. Reopening these connections before they are purged by the pooler will
improve performance.
In the ASP case, MARS is not as interesting as there are not as many ASP
applications that run multiple threads or open multiple connections.
However, the connection pool in ASP plays an important role.

hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
 
Q> Is 'connection pooling' obsolete given MARs?
A> Not at all.

There are many good things about Mars and it is great that we are finally
adding this functionality, it is going to enable scenarios that where hard
to achieve before and if you know what you are doing can have performance
benefits.

However:
Mars is not thread safe, you cannot create one connection and execute MARS
commands on separate threads.
Mars multiplexes the connection result set buffer, eventually it is going to
be more expensive to multiplex than to open a new connection.
Mars executes synchronously on the server.
Mars is not enabled for all scenarios:
SqlClient against Yukon ->Mars enabled
SqlClient against Sql 2000 or Sql 7 -> no MARS support.
Oledb using MDAC 9 against Yukon -> Mars enabled
Oledb against Sql 2000 or Sql 7 -> Mars support is _faked_ , we will
open a new connection under the covers.


Likely MARS scenarios:
Open a datareader and read n rows, use value in n+1 row to query the
database for more information. This scenarios is particularily good when the
initial datareader is reading data that has been locked by the current
connection.

Combine MARs and Async for GUI development.
 
Thanks for the great information. I have some follow-up questions for
you:
Mars is not thread safe, you cannot create one connection and execute MARS
commands on separate threads.

So let's say that I have 50 SqlCommand objects in an array that need to
be executed. Would I be able to loop through that array calling
BeginExecuteNonQuery() on each one even if each SqlCommand were associated
with the same connection?
Mars multiplexes the connection result set buffer, eventually it is going to
be more expensive to multiplex than to open a new connection.

What does "multiplex" mean?
Mars executes synchronously on the server.

Ok, I think that's the deal-killer right there. So let's say that I use
MARS to execute 50 queries on the same connection. The first query takes 10
seconds to run. The other 49 queries could be performed in .10 seconds.
You're saying that the 49 quickie queries would have to wait until the first
one finishes? Just want to make sure I understand you.
Likely MARS scenarios:

Well, my scenario is a bit different than what you've described. I have
a service that gets hit with hundreds of requests per second. Each request
comes in on a thread-pool thread. Some of these requests require a SELECT
query against SQL Server. As it stands, these SELECT queries can't be done
in a true async fashion. They block. Given that there are only 25 threads in
the thread pool, if I'm working on 25 requests that each need to access SQL
Server, and each of these threads is blocked waiting for SQL Server to
respond, my service is just sitting idly blocking unable to respond to other
requests that are coming in. A more efficient way to handle this would be if
when a request came in I could call some sort of BeginFill() [to fill a
dataset] method which would immediately return and then would call a
callback method when the query is finished. That way, my request handler
thread wouldn't block. It would be returned to the pool to do other tasks
and then when the query results were ready, I could deal with the results.
What's really killing the scalability of my service right now is ADO.NET.
Will ADO.NET 2.0 be able to help me with this problem? If so, I'll be very
happy!

Thanks!
 
Comments Inline, thanks for the feedback.

--
Angel Saenz-Badillos [MS] Managed Providers
This posting is provided "AS IS", with no warranties, and confers no
rights.Please do not send email directly to this alias.
This alias is for newsgroup purposes only.


Lecture Snoddddgrass said:
Thanks for the great information. I have some follow-up questions for
you:


So let's say that I have 50 SqlCommand objects in an array that need to
be executed. Would I be able to loop through that array calling
BeginExecuteNonQuery() on each one even if each SqlCommand were associated
with the same connection?

Yes, you would be able to do this because BeginExecuteNonQuery is a true
async call. It does not start a new thread in your process so all fifty
commands would be running in the same thread. You still have to consume the
results on the same thread though, so you can waitone /waitall and you can
poll, but you cannot use callbacks directly. You will run into problems if
you try to use the Callback option since callbacks happen in a different
thread. There are probably going to be workarrounds for this like calling
back to the main thread from the callback.

going

What does "multiplex" mean?

Each connection has a tds buffer associated with it, think of it as a
bi-directional queue where both the server and the client push data to the
other. This queue can only be used in one direction at a time which is why
up to now we have not been able to have multiple resultsets. With Mars we
have modified this buffer to work as if it was more than one buffer, we have
multiplexed it. This is not an expensive operation but eventually it becomes
more expensive to make the one buffer work as n buffers than to open a new
connection.
Ok, I think that's the deal-killer right there. So let's say that I use
MARS to execute 50 queries on the same connection. The first query takes 10
seconds to run. The other 49 queries could be performed in .10 seconds.
You're saying that the 49 quickie queries would have to wait until the first
one finishes? Just want to make sure I understand you.

This could be the worst case scenario yes, all MARS sessions are sharing the
same process space.
Likely MARS scenarios:

Well, my scenario is a bit different than what you've described. I have
a service that gets hit with hundreds of requests per second. Each request
comes in on a thread-pool thread. Some of these requests require a SELECT
query against SQL Server. As it stands, these SELECT queries can't be done
in a true async fashion. They block. Given that there are only 25 threads in
the thread pool, if I'm working on 25 requests that each need to access SQL
Server, and each of these threads is blocked waiting for SQL Server to
respond, my service is just sitting idly blocking unable to respond to other
requests that are coming in. A more efficient way to handle this would be if
when a request came in I could call some sort of BeginFill() [to fill a
dataset] method which would immediately return and then would call a
callback method when the query is finished. That way, my request handler
thread wouldn't block. It would be returned to the pool to do other tasks
and then when the query results were ready, I could deal with the results.
What's really killing the scalability of my service right now is ADO.NET.
Will ADO.NET 2.0 be able to help me with this problem? If so, I'll be very
happy!

Thanks!

That's a tough scenario I will have to take a longer look at it. There
should be a way to get async to work for you here, but it is probably not
going to be something off the shelf.
 
Your comments on connection thread safety kind of bum me out, but at
least now I understand why it has to be that way. Some of the books/articles
out there are suggesting that using the async callbacks [then calling
EndExecuteNonQuery() on the thread pool thread] is an acceptable way to
handle things. Well, you're still in the alpha stage so I guess I should cut
them some slack! :)
That's a tough scenario I will have to take a longer look at it. There
should be a way to get async to work for you here, but it is probably not
going to be something off the shelf.

I'd like to hear more of your thoughts on this. This issue has been
plaguing me for some time. You seem to know this ADO.NET stuff really well
and I'd be thrilled to hear your suggestions.

Thanks!
 
==Disclaimer, the information in this post is only relevant for the PDC
alpha of v2.0 Whidbey. Features may change considerably before the beta
ships.

This is going to be a tough area. The books/articles are correct, you can
definitely call EndXXX on the callbacks, that is what the callbacks are
there for. What you can't do is use an ado.net object (or any other non
thread safe object) in multiple threads. So, as long as you are not using
the command or connection object while you are waiting for the callback,
everything will work as expected. In pseudocode:

This will work:
con.Open()
command.BeginExecute(callback)
//do work here, do not use command or connection.*

//on a different thread
on_callback_called
command.EndExecute.

(*If you were not using the callback and where using the waitone/all you can
continue to use connection, create new command and execute on them with
mars)

If you where to use the command or connection while waiting for the callback
there is the possibility of running into multi-threaded issues. The worst
part is that you will get no exception and it will probably work perfectly
while you are testing the app. Then you will deploy and get the strangest
stress issues! I am looking for suggestions on what can be done to avoid
this.

This problem is going to be there for all non thread safe objects, not just
ado. The first thing I tried to do was do a BeginExecuteReader, fill a table
and bind this to a winform grid on the callback. Of course no go because the
table is non thread safe and I am touching it in both the main thread and
the callback. I thought the whole feature was broken until somebody showed
me this workaround:

delegate void UICallback( object param );

void ExecCallback( IAsyncResult ar )
{
using( SqlDataReader r=_cmd.EndExecuteReader( ar ) )
{
DataTable t=new DataTable();
t.Load( r );
this.Invoke( new UICallback( ReBindOnUIThread ), new object[] { t } );
}
}
}

void ReBindOnUIThread( object param )
{
if( param is DataTable )
{
dataGrid1.DataSource=param;
_cmd=null;
}

}
ReBindOnUIThread executes in the main thread so we can bind to the dataGrid
and still be thread safe!

With this I think you have all the tools to solve your problem. Use a
different connection in each of your threads and don't use it while waiting
for the callback, use delegates to rebind to the main thread where
necessary.
Hope this helps,
--
Angel Saenz-Badillos [MS] Managed Providers
This posting is provided "AS IS", with no warranties, and confers no
rights.Please do not send email directly to this alias.
This alias is for newsgroup purposes only.

Lecture Snoddddgrass said:
Your comments on connection thread safety kind of bum me out, but at
least now I understand why it has to be that way. Some of the books/articles
out there are suggesting that using the async callbacks [then calling
EndExecuteNonQuery() on the thread pool thread] is an acceptable way to
handle things. Well, you're still in the alpha stage so I guess I should cut
them some slack! :)
That's a tough scenario I will have to take a longer look at it. There
should be a way to get async to work for you here, but it is probably not
going to be something off the shelf.

I'd like to hear more of your thoughts on this. This issue has been
plaguing me for some time. You seem to know this ADO.NET stuff really well
and I'd be thrilled to hear your suggestions.

Thanks!
 
Back
Top