Efficient data retrieval techniques

Eniac · Apr 7, 2006

Hiya guys,

earlier on today I had a discussion (to avoid calling it an argument
^.^) with my friend. We were discussing about whether there's really
one good way to efficiently retrieve data from a database.

To sum it up, his opinion was that the only good way to retrieve data
is to set your command object, open the connection, retrieve the data
and close it.

so in the code, it would look like this more or less...

********
class A :

combo1.Datasource = classB.GetCategories
combo2.Datasource = classB.GetTypes
Combo3.Datasource = classB.GetLanguages

class B :

public function GetCategories as dataset
....create command with parameters
....open db connection
....execute
....close connection
end function

public function GetTypes as dataset
....create command with parameters
....open db connection
....execute
....close connection
end function

public function GetLanguages as dataset
....create command with parameters
....open db connection
....execute
....close connection
end function
*****

on the other hand, my opinion is there isn't just one good way of doing
things, for instance, it might be better to just leave the connection
open while you proceed with all the database operations in sequence
then close the connection.

so it would look more or less like this :

class A :

classB.OpenConn()

combo1.Datasource = classB.GetCategories
combo2.Datasource = classB.GetTypes
combo3.Datasource = classB.GetLanguages

classB.CloseConn

class B :

public sub OpenConn
....open connection (global)
end sub

public sub CloseConn
....close connection (global)
end sub

public function GetCategories as dataset
....create command with parameters
....execute
end function

public function GetTypes as dataset
....create command with parameters
....execute
end function

public function GetLanguages as dataset
....create command with parameters
....execute
end function

His argument to back his claim is that his way is what is taught on
MSDN and most "expert" articles. He says that this is how scalable
systems are built because it allows the connection pooling object to
really manage efficiently the available connection.

Which is a very good point I admit. But I find that this technique is
most appropriate when there are non-database operation in-between the
various calls. Its also a one-size fits all type of architecture, tried
and true but unoptimized. Like a generic cd-rom vs a specific cd-rom
for a cd-rom drive.

What I was saying is that using that technique, even with the pooling,
opening and closing a connection object repetively is an overhead if
you know that all your database operations are done in sequence.
Whatever little time you spend creating & destroying the object are
lost microseconds if you know that right after, you'll be
re-instantiating that very same object and that in that respect, it was
pointless to destroy it.

What do you guys think, is there really one way to do things, my
opinion is that we're both a little right and a little wrong at the
same time.

David Browne · Apr 7, 2006

Eniac said:
Hiya guys,

earlier on today I had a discussion (to avoid calling it an argument
^.^) with my friend. We were discussing about whether there's really
one good way to efficiently retrieve data from a database.

To sum it up, his opinion was that the only good way to retrieve data
is to set your command object, open the connection, retrieve the data
and close it.

so in the code, it would look like this more or less...

********
class A :

combo1.Datasource = classB.GetCategories
combo2.Datasource = classB.GetTypes
Combo3.Datasource = classB.GetLanguages

class B :

public function GetCategories as dataset
...create command with parameters
...open db connection
...execute
...close connection
end function

public function GetTypes as dataset
...create command with parameters
...open db connection
...execute
...close connection
end function

public function GetLanguages as dataset
...create command with parameters
...open db connection
...execute
...close connection
end function
*****

on the other hand, my opinion is there isn't just one good way of doing
things, for instance, it might be better to just leave the connection
open while you proceed with all the database operations in sequence
then close the connection.

so it would look more or less like this :

class A :

classB.OpenConn()

combo1.Datasource = classB.GetCategories
combo2.Datasource = classB.GetTypes
combo3.Datasource = classB.GetLanguages

classB.CloseConn

class B :

public sub OpenConn
...open connection (global)
end sub

public sub CloseConn
...close connection (global)
end sub

public function GetCategories as dataset
...create command with parameters
...execute
end function

public function GetTypes as dataset
...create command with parameters
...execute
end function

public function GetLanguages as dataset
...create command with parameters
...execute
end function

His argument to back his claim is that his way is what is taught on
MSDN and most "expert" articles. He says that this is how scalable
systems are built because it allows the connection pooling object to
really manage efficiently the available connection.

Which is a very good point I admit. But I find that this technique is
most appropriate when there are non-database operation in-between the
various calls. Its also a one-size fits all type of architecture, tried
and true but unoptimized. Like a generic cd-rom vs a specific cd-rom
for a cd-rom drive.

What I was saying is that using that technique, even with the pooling,
opening and closing a connection object repetively is an overhead if
you know that all your database operations are done in sequence.
Whatever little time you spend creating & destroying the object are
lost microseconds if you know that right after, you'll be
re-instantiating that very same object and that in that respect, it was
pointless to destroy it.

What do you guys think, is there really one way to do things, my
opinion is that we're both a little right and a little wrong at the
same time.

My opinion is that there is no significant performance benefit in closing
and reopening connections in the same method (or thread stack). Typically
the duration of the entire method is so short (other than database waits)
that you don't significantly reduce the number of database connections in
the pool. Moreover incremental reductions in the size of the connection
pool have no incremental performance benefit. So if you have a method that
makes multiple database calls, or which calls a method which makes multiple
database calls, etc, there is no performance benefit to the application to
close and reopen the connection for each operation.

The reason we keep connections open for short durations is to reduce the
total number of open connections to the databse. The total number of open
connections to the database can become a performance problem beyond certain
thresholds, because open and idle connections to the database consume memory
on the database server, and memory on the database server was historically a
very limited resource. Basically, connection pooling and releasing
connections to the pool are the strategy to allow applications to have many
more concurrent users than the database can support in concurrent
connections. Connection pooling does this job just fine if you keep
connections open for the duration of a method call (or page render, or
webservice method, etc).

David

Cor Ligthert [MVP] · Apr 7, 2006

Eniac,

I think that I in general agree with you.
By instance the dataadapter has its own open and closing feature.

If you want to load 3 tables in one sequence, than is AFAIK it more
efficient to do.

Conn Open
Fill table1
Fill table2
Fill talbe 3
Close Connection

And not to use the in build feature that opens and closes than at every
fill.

Just my thought.

Cor

tommaso.gastaldi · Apr 7, 2006

I also have been struggling with this question. At the beginning I was
alwais afraid to close and dispose everything. After several
experiences and several tons of code I think now it's alway better to
leave the connection open if you are going to do other operations. The
only thing, I think it's safe to put a kind of

Function OpenConnectionIfNecessary() as boolean

function before each operation, which checks whether the connection is
open and tries to open it if not (or handles appropriately possible
disconnections...)

The fact that all the example show close / dispose is understandable
because they are stand alone example.

As to dispose, also I think that connections that are going to reopened
shortly should never be disposed of. Actually I tend to reuse all
connections, avoiding to dispose them. I think it can only be source of
troubles...

just my 2 cents

Cor Ligthert [MVP] ha scritto:

Eniac · Apr 10, 2006

Thanks guys, I'm relieved to realize I'm not the only one viewing
things like this. I find it hard to back up my claims tho, because all
articles that explains how to handle a database are example usings the
open-execute-close technique.

Maybe some MS MVP guy would have a link to a technical paper that would
help me prove my point ? (or prove me wrong...either way)

I'm not so much concerned about proving I'm right (or wrong) but I'd
like to find something that really explains how's its done or what
technique is best depending on a given situation.

William \(Bill\) Vaughn · Apr 10, 2006

My books and articles support David's approach. Making (and breaking)
connections is NOT free. If you're in a routine that fully expects to make
use of the connection, keep it open. However, if the page ends and the
connection is left open by accident you'll leak connections. This is not an
issue with Windows Forms applications where it's far easier to simply open a
connection and leave it open. In ASP applications, it is important to keep
the number of active connections to a minimum--using JIT open/process/close
is a safe approach but can actually hurt performance if your code is not
cognizant of the bigger picture.

hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________

David Browne · Apr 10, 2006

William (Bill) Vaughn said:
My books and articles support David's approach. Making (and breaking)
connections is NOT free. If you're in a routine that fully expects to make
use of the connection, keep it open. However, if the page ends and the
connection is left open by accident you'll leak connections. This is not
an issue with Windows Forms applications where it's far easier to simply
open a connection and leave it open. In ASP applications, it is important
to keep the number of active connections to a minimum--using JIT
open/process/close is a safe approach but can actually hurt performance if
your code is not cognizant of the bigger picture.

hth

BTW has everyone seen Alazel Acheson's DbConnectionScope class?

http://blogs.msdn.com/dataaccess/

Very cool.

David

Eniac · Apr 10, 2006

Thanks guys. after reading everyone's comments and also reading this
group :
http://groups.google.com/group/micr...browse_thread/thread/791a79365a034df4/?hl=en#

I realized that while my technique was still good that I'd better off
with the JIT approach. basically here's a very simple version of how my
code was :

Class Profiles {

database = new clsDataAccessLayer(constantfile.connectionstring)

database.openconn()

loop sections
loop datasets
loop rows
select case section
case section 1
myResults = database.getDataForSection1

case section 2
myResults = database.getDataForSection2

case section 3
myResults = database.getDataForSection3
end select
end loop
end loop
end loop

database.closeconn()
database = nothing
}

for my application, I've put all the database operations at a single
point. So i thought it would be more efficient to leave the connection
open. But i came to realize that in my code, 3 indented loops with a
few lines in between can become slow if you put a few hundreds (or
more) concurrent users. meaning that im hogging connections that could
be useful to another instance.

So ive changed my code so that every call to the database class opens a
connection, execute and close the connection (dispose)

my only remaining question is this :
I know that the pool will be destroyed when the active process ends, in
my case, when the call to my webservice ends.

if i have 10 different people calling my webservice, on the same
server, will there be 10 different connection pools or will the pool be
the same for those 10 users ?

Efficient data retrieval techniques

Eniac

David Browne

Cor Ligthert [MVP]

tommaso.gastaldi

Eniac

William \(Bill\) Vaughn

David Browne

Eniac