William said:
This debate has gone on for years. The SQL Server team at MS has done
what it can to make ad-hoc queries perform better and yes, they
compile to the same execution plans as SPs. Those plans (the compiled
code if you will) are cached in the same memory as SP plans. Yes,
there are aspects of SPs that make them more efficient as I outline
below. However, the real difference is not always in execution
performance but in developer productivity. My research has found that
the vast majority (over 90%) of the shops out there use SPs over
ad-hoc queries.
It depends on how you produce your ad-hoc queries, of course. Any O/R
mapper can produce queries which often outperform CRUD procs, and it
also offers big development-speed gains over hand-writing a proc for
each table.
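For context: a query generated by an O/R mapper is parameterized, so it
gets exactly the plan caching William mentions. A minimal T-SQL sketch
of what such a generated query looks like on the wire (table and
parameter names here are made up for illustration):

```sql
-- Hypothetical parameterized ad-hoc query, sent via sp_executesql.
-- SQL Server caches one plan for this statement text and reuses it
-- for any @custId value, just as it reuses a stored procedure's plan.
EXEC sp_executesql
    N'SELECT CompanyName, City FROM Customers WHERE CustomerID = @custId',
    N'@custId INT',
    @custId = 10;

-- The same query with the literal inlined gets a separate cached plan
-- per distinct value instead:
-- SELECT CompanyName, City FROM Customers WHERE CustomerID = 10
```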
They use them because:
a.. SPs can be changed in situ without recoding, retesting and
redeploying an application.
This depends on the situation they're used in. Often procs can't
be changed without altering the calling code as well, which means
changes in multiple places and, worse, changes made by different
people who (in a lot of cases) don't work together. That often means
delays, politics and turf wars over who controls what (does a proc
signature change drive the calling-code changes, or vice versa?).
Furthermore, the ability to alter a proc is often presented as an
advantage, but in reality it's risky business: a proc can be called
from a lot of routines in your application, so altering the proc is
often not possible and a NEW proc is required instead. This adds
maintenance nightmares, because you then have to document which parts
of the application call which proc (and if multiple apps use the same
procs, it's even trickier).
b.. They require discipline within the organization to ensure that
changes made to the database and its schema are done in an orderly
fashion.
What does this have to do with stored procedures?
c.. They are more secure as DBAs rarely permit direct
access to base tables but do grant limited rights to SPs.
Yeah, pr_DeleteCustomer(@customerID) is really more secure... when
will people stop bringing up this myth over and over again? If Joe
uses an application X, and X logs into the DB using Joe's account (or
some built-in account, it doesn't matter), and X offers Joe the option
to delete a customer within the scope of the application, then Joe can
delete that customer (or ANY customer) from the db himself, using a
normal query tool and the same credentials X uses, because X simply
calls a proc. Joe can do:
EXEC pr_DeleteCustomer 10
and the customer with id 10 is removed. How does pr_DeleteCustomer
prevent Joe from doing that?
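To make the permission model concrete, the typical "proc-only" DBA
setup looks like this (role, table and proc names are illustrative),
and note that nothing in it constrains which rows Joe can hit:

```sql
-- Typical lockdown: no direct table access, execute rights on the proc.
DENY SELECT, INSERT, UPDATE, DELETE ON dbo.Customers TO AppUsers;
GRANT EXECUTE ON dbo.pr_DeleteCustomer TO AppUsers;

-- Joe, connected with the same credentials the application uses,
-- can still delete ANY customer from any query tool:
EXEC dbo.pr_DeleteCustomer @customerID = 10;
```

The grant only controls *that* the proc may be executed, not *which*
parameter values are legitimate in the application's context; that
business rule lives in the calling code either way.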
d.. They encourage developers to use Command object Parameter lists
that deal with a litany of security and formatting issues.
You think? I do:
using(DataAccessAdapter adapter = new DataAccessAdapter())
{
    adapter.SaveEntity(myCustomer);
}
and it saves myCustomer, its orders, each order-detail row, the
related product rows and other rows in the right order, syncs FKs,
etc. etc., all with dynamic SQL. Where are the parameters? I don't see
them. And the code above is database-agnostic: I can create a simple
factory for my DataAccessAdapter object which makes the code above
save the whole graph in SQL Server, Oracle or another db.
And on top of that, it does so with insert and update statements
(whichever applies) which can often be faster than a proc call, simply
because they only update/insert the fields which actually have to be
changed. So, let's say your customer table has 10 fields and your
order table has 10 fields. Your update procs for customer and order
then either accept ALL field values and simply update all fields, OR
accept nullable parameters and do COALESCE voodoo, OR contain IF
statements (a no-go, but some people still write them).
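For readers who haven't seen the "COALESCE voodoo" variant, a minimal
sketch (table and column names are hypothetical):

```sql
-- Hypothetical update proc with nullable parameters:
-- pass NULL for any column you don't want to change.
CREATE PROCEDURE dbo.pr_UpdateCustomer
    @customerID  INT,
    @companyName NVARCHAR(100) = NULL,
    @city        NVARCHAR(50)  = NULL,
    @phone       NVARCHAR(24)  = NULL
AS
BEGIN
    UPDATE dbo.Customers
    SET CompanyName = COALESCE(@companyName, CompanyName),
        City        = COALESCE(@city, City),
        Phone       = COALESCE(@phone, Phone)
    WHERE CustomerID = @customerID;
END
```

Every column still appears in the SET list, and the proc can't even
distinguish "set Phone to NULL" from "leave Phone alone", while a
generated UPDATE statement simply mentions only the changed columns.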
Now tell me, can you beat those four lines of code (two, actually),
with the call code behind the scenes fully debugged and tested, using
your procs? Is that more productive?
And I haven't even thrown in the automatic concurrency code added to
the query when necessary.
Besides that, how are you going to call procs without parameter lists?
e.. They find
that OUTPUT parameters are far more efficient than rowsets (returned
by a SELECT).
Sadly, for every context in the application where a value from the db
is required, a new scalar or output-parameter proc has to be written
to feed the app the data it requires.
With an application of, say, 200 tables or more, this will likely run
into 1000+ procs (counting the CRUD ones and the select all/one/filter
ones). I'm sure you will agree with me that if you have a proc tier of
1000+ procs maintained by group A and the actual application
maintained by group B, it will get messy over time.
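To illustrate the proliferation: each scalar the app needs gets its
own proc along these lines (all names here are made up):

```sql
-- One proc per scalar value the application needs (illustrative).
CREATE PROCEDURE dbo.pr_GetCustomerOrderCount
    @customerID INT,
    @orderCount INT OUTPUT
AS
BEGIN
    SELECT @orderCount = COUNT(*)
    FROM dbo.Orders
    WHERE CustomerID = @customerID;
END

-- Next week the app also needs the total order value, the last order
-- date, ... each one a new proc to write, review, deploy and document.
```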
f.. They find that now that SPs can be written in
VB.NET or C# some very tough jobs can be done on the server instead
of on the client. This saves transporting data in bulk to the client
for this processing--again improving performance. I could go on (and
I do in my books), but you get the idea.
And finally we come to the one point where procs are the best option:
data processing in long-running queries.
What I always find funny is that in debates between dynamic SQL and
procs, this one point isn't used by the pro-proc people (let me call
the group favoring procs in the debate the 'pro-proc people'), or at
least only late in the discussion (if ever). Yet data processing is
one of the real reasons why you could opt for a proc for a given piece
of functionality. (For the readers who wonder why: the roundtrips of
transferring data back and forth to the client app to do the
processing there are often slower than doing the processing in
interpreted (!) SQL code on the server. With C#/VB.NET code on the
server, it's even clearer that long-running processing code can help.)
One caveat: maintaining C#/VB.NET code on the server isn't that
friendly.
Bill, what I find weird in your plea, if I may call it that, is that
you apparently ignore the downsides of procs, which can hurt the
productivity and maintainability of an application so much that procs
are not that good a choice for CRUD operations. I deliberately make
that distinction, CRUD vs. data processing, because it makes the
discussion much clearer: most of these debates are about CRUD, not
data processing.
With CRUD procs, you run into problems when your app needs these kinds
of things:
- fetch n rows of X filtered on related tables Y and Z using parameters
A, B ...
- fetch n rows of X, all these n rows' related Y and Z rows
- fetch rows of X, filter them and order them on field a
AND then:
- fetch rows of X, filter them and order them on field b
etc.
Of course you can write special procs for these situations, but with a
database of regular enterprise size, you quickly run into the problem
that you have to write a lot of custom procs. The reason is that how
the actual resultset looks, and which set logic it is based on, highly
depends on the application context in which the data is required. This
thus leads to DBA <-> developer discussions, which is overhead, can
lead to conflicts (and often does) and thus to delays.
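This is also exactly the spot where people resort to one "flexible"
proc; a minimal sketch of the usual workaround (illustrative names),
which is really just dynamic SQL wearing a proc costume:

```sql
-- A proc that builds its own query to cover 'order by a' and
-- 'order by b'. The column name is checked against a whitelist and
-- quoted to avoid injection.
CREATE PROCEDURE dbo.pr_GetCustomersSorted
    @sortColumn SYSNAME
AS
BEGIN
    IF @sortColumn NOT IN (N'CompanyName', N'City')
        RETURN;

    DECLARE @sql NVARCHAR(400) =
        N'SELECT CustomerID, CompanyName, City FROM dbo.Customers ORDER BY '
        + QUOTENAME(@sortColumn);
    EXEC sp_executesql @sql;
END
```

At that point the "pre-compiled and pre-tested" argument has
evaporated: the proc generates and executes dynamic SQL at runtime,
which is precisely what the O/R mapper was doing, minus the deployment
step.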
Coming back to 'developer productivity', which was your main point at
the start of your post: I don't see it. On the contrary, it's not
there; it takes longer, EVEN compared with hand-written queries in
code. Anyone who has done development of n-tier apps which call procs
(I have) knows that if your app needs a change and the change requires
different data, the proc change is a roadblock: you either have to
maintain the proc AND the call code yourself (if you may alter the
procs), OR you have to request a proc change or a new proc if the proc
can't be changed by you.
I have to add that I also don't think hand-writing raw SQL in your
application is a good way to do data access; however, today that's no
longer necessary. With the large number of good free (or commercial)
O/R mapping frameworks out there, every development team can find a
solution which fits their budget, frees them from writing raw SQL in
their apps AND gives them full development productivity: they don't
have to wait for the DBA to write the proc they want, they don't have
to debate with the DBA for hours to convince him/her that the proc is
necessary, they don't have to wait till the proc is updated before the
call code for it can go to production, and they don't have to wait
till the DBA has tested the proc code through and through: their SQL
_is_ tested.
DBAs are still necessary, also for teams using an O/R mapper. But
instead of working against them (in a lot of organisations the DBA is
placed in another section of the organisation than the development
team(s)), the two should work together; after all, they work for the
same organisation. The DBA should, for example, advise the team on
when to use a join-based query over relationships and when to use a
subquery (most O/R mappers out there, except DLinq and a few others,
offer a choice of what to create, a subquery or a join, for example in
1:n fetches with filters on the n side).
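For readers unfamiliar with that choice, the two shapes for "customers
with an order shipped to London" look like this (illustrative schema);
the join can duplicate customer rows, the subquery can't:

```sql
-- Join-based: may return the same customer once per matching order,
-- so a DISTINCT is needed.
SELECT DISTINCT c.CustomerID, c.CompanyName
FROM dbo.Customers c
JOIN dbo.Orders o ON o.CustomerID = c.CustomerID
WHERE o.ShipCity = N'London';

-- Subquery-based: one row per customer by construction.
SELECT c.CustomerID, c.CompanyName
FROM dbo.Customers c
WHERE EXISTS (SELECT 1 FROM dbo.Orders o
              WHERE o.CustomerID = c.CustomerID
                AND o.ShipCity = N'London');
```

Which one the optimizer handles better depends on the data and the
indexes, which is exactly the kind of call a DBA can help the team
make.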
I don't have the illusion that the 'use procs because <insert myth
list here>' arguments will go away soon, or even that they will go
away at all. What I do hope, though, is that people become more
realistic about the various options they have when it comes to data
access: weigh the real pros and cons of each option and use the
technique available per functionality aspect. I think that debate is
much healthier AND more valuable to developers than the bickering over
which side is right and which side is wrong.
FB
--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website:
http://www.llblgen.com
My .NET blog:
http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------