Extreme performance issues (SQL Server 2000/ADO.NET/C#)

  • Thread starter Thread starter Jon Rista
  • Start date Start date
J

Jon Rista

I'm using ADO.NET in a windows service application to perform a process on
SQL Server 2000. This process runs very quickly if run through Query
Analyser or Enterprise Manager, but takes an excessively long time when run
through my application. To be more precise, executing stored procedures and
views through Query Analyser take between 10 and 20 seconds to complete. The
same exact stored procedures and views, run in the same exact order, through
my program, take anywhere from 30 minutes to 2 hours to complete, and the
system that runs SQL Server (a 4-cpu Xeons system with 2gigs of physical
ram) is pegged at 25% cpu usage (the query uses 100% of a single cpu's worth
of processing power). I am at a complete loss as to why such a vast
difference in execution time would occurr, but here are some details.

The windows service executes on a workstation.
SQL Server 2000 executes on a server different from the workstation through
a 100mbps ethernet network.
Query Analyser/Enterprise Manager run on the same workstation as the windows
service.

The process is as follows:

1) Run a stored procedure to clear temp tables.
2) Import raw text data into a SQL Server table (Reconciliation).
3) Import data from a Microsoft Access database into 3 SQL Server tables
(Accounts, HistoricalPayments, CurrentPayments).
(This takes about 10 - 15minutes to import 70,000 - 100,000 records from
an access database, housed on a network share on a different server.)
4) "Bucketize" the imported data. This process gathers data from the 4
tables stated so far (Reconciliation, Accounts, HistoricalPayments,
CurrentPayments, and places records into another table (Buckets) and
assigned a primary category number to each record through a stored
procedure.
5) Sort buckets of data into subcategories, updating each record in
(Buckets) and assigning a sub category number, through another stored
procedure.
6) Retrieve a summary of the data in (Buckets) (this summary is a count of
rows and summation of monetary values), grouped by the primary category
number. This is a view.
7) Retrieve a summary of the data in (Buckets), grouped by both the primary
and sub category numbers. This is a view.


When I execute these steps manually through query analyser, (save step 3),
each query takes anywhere from 1 second to 20 seconds. The views,
surprisingly, take more time than the fairly complex stored procedures of
step 4 and 5.

When I execute these steps automatically using my windows service (written
in .NET, C#, using ADO.NET), the simple stored procedures like clearing
tables and whatnot execute quickly, but the stored procedures and views from
steps 4-7 take an extremely long time. The stored procedures take at a
minimum 30 minutes to complete, and sometimes nearly an hour. The views are
the worst of all, taking no less than 1 hour to run, and often two hours
(probably longer, actually, since my CommandTimeout is set to 7200 seconds,
or two hours). I have never seen such a drastic difference between the
execution of a query or stored procedure between query analyser and an
application. There should be little or no difference at all, considering
that everything is stored procedures (even the views...I wrap all the views
in a simple stored procedure that calls the view using a SELECT), and as
such executes on the server. Not only that, but Query Analyser is running on
the same exact box that the application is running on, and is connecting ot
the same SQL Server.

I doubt this is a network bandwidth issue, as after calling the stored
procedure from code, there is no network activity except mssql keep-alive
messages, until the procedure completes and returns its result set or return
value (if any), and then its only a momentary blip as the data is sent
accross.

I've followed proper practice when using views and stored procedures. When I
select, I always explicitly name the columns I wish to retrieve. I have
appropriate indexes on the columns in the 4 data tables. The queries that
execute in the stored procedures are fairly complex, involving summations,
count(), group by, and order by. I can understand a moderate difference in
performance between query analyser and an ADO.NET application due to
ADO.NETs extra overhead, but a difference between 20 seconds and 1 hour is
more than can be attributed to .NET overhead.

I greatly appreciate anyone who might have some insight to this offering
some help. I've scanned the net looking for similar situations, but
searching for them is somewhat difficult, considering the nature and volume
of factors. Thanks.

-- Jon Rista
 
Jon said:
I'm using ADO.NET in a windows service application to perform a process on
SQL Server 2000. This process runs very quickly if run through Query
Analyser or Enterprise Manager, but takes an excessively long time when run
through my application. To be more precise, executing stored procedures and
views through Query Analyser take between 10 and 20 seconds to complete. The
same exact stored procedures and views, run in the same exact order, through
my program, take anywhere from 30 minutes to 2 hours to complete, and the
system that runs SQL Server (a 4-cpu Xeons system with 2gigs of physical
ram) is pegged at 25% cpu usage (the query uses 100% of a single cpu's worth
of processing power). I am at a complete loss as to why such a vast
difference in execution time would occurr, but here are some details.

The windows service executes on a workstation.
SQL Server 2000 executes on a server different from the workstation through
a 100mbps ethernet network.
Query Analyser/Enterprise Manager run on the same workstation as the windows
service.
(snip)

In theory, a proc executed from ADO.NET should perform the same as when it
is called from QA. The problem is: ADO.NET calls a proc differently (uses
exec sp_executesql. Check with SqlServer profiler). This might be a
difference.

Also, QA keeps the connection open. While this might not be a difference per
se, but it can be that your application opens/closes a connection each time.

Also check if you can put extra options in the connection string when you
look at the QA connection options.

QA uses OleDb, however what you're doing is not provider bound, as all
actions are executed on the server...

Have you performed any profiling on teh server? (use the performance
counters added by sqlserver for detailed analysis)

Frans.
 
Hi Frans, and thanks for the reply. My application currently maintains a
open connection throughout this process. It opens it when the process is
triggered, and closes it when the process is complete, due to the volume of
queries run in what is supposed to be a short time.

When it comes to query strings....my current string is as follows:

workstation ID=DEVELOPER1;packet size=4096;integrated security=SSPI;data
source=ARMDEV;persist security info=False;initial catalog=CertusARM

I'm not sure if the workstation ID will cause any problems, but the
connection is set up through a component, and workstation ID gets added
regardless (pisses me off, too).

I have run some profiling on the server. Not much happens when I profile the
process as it runs from my application. You'll see the stored procedure
start, and you'll see each statement start, but they all take an extremely
long time. The process is just faster through QA.
 
Jon said:
Hi Frans, and thanks for the reply. My application currently maintains a
open connection throughout this process. It opens it when the process is
triggered, and closes it when the process is complete, due to the volume of
queries run in what is supposed to be a short time.

When it comes to query strings....my current string is as follows:

workstation ID=DEVELOPER1;packet size=4096;integrated security=SSPI;data
source=ARMDEV;persist security info=False;initial catalog=CertusARM

indeed a weird connection string. I don't think the workstation id is the
culpit here. (although you can't add it via QA).
I'm not sure if the workstation ID will cause any problems, but the
connection is set up through a component, and workstation ID gets added
regardless (pisses me off, too).

Do you get the connection string from an external component? You could try
to chop it off and use a better connection string. (but again, I don't think
it's it, as most processing is done on the server, and the server is slow as
it seems)
I have run some profiling on the server. Not much happens when I profile
the process as it runs from my application. You'll see the stored procedure
start, and you'll see each statement start, but they all take an extremely
long time. The process is just faster through QA.

Ok, one question remains: do you start a transaction in your process which
you don't start in the QA session?

Frans.
 
Frans, I do start a transaction in my code, actually. And I do not start any
transaction through QA. Can a transaction really cause this much of a
performance hit? Were talking anywhere from a 100% to 240% difference in
execution time between QA and the application.

I understand that transactions add overhead, but moving from approximately 1
minute time to process the whole thing on QA to 60-120 minutes through my
application sounds extreme.

I'm using the Sql data objects, rather than the OleDb objects. Would using
OleDb provide any improvement? (I kind of doubt it, since the Sql objects
are supposed to be optimized to directly access SQL Server, bypassing the
additional overhead of OleDb.)

I'm just floored by this...I've never encountered anything like it.
 
When a SP is executed, it uses a cached query plan. This plan is created the
first time the SP is executed. Subsequent executions might be called with
the same type of parameters so the plan is appropriate for the parameters.
However, if the parameters change, the optimal query plan that's generated
(on first call) can also change. If a subsequent execution would work better
with a different query plan, it won't run as quickly as it might if given a
custom query plan (as with forced recompile).

Yes, transaction enlistment does impact performance--sometimes quite
significantly.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________

Jon Rista said:
Frans, I do start a transaction in my code, actually. And I do not start
any transaction through QA. Can a transaction really cause this much of a
performance hit? Were talking anywhere from a 100% to 240% difference in
execution time between QA and the application.

I understand that transactions add overhead, but moving from approximately
1 minute time to process the whole thing on QA to 60-120 minutes through
my application sounds extreme.

I'm using the Sql data objects, rather than the OleDb objects. Would using
OleDb provide any improvement? (I kind of doubt it, since the Sql objects
are supposed to be optimized to directly access SQL Server, bypassing the
additional overhead of OleDb.)

I'm just floored by this...I've never encountered anything like it.
 
I created another stored procedure that starts a transaction, executes the
other stored procedures, and commits the transaction. The processing time in
QA, when I run this sp, increased by 9 seconds, but that was all.

Is there any kind of optimization or configuration I can implement in my
application that might affect the speed of executing the sp's from ADO.NET?

Would ADO.NET cause SQL Server to execute a stored procedure that does not
return a result set to perform its operations differently than if the same
procedure was caled from QA with the same arguments?

If anyone can, I'd appreciate links to resources that might possibly hold
some answers. Thanks.

William (Bill) Vaughn said:
When a SP is executed, it uses a cached query plan. This plan is created
the first time the SP is executed. Subsequent executions might be called
with the same type of parameters so the plan is appropriate for the
parameters. However, if the parameters change, the optimal query plan
that's generated (on first call) can also change. If a subsequent
execution would work better with a different query plan, it won't run as
quickly as it might if given a custom query plan (as with forced
recompile).

Yes, transaction enlistment does impact performance--sometimes quite
significantly.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no
rights.
__________________________________
 
Eh, I recind my last statement. Adding the transaction to the QA sp causes
the processing time to increase astronomically. The first time I ran, I did
not have all the account and payment information loaded.

Are there any resources on the net about optimizing transacted processes in
SQL Server? Tips on what kinds of queries to transact and not transact?
 
It's not necessary to transact any statement that can be executed in
isolation. In SQL Server, there is an "implied" transaction that's handled
automatically by the server. Transactions are needed when you have two or
more changes to make that must be made together--and rolled back together if
either fails.
As far as performance, I would hit the SQL Server list... they might be able
to provide more insight.

hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
 
Well, the whole process, some 15-20 queries in about 10 stored procedures,
do need to be run as one contiguous process that gets rolled back if the
process failes. Hence the reason I was using transactions. I'll post the
question on the SQL Server list and see what kind of responses I get.
Thanks.
 
Your approach is resulting in lock escalation, and will lead to excessive
contention. As other applications/users access the same data you'll
eventually begin to see rollbacks due to deadlocks, especially if any code
utilizes distributed transactions at Read Serializable isolation.

Long running transactions generally don't work well in scalable systems.
Frequently you will achieve better overall performance and responsiveness by
breaking the work down into a larger number of smaller discrete
transactions. This is particularly true in cases where you cannot otherwise
reduce the excessive number of locks in your original code.
 
This particular process doesn't have any lock contention. Its a sequential
process, where the same set of queries are run one after another for one
input file after another. The only data that might be accessed by other
processes is data that is read from a live database into temporary tables
(accounts, payments, historical payments). The process runs in some 50-75
seconds without the transaction. I don't see how it could be lock issues
with the transaction in place, because only this one process is accessing
the tables. Unless the process is running into lock issues with itself.
 
Are you certain you don't have an excessive number of locks being generated?
Based on your symptoms I would look at locking as the first possible cause.
You can check this by monitoring statistics in perfmon.
 
Back
Top