Performance with large tables: ADP vs. ODBC

Thread starter: John Kounis

I recently converted from a .mdb file to an .adp frontend (via OLE DB
Provider) for a SQL database. In this database, I have a large table with
750,000 customer transactions where there's a debit/credit amount column and
a balance column. Every so often, the balance gets out of sync, so I run a
periodic maintenance program to correct the customer balances. It is
something like:

rst.CacheSize = <some small number>
rst.Open "SELECT * FROM [TransactionTable with 750,000 rows] ORDER BY TransactionID", _
    CurrentProject.Connection, _
    adOpenForwardOnly, adLockPessimistic, adAsyncFetch
While Not rst.EOF
    [process a row]
    rst.MoveNext
Wend
rst.Close

The process is orders of magnitude slower in the ADP than it ever was in the
.mdb file. The problem seems to be that the "rst.Open" statement tries to
process all 750,000 rows, which takes a lot of resources.

Since I've specified "adOpenForwardOnly" and a small CacheSize, I thought
this process would fetch only one row at a time. That's what it does in
the .mdb file, which is pretty fast. Now, in the .adp file, it apparently
needs to cache all 750,000 rows. The open takes a long time to return, and
it bogs down the server incredibly (eventually the server's memory usage
exceeds the physical memory installed, and it starts swapping virtual memory
like crazy and slows to a crawl).

I've tried setting rst.CursorLocation to both adUseClient and adUseServer.
This just moves the problem to either the client or the server, but it
doesn't solve the problem. I've tried different combinations of asynchronous
and synchronous operations, too -- all with the same result.

Is there any way to process just 1 row at a time, or 100, or 1,000... just
not all 750,000 rows at once? My process goes through the rows sequentially,
so I don't need them all at once.

Thanks,

John Kounis
 
Not knowing exactly what the processing is, it is hard to give an opinion.
There have been several posts on similar issues in the past, in this NG and
others. One thing I suggest you look into is the difference between SQL
Server and the file-based Jet DB. With an *.mdb, the processing happens on
the front end: whether or not the front end grabs all 750,000 records at
once, each record is sent to the front end, processed, and then sent back.
When you switch to SQL Server, it is very possible to run your processing
(again, not knowing the processing, I can only say "very possible") on the
server, using stored procedure(s). Then your front end (ADP) only needs to
send the necessary parameters to the server and let the server process the
data with set-based SQL instead of one record at a time.

It is not uncommon that, after converting an MDB to an ADP, the author has
not put enough thought into the difference between the two backend types and
SQL Server's power. The difference may be big enough to warrant rethinking
the processing logic of the front end.
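
To make the idea concrete, the shape might be something like this (a sketch only; the procedure name, the parameter, and the idea that it takes a customer ID are all invented for illustration):

```sql
-- Sketch only: dbo.FixCustomerBalances and @CustomerID are invented names.
CREATE PROCEDURE dbo.FixCustomerBalances
    @CustomerID int = NULL   -- NULL = recalculate every customer
AS
    SET NOCOUNT ON
    -- The balance-correction logic runs here, on the server, next to
    -- the data, instead of pulling 750,000 rows to the front end.
GO
```

The ADP then needs only one round trip, e.g. CurrentProject.Connection.Execute "EXEC dbo.FixCustomerBalances".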

 
To paraphrase Norman's response, the performance benefit of a client/server
system is that processing can be done on the server where the data resides.
When you write code that performs an operation on each record in a table, you
are creating a situation where client/server performs much worse, not better,
than a non-C/S system.

If you can rework your code to use either a standard SQL UPDATE statement or a
stored procedure to do the processing, you should improve performance greatly.
When you write the stored procedure, try to think in sets, and only use
cursors and record loops when there is no way to use set operations and the
occasional temporary table.
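
For example, where a loop would open a recordset and change one row per iteration, a single UPDATE does the same work in one server-side pass (the table and column names here are made up):

```sql
-- Made-up names, for illustration: one statement replaces a whole
-- client-side loop, and no rows ever cross the wire to the front end.
UPDATE Customers
SET Status = 'Inactive'
WHERE LastOrderDate < '2002-01-01'
```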
 
Thank you Norman and Steve for your comments,

I rewrote the code as a stored procedure and the performance is, of course,
superior. The code basically updates the customer balance column, it's
something like this in Visual Basic:

Balance = 0
While CustomerID = CurrentCustomer
    Balance = Balance + rst!Amount
    If rst!Balance <> Balance Then
        rst!Balance = Balance
        rst.Update
    End If
    rst.MoveNext
Wend

The above code is simplified a bit from the original, but you probably get
the idea. It is like a checkbook, where the balance is the sum of the
previous balance and the current amount.

Unfortunately, I could not figure out a way to use sets, so I used cursors
and record loops, contrary to Steve's advice. I just couldn't see how to
write an UPDATE statement that sets [Customer Balance] = [Customer Balance
of Previous Transaction] + [Amount].

The stored procedure works very well, of course. I was just hoping that I
would be able to ease slowly into SQL Server by using the upsizing wizard,
then rewriting Visual Basic code as stored procedures as workload permitted.

Since the database has about 100 forms, with 100 functions and about 700
subroutines, it's a daunting task to audit all the code and rewrite VB code
as stored procedures in many places. But I guess I'll just have to bite the
bullet, and do it in order to correctly make the switch to SQL Server.

Thanks again for your help,

John Kounis


 
Look into FETCH. A cursor is slower than a set operation, but it allows
row-by-row evaluation on the server. Also use local variables to carry
values from one row to the next, e.g.:

DECLARE @lastbalance int
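
Roughly this shape, say (a sketch only; the table and column names -- Transactions, CustomerID, TransactionID, Amount, Balance -- are guesses at the poster's schema):

```sql
-- Sketch: walk the transactions in order, carrying the running balance
-- in local variables, and write back only the rows that are wrong.
DECLARE @CustomerID int, @TransID int, @Amount money,
        @Balance money, @LastCustomer int, @RunningBalance money

DECLARE curTrans CURSOR LOCAL FAST_FORWARD FOR
    SELECT CustomerID, TransactionID, Amount, Balance
    FROM Transactions
    ORDER BY CustomerID, TransactionID

OPEN curTrans
FETCH NEXT FROM curTrans
    INTO @CustomerID, @TransID, @Amount, @Balance

SET @LastCustomer = -1
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @CustomerID <> @LastCustomer
    BEGIN
        SET @RunningBalance = 0          -- new customer: reset the total
        SET @LastCustomer = @CustomerID
    END
    SET @RunningBalance = @RunningBalance + @Amount
    IF @Balance <> @RunningBalance       -- update only mismatched rows
        UPDATE Transactions
        SET Balance = @RunningBalance
        WHERE TransactionID = @TransID
    FETCH NEXT FROM curTrans
        INTO @CustomerID, @TransID, @Amount, @Balance
END
CLOSE curTrans
DEALLOCATE curTrans
```

It is still row by row, but all of it happens inside the server, so no rows travel to the client.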

 
Thank you BJ,

I wrote a stored procedure using a cursor and fetch to perform row-by-row
processing and it's much faster than the old VB code.

John Kounis




 