SqlDataAdapter.Update()

  • Thread starter Thread starter Mike Labosh
  • Start date Start date
M

Mike Labosh

I'm not asking a question here, I'm sharing some insight I just discovered.
Consider the following [air] code. My actual code operates on a huge table
with over a million and a half records, so I have to loop in blocks of
100,000 at a time, and the "processing" that it does to the data is several
pages long, not a silly += 1 that I did in this post.

Dim cnstr As String = "connect to SQL Northwind"
Dim cn As New SqlConnection(cnstr)
Dim da As New SqlDataAdapter()
Dim dt As New DataTable()
Dim sel As String = "SELECT * FROM [Order Details]"
Dim upd As String = _
"UPDATE [Order Details] SET Quantity = @quantity " & _
"WHERE OrderID = @orderID AND ProductID = @productID"

With da 'Setup a data adapter to select and update
.SelectCommand = New SqlCommand(sel, cn)
.UpdateCommand = New SqlCommand(upd, cn)

With .UpdateCommand.Parameters
.Add("@quantity", SqlDbType.Int)
.Add("@orderID", SqlDbType.Int)
.Add("@productID", SqlDbType.Int)
End With

.Fill(dt) 'Get some rows

'Do some kind of processing
For Each dr As DataRow In dt.Rows
'I know this is the dumb way to update, but this
'isn't what I'm getting at.
dr("Quantity") = CType(dr("Quantity"), Integer) + 1
Next

'Update them.
.Update(dt)

End With

Now, all the microsoft documentation I have ever read about the data adapter
vaguely implies that it is really really fast, and I have always understood
that the data adapter did something like this inside the update method to
gain performance through fewer round trips to the server:

Dim bigFatCommand as String

For i = 0 to CountOfChanges
bigFatCommand &= vbCrLf & "update table set fld = 'asdf'...;"
Next i

cn.Open()
sqlCmd.CommandText = bigFatCommand
sqlCmd.ExecuteNonQuery
cn.Close()

What I have discovered today is that's not how it works at all. You can
verify this with the SQL Profilier. Startup a session in the profiler and
have it watch for RPC:Completed events. Then run code similar to that at
the top of this post. When you run the VB in the debugger, you can pause
the debugger and all activity with SQL Server in the Profiler stops.
Unpause it, and there is activity again. Pause it and the activity stops.
This demonstrates that the Data Adapter is actually sending one command for
each individual updated record, like this:

cn.Open()

For i = 0 to CountOfChanges
sqlCmd.CommandText = "update table set fld = ..."
sqlCmd.ExecuteNonQuery()
Next i

cn.Close()

Note that the Profiler demonstrates that the data adapter is not opening and
closing the connection inside the loop, but DANG it sure makes a lot of
calls across the network!

In conclusion, I have decided that gigantor batch processing needs a more
clever solution than the DataAdapter. I am going to use a StringBuilder to
build huge stacks of commands, and send just one big fat string of updates
in a single round trip.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
Well, it's understandable how you would come to the conclusion that ADO.NET
would do batch updates because its predecessor COM-based ADO did.
However, I don't think you're approaching this problem in the most efficient
manner. Even building a bunch of UPDATE statements is going to be pretty
slow compared to executing an intelligent UPDATE on the server. The only
question I have is where do the new @Quantity values come from? If you write
server-side UPDATE that performed this logic it would run an order of
magnitude faster as the data would not have to be transported to the client,
changed and sent back--it could be changed in place.
One approach might be to bulk copy the changes to the server with the PK and
the delta value into a temporary table and do the smart UPDATE joining
against that table. There are other ways as well...

hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________

Mike Labosh said:
I'm not asking a question here, I'm sharing some insight I just
discovered. Consider the following [air] code. My actual code operates on
a huge table with over a million and a half records, so I have to loop in
blocks of 100,000 at a time, and the "processing" that it does to the data
is several pages long, not a silly += 1 that I did in this post.

Dim cnstr As String = "connect to SQL Northwind"
Dim cn As New SqlConnection(cnstr)
Dim da As New SqlDataAdapter()
Dim dt As New DataTable()
Dim sel As String = "SELECT * FROM [Order Details]"
Dim upd As String = _
"UPDATE [Order Details] SET Quantity = @quantity " & _
"WHERE OrderID = @orderID AND ProductID = @productID"

With da 'Setup a data adapter to select and update
.SelectCommand = New SqlCommand(sel, cn)
.UpdateCommand = New SqlCommand(upd, cn)

With .UpdateCommand.Parameters
.Add("@quantity", SqlDbType.Int)
.Add("@orderID", SqlDbType.Int)
.Add("@productID", SqlDbType.Int)
End With

.Fill(dt) 'Get some rows

'Do some kind of processing
For Each dr As DataRow In dt.Rows
'I know this is the dumb way to update, but this
'isn't what I'm getting at.
dr("Quantity") = CType(dr("Quantity"), Integer) + 1
Next

'Update them.
.Update(dt)

End With

Now, all the microsoft documentation I have ever read about the data
adapter vaguely implies that it is really really fast, and I have always
understood that the data adapter did something like this inside the update
method to gain performance through fewer round trips to the server:

Dim bigFatCommand as String

For i = 0 to CountOfChanges
bigFatCommand &= vbCrLf & "update table set fld = 'asdf'...;"
Next i

cn.Open()
sqlCmd.CommandText = bigFatCommand
sqlCmd.ExecuteNonQuery
cn.Close()

What I have discovered today is that's not how it works at all. You can
verify this with the SQL Profilier. Startup a session in the profiler and
have it watch for RPC:Completed events. Then run code similar to that at
the top of this post. When you run the VB in the debugger, you can pause
the debugger and all activity with SQL Server in the Profiler stops.
Unpause it, and there is activity again. Pause it and the activity stops.
This demonstrates that the Data Adapter is actually sending one command
for each individual updated record, like this:

cn.Open()

For i = 0 to CountOfChanges
sqlCmd.CommandText = "update table set fld = ..."
sqlCmd.ExecuteNonQuery()
Next i

cn.Close()

Note that the Profiler demonstrates that the data adapter is not opening
and closing the connection inside the loop, but DANG it sure makes a lot
of calls across the network!

In conclusion, I have decided that gigantor batch processing needs a more
clever solution than the DataAdapter. I am going to use a StringBuilder
to build huge stacks of commands, and send just one big fat string of
updates in a single round trip.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
However, I don't think you're approaching this problem in the most
efficient manner. Even building a bunch of UPDATE statements is going to
be pretty slow compared to executing an intelligent UPDATE on the server.
The only question I have is where do the new @Quantity values come from?

Well since I'm working on a real system and not Northwind, there are no
quantity values :)
If you write server-side UPDATE that performed this logic it would run an
order of magnitude faster as the data would not have to be transported to
the client, changed and sent back--it could be changed in place.

I agree. I could do this as a stored procedure, but this giant
string-chewing algorithm wouldn't work well in SQL. There is a lot of
looping over words in the string and parsing them off, which in SQL would
likely need evil cursors. There is also 5 or 6 rather bizarre regular
expressions used, and in SQL, that would mean using the sp_OA* procs to use
the Scripting.RegExp object.
One approach might be to bulk copy the changes to the server with the PK
and the delta value into a temporary table and do the smart UPDATE joining
against that table. There are other ways as well...

I actually thought of designing it that way, but there's not really a
"delta". Each value is updated differently based on its content. It
wouldn't be like marking up a price column by n% or like Trimming.

Maybe tonight I'll try a thread-based solution, where one thread can read
from a DataReader, marshal it to another thread that does the big fat
algorithm, and marshal the result to another thread that passes it to a
stored procedure. Maybethey could share a pair of synchronized Queue
objects. Of course, now that I just read what my hands typed on autopilot,
this thread idea probably means it's time to knock off for the day.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
William (Bill) Vaughn said:
hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no
rights.
__________________________________

Mike Labosh said:
I'm not asking a question here, I'm sharing some insight I just
discovered. Consider the following [air] code. My actual code operates
on a huge table with over a million and a half records, so I have to loop
in blocks of 100,000 at a time, and the "processing" that it does to the
data is several pages long, not a silly += 1 that I did in this post.

Dim cnstr As String = "connect to SQL Northwind"
Dim cn As New SqlConnection(cnstr)
Dim da As New SqlDataAdapter()
Dim dt As New DataTable()
Dim sel As String = "SELECT * FROM [Order Details]"
Dim upd As String = _
"UPDATE [Order Details] SET Quantity = @quantity " & _
"WHERE OrderID = @orderID AND ProductID = @productID"

With da 'Setup a data adapter to select and update
.SelectCommand = New SqlCommand(sel, cn)
.UpdateCommand = New SqlCommand(upd, cn)

With .UpdateCommand.Parameters
.Add("@quantity", SqlDbType.Int)
.Add("@orderID", SqlDbType.Int)
.Add("@productID", SqlDbType.Int)
End With

.Fill(dt) 'Get some rows

'Do some kind of processing
For Each dr As DataRow In dt.Rows
'I know this is the dumb way to update, but this
'isn't what I'm getting at.
dr("Quantity") = CType(dr("Quantity"), Integer) + 1
Next

'Update them.
.Update(dt)

End With

Now, all the microsoft documentation I have ever read about the data
adapter vaguely implies that it is really really fast, and I have always
understood that the data adapter did something like this inside the
update method to gain performance through fewer round trips to the
server:

Dim bigFatCommand as String

For i = 0 to CountOfChanges
bigFatCommand &= vbCrLf & "update table set fld = 'asdf'...;"
Next i

cn.Open()
sqlCmd.CommandText = bigFatCommand
sqlCmd.ExecuteNonQuery
cn.Close()

What I have discovered today is that's not how it works at all. You can
verify this with the SQL Profilier. Startup a session in the profiler
and have it watch for RPC:Completed events. Then run code similar to
that at the top of this post. When you run the VB in the debugger, you
can pause the debugger and all activity with SQL Server in the Profiler
stops. Unpause it, and there is activity again. Pause it and the
activity stops. This demonstrates that the Data Adapter is actually
sending one command for each individual updated record, like this:

cn.Open()

For i = 0 to CountOfChanges
sqlCmd.CommandText = "update table set fld = ..."
sqlCmd.ExecuteNonQuery()
Next i

cn.Close()

Note that the Profiler demonstrates that the data adapter is not opening
and closing the connection inside the loop, but DANG it sure makes a lot
of calls across the network!

In conclusion, I have decided that gigantor batch processing needs a more
clever solution than the DataAdapter. I am going to use a StringBuilder
to build huge stacks of commands, and send just one big fat string of
updates in a single round trip.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
The V2.0 (Beta1) DataSet does batched updates ... and there is a BulkCopy
class to do managed code bulk copies as well.

Mike Labosh said:
I'm not asking a question here, I'm sharing some insight I just discovered.
Consider the following [air] code. My actual code operates on a huge table
with over a million and a half records, so I have to loop in blocks of
100,000 at a time, and the "processing" that it does to the data is several
pages long, not a silly += 1 that I did in this post.

Dim cnstr As String = "connect to SQL Northwind"
Dim cn As New SqlConnection(cnstr)
Dim da As New SqlDataAdapter()
Dim dt As New DataTable()
Dim sel As String = "SELECT * FROM [Order Details]"
Dim upd As String = _
"UPDATE [Order Details] SET Quantity = @quantity " & _
"WHERE OrderID = @orderID AND ProductID = @productID"

With da 'Setup a data adapter to select and update
.SelectCommand = New SqlCommand(sel, cn)
.UpdateCommand = New SqlCommand(upd, cn)

With .UpdateCommand.Parameters
.Add("@quantity", SqlDbType.Int)
.Add("@orderID", SqlDbType.Int)
.Add("@productID", SqlDbType.Int)
End With

.Fill(dt) 'Get some rows

'Do some kind of processing
For Each dr As DataRow In dt.Rows
'I know this is the dumb way to update, but this
'isn't what I'm getting at.
dr("Quantity") = CType(dr("Quantity"), Integer) + 1
Next

'Update them.
.Update(dt)

End With

Now, all the microsoft documentation I have ever read about the data adapter
vaguely implies that it is really really fast, and I have always understood
that the data adapter did something like this inside the update method to
gain performance through fewer round trips to the server:

Dim bigFatCommand as String

For i = 0 to CountOfChanges
bigFatCommand &= vbCrLf & "update table set fld = 'asdf'...;"
Next i

cn.Open()
sqlCmd.CommandText = bigFatCommand
sqlCmd.ExecuteNonQuery
cn.Close()

What I have discovered today is that's not how it works at all. You can
verify this with the SQL Profilier. Startup a session in the profiler and
have it watch for RPC:Completed events. Then run code similar to that at
the top of this post. When you run the VB in the debugger, you can pause
the debugger and all activity with SQL Server in the Profiler stops.
Unpause it, and there is activity again. Pause it and the activity stops.
This demonstrates that the Data Adapter is actually sending one command for
each individual updated record, like this:

cn.Open()

For i = 0 to CountOfChanges
sqlCmd.CommandText = "update table set fld = ..."
sqlCmd.ExecuteNonQuery()
Next i

cn.Close()

Note that the Profiler demonstrates that the data adapter is not opening and
closing the connection inside the loop, but DANG it sure makes a lot of
calls across the network!

In conclusion, I have decided that gigantor batch processing needs a more
clever solution than the DataAdapter. I am going to use a StringBuilder to
build huge stacks of commands, and send just one big fat string of updates
in a single round trip.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
Mike,
You are completely correct, adapter updates are done with a roundtrip per
row by design. You will be much better off generating a batched update
manually. Here are some thoughts. You may think that the bigger the batch
the more performant your update, there is actually a sweet spot between cost
of generating the batch and the number of roundtrips you save. Manually
created batches _may_ be limited in batch size or number of parameters by
the backend. For Sql Server you are limited to 250mb text size and arround
1000parameters.

In ado.net 2.0 we are adding a "UpdateBatchSize" property to the adapter
that you can use to specify the number of rows you want to send per
roundtrip. The idea is that we will batch these many rows for you under the
covers to improve the performance. For scenarios where you just want to
insert data into SqlServer we have added a SqlBulkCopy class that makes it
easy for you to do bulk copy operations from managed code.

--
Angel Saenz-Badillos [MS] Managed Providers
This posting is provided "AS IS", with no warranties, and confers no
rights.Please do not send email directly to this alias.
This alias is for newsgroup purposes only.
I am now blogging about ADO.NET: http://weblogs.asp.net/angelsb/




Mike Labosh said:
I'm not asking a question here, I'm sharing some insight I just discovered.
Consider the following [air] code. My actual code operates on a huge table
with over a million and a half records, so I have to loop in blocks of
100,000 at a time, and the "processing" that it does to the data is several
pages long, not a silly += 1 that I did in this post.

Dim cnstr As String = "connect to SQL Northwind"
Dim cn As New SqlConnection(cnstr)
Dim da As New SqlDataAdapter()
Dim dt As New DataTable()
Dim sel As String = "SELECT * FROM [Order Details]"
Dim upd As String = _
"UPDATE [Order Details] SET Quantity = @quantity " & _
"WHERE OrderID = @orderID AND ProductID = @productID"

With da 'Setup a data adapter to select and update
.SelectCommand = New SqlCommand(sel, cn)
.UpdateCommand = New SqlCommand(upd, cn)

With .UpdateCommand.Parameters
.Add("@quantity", SqlDbType.Int)
.Add("@orderID", SqlDbType.Int)
.Add("@productID", SqlDbType.Int)
End With

.Fill(dt) 'Get some rows

'Do some kind of processing
For Each dr As DataRow In dt.Rows
'I know this is the dumb way to update, but this
'isn't what I'm getting at.
dr("Quantity") = CType(dr("Quantity"), Integer) + 1
Next

'Update them.
.Update(dt)

End With

Now, all the microsoft documentation I have ever read about the data adapter
vaguely implies that it is really really fast, and I have always understood
that the data adapter did something like this inside the update method to
gain performance through fewer round trips to the server:

Dim bigFatCommand as String

For i = 0 to CountOfChanges
bigFatCommand &= vbCrLf & "update table set fld = 'asdf'...;"
Next i

cn.Open()
sqlCmd.CommandText = bigFatCommand
sqlCmd.ExecuteNonQuery
cn.Close()

What I have discovered today is that's not how it works at all. You can
verify this with the SQL Profilier. Startup a session in the profiler and
have it watch for RPC:Completed events. Then run code similar to that at
the top of this post. When you run the VB in the debugger, you can pause
the debugger and all activity with SQL Server in the Profiler stops.
Unpause it, and there is activity again. Pause it and the activity stops.
This demonstrates that the Data Adapter is actually sending one command for
each individual updated record, like this:

cn.Open()

For i = 0 to CountOfChanges
sqlCmd.CommandText = "update table set fld = ..."
sqlCmd.ExecuteNonQuery()
Next i

cn.Close()

Note that the Profiler demonstrates that the data adapter is not opening and
closing the connection inside the loop, but DANG it sure makes a lot of
calls across the network!

In conclusion, I have decided that gigantor batch processing needs a more
clever solution than the DataAdapter. I am going to use a StringBuilder to
build huge stacks of commands, and send just one big fat string of updates
in a single round trip.
--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
Now, all the microsoft documentation I have ever read about the data adapter
vaguely implies that it is really really fast, and I have always understood
that the data adapter did something like this inside the update method to
gain performance through fewer round trips to the server:

<snip>

In that case, I suggest you read the DataAdapter.Update documentation
again:

<quote>
It should be noted that these statements are not performed as a batch
process; each row is updated individually.
</quote>
 
AWESOME NEWS!!

--
Peace & happy computing,

Mike Labosh, MCSD

"It's 4:30 am. Do you know where your stack pointer is?"
 
Back
Top