MultiThreading for Multiple Inserts

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hello,

We have a vb.net dll which secuentially calls a method which inserts a record in our database.

Our user fills up a grid with 15 records, we create an in memory xml which is passed to business tier. The business tier iterates through this xml calling a method in another dll which verifies some info and the inserts the record in the database.

The overall process takes around 40 seconds. We are trying to implement a solution which uses multitheading to execute the inserts in multiple threads. This way we think we can gain some tieme in the overall solution.

We would really appreciate if somebody could give us a link or example to accomplish this.


Here is our example of what we are trying to accomplish:

BEGIN CODE:

Public Class clsThread
Public myThread As System.Threading.Thread
Public myvalue As String

Sub startMultiThreadInsert(ByVal array As String())

Dim i As Integer

For i = 0 To array.Length - 1
myvalue = array(i).ToString
myThread = New System.Threading.Thread(AddressOf Insert)
Next

End Sub

Sub Insert()
clsBusiness.Insert(myvalue)
End Sub

END CODE:

We are not really sure if this is the best approach to do it , or if there is another way.

- AddressOff doesn´t accept parameters, we are using a global varaiable as a container for the value.

- I think we have a synch problem with the variable myvalue.

- We also would need some sort of notification when the threads are done doing the inserts, so we can know when everything has finished.

Many Thanks,
SAM
 
"(e-mail address removed)"
Hello,

We have a vb.net dll which secuentially calls a method which inserts a record in our database.

Our user fills up a grid with 15 records, we create an in memory xml
which is passed to business tier. The business tier iterates through this
xml calling a method in another dll which verifies some info and the inserts
the record in the database.
The overall process takes around 40 seconds. We are trying to implement a
solution which uses multitheading to execute the inserts in multiple
threads. This way we think we can gain some tieme in the overall solution.

MultiThreading is unlikely to help you here. Your problem isn't that you're
using up 100% of a single CPU, so having another thread run on a different
CPU won't help you. MultiThreading does little for data-bound code.
 
Dear John,

Thanks for your answer, but wouldn´t there be some gain in using multithreading.

Our code right now is secuential:

It loops for every record, validating data against the database. Before the insert we are callling 3 storeprocedures which validate that the data is correct, and after that the insert is done.

The overall time for each record is around 4-5 seconds, so if if we have 15 records, it takes 60 seconds for the whole operation to take place.

We think that if we use multithreading we can paralel the execution of the method, thus saving some time since we don´t have to wait for each operation to take place and finish.

Regards,
Sam
 
Sam,

you definitely will save some time with threads, but do not expect much
savings. You might gain much more if you will do validation before going
into this loop, which - validation - seems to take most of time. And you
might gain some more if validation will call one sp instead of 3 - create a
wrapper sp.

However, hard facts are more preferable here. Did you profile your sps and
inserts? Maybe you need to optimize stored procs before going into threads?

HTH
Alex

"(e-mail address removed)"
Dear John,

Thanks for your answer, but wouldn?t there be some gain in using multithreading.

Our code right now is secuential:

It loops for every record, validating data against the database. Before
the insert we are callling 3 storeprocedures which validate that the data is
correct, and after that the insert is done.
The overall time for each record is around 4-5 seconds, so if if we have
15 records, it takes 60 seconds for the whole operation to take place.
We think that if we use multithreading we can paralel the execution of the
method, thus saving some time since we don?t have to wait for each operation
to take place and finish.
 
I am not sure that Threading is really called for as there is overhead in thread context switching.

1. Does clsBusiness need to return any error info to the caller?

if not try this. . . C# code cause VB sux bigtime

public delegate void InsertDelegate (string AValue);

public class clsBusiness
{
void InsertD(string AValue);
{
// Your insert Code here

}
}

public class AServer
{
private InsertDelegate m_ID;
private clsBusinessClass m_Inserter;

public AServer()
{
private m_ID;
m_Inserter = new clsBusinessClass();
m_ID = InsertDelegate(m_Inserter.Insert);
}
public DoInsert(string AValue)
{
m_ID.BeginInvoke(AValue, null, null);
}
}

instance a Server and in your loop call
AServer.DoInsert([your value here])
 
you definitely will save some time with threads, but do not expect much

I'm not sure why you think this. A properly implemented thread solution can
cut the time considerably at the expense of other resources ofcourse. OP is
right, the parallel execution will help considerably and it will be roughly
equal to the number of inserts per stack of queries plus a delta for
overhead related to context switching.

I have used this successfully where upwards of 4000 queries needed to be
executed. The trick was to set a bound on the max number of threads being
spawned (thru trial and error, my came out to be 7 threads). The stack of
queries was divided by the number of threads with the leftover going to an
arbitrary thread. Each thread blasted its stack to the database
concurrently. My tests showed that 175 inserts or 300 updates could occur
per second on an informix engine without using stored procs. This was used
to set the maximum stack of queries a thread could use. In addition, the
application had to run on its own server because this approach is extremely
cpu intensive (quad box). The database had to be tuned to wait on locked
rows - it's not typically the default. A trivial example for 5000 queries
would move 1000 queries per thread on a 5 thread max solution, which would
take about 4 seconds. Sequentially, the queries would be done in about 1/2 a
minute, using the threading approach it was around 5 seconds - a
considerable improvement.

I wouldn't suggest this approach unless you knew exactly what you were doing
and had the gall to implement it correctly. I recall having to *beg Jon
Skeet to help me fix some nasty thread bugs that no one else seemed to know
how to fix...Thanks Jon.
 
Alvin,

here my point was underlying database engine. Biggest delays are caused by
disk access, used queries and underlying implementation of query processing.
You have some figures for Informix, which seem to be fine or impressive for
you. However you mention quad - 4 CPUs? And I think you did single-row
inserts and updates, when indexes are very efficient, right?
On standard single-CPU desktop engine like MS SQL / MSDE can do work faster
with threaded queries, but in certain cases query could be so processor/disk
heavy that multi-threaded version will behave nearly like single-threaded
one - initiated query must be completed before next one is able to run. It
could be related to locking behavior, non-deterministic deadlocking, volume
of data referenced, availability of indexes or something else. That's main
reason why I say that gains could be not very significant, while not
negating fact that in some cases you might have impressive results.
Generally speaking it is very application dependent topic. I expect you
would agree that to thread inefficient queries is inefficient practice,
where you cannot expect significant gains. Did I mention profilers in my
original post?

Anyway, after all this reasoning I would go for multi-threaded solution.
Single-threaded DB access is the lest efficient solution which one can
choose.

Alex

Alvin Bruney said:
you definitely will save some time with threads, but do not expect much
savings.

I'm not sure why you think this. A properly implemented thread solution can
cut the time considerably at the expense of other resources ofcourse. OP is
right, the parallel execution will help considerably and it will be roughly
equal to the number of inserts per stack of queries plus a delta for
overhead related to context switching.

I have used this successfully where upwards of 4000 queries needed to be
executed. The trick was to set a bound on the max number of threads being
spawned (thru trial and error, my came out to be 7 threads). The stack of
queries was divided by the number of threads with the leftover going to an
arbitrary thread. Each thread blasted its stack to the database
concurrently. My tests showed that 175 inserts or 300 updates could occur
per second on an informix engine without using stored procs. This was used
to set the maximum stack of queries a thread could use. In addition, the
application had to run on its own server because this approach is extremely
cpu intensive (quad box). The database had to be tuned to wait on locked
rows - it's not typically the default. A trivial example for 5000 queries
would move 1000 queries per thread on a 5 thread max solution, which would
take about 4 seconds. Sequentially, the queries would be done in about 1/2 a
minute, using the threading approach it was around 5 seconds - a
considerable improvement.

I wouldn't suggest this approach unless you knew exactly what you were doing
and had the gall to implement it correctly. I recall having to *beg Jon
Skeet to help me fix some nasty thread bugs that no one else seemed to know
how to fix...Thanks Jon.

--
Regards,
Alvin Bruney
[ASP.NET MVP http://mvp.support.microsoft.com/default.aspx]
Got tidbits? Get it here... http://tinyurl.com/27cok
AlexS said:
Sam,

you definitely will save some time with threads, but do not expect much
savings. You might gain much more if you will do validation before going
into this loop, which - validation - seems to take most of time. And you
might gain some more if validation will call one sp instead of 3 - create
a
wrapper sp.

However, hard facts are more preferable here. Did you profile your sps and
inserts? Maybe you need to optimize stored procs before going into
threads?

HTH
Alex

"(e-mail address removed)"

the insert we are callling 3 storeprocedures which validate that the data
is
correct, and after that the insert is done. have
15 records, it takes 60 seconds for the whole operation to take place.
method, thus saving some time since we don?t have to wait for each
operation
to take place and finish.
 
I am not sure that Threading is really called for as there is overhead in thread context switching.

1. Does clsBusiness need to return any error info to the caller?

if not try this. . . C# code cause VB sux bigtime

public delegate void InsertDelegate (string AValue);

public class clsBusiness
{
void InsertD(string AValue);
{
// Your insert Code here

}
}

public class AServer
{
private InsertDelegate m_ID;
private clsBusinessClass m_Inserter;

public AServer()
{
private m_ID;
m_Inserter = new clsBusinessClass();
// here is the fix
m_ID = new InsertDelegate(m_Inserter.Insert);
}

public DoInsert(string AValue)
{
m_ID.BeginInvoke(AValue, null, null);
}
}

instance a Server and in your loop call
AServer.DoInsert([your value here])
 
true.

i'd sit down and think about this or any other performance improvement
solution thoroughly before proceeding. For example, certain databases like
mysql cannot lock at the row level which makes threading pointless. Other
databases lock at the page level instead of the row level etc etc. Going
after performance involves understanding the underlying architecture
intimately. It's easy to put together an application that works, building
performance into an application takes expert knowledge.

--
Regards,
Alvin Bruney
[ASP.NET MVP http://mvp.support.microsoft.com/default.aspx]
Got tidbits? Get it here... http://tinyurl.com/27cok
AlexS said:
Alvin,

here my point was underlying database engine. Biggest delays are caused by
disk access, used queries and underlying implementation of query
processing.
You have some figures for Informix, which seem to be fine or impressive
for
you. However you mention quad - 4 CPUs? And I think you did single-row
inserts and updates, when indexes are very efficient, right?
On standard single-CPU desktop engine like MS SQL / MSDE can do work
faster
with threaded queries, but in certain cases query could be so
processor/disk
heavy that multi-threaded version will behave nearly like single-threaded
one - initiated query must be completed before next one is able to run. It
could be related to locking behavior, non-deterministic deadlocking,
volume
of data referenced, availability of indexes or something else. That's main
reason why I say that gains could be not very significant, while not
negating fact that in some cases you might have impressive results.
Generally speaking it is very application dependent topic. I expect you
would agree that to thread inefficient queries is inefficient practice,
where you cannot expect significant gains. Did I mention profilers in my
original post?

Anyway, after all this reasoning I would go for multi-threaded solution.
Single-threaded DB access is the lest efficient solution which one can
choose.

Alex

Alvin Bruney said:
you definitely will save some time with threads, but do not expect much
savings.

I'm not sure why you think this. A properly implemented thread solution can
cut the time considerably at the expense of other resources ofcourse. OP is
right, the parallel execution will help considerably and it will be roughly
equal to the number of inserts per stack of queries plus a delta for
overhead related to context switching.

I have used this successfully where upwards of 4000 queries needed to be
executed. The trick was to set a bound on the max number of threads being
spawned (thru trial and error, my came out to be 7 threads). The stack of
queries was divided by the number of threads with the leftover going to
an
arbitrary thread. Each thread blasted its stack to the database
concurrently. My tests showed that 175 inserts or 300 updates could occur
per second on an informix engine without using stored procs. This was
used
to set the maximum stack of queries a thread could use. In addition, the
application had to run on its own server because this approach is extremely
cpu intensive (quad box). The database had to be tuned to wait on locked
rows - it's not typically the default. A trivial example for 5000 queries
would move 1000 queries per thread on a 5 thread max solution, which
would
take about 4 seconds. Sequentially, the queries would be done in about
1/2 a
minute, using the threading approach it was around 5 seconds - a
considerable improvement.

I wouldn't suggest this approach unless you knew exactly what you were doing
and had the gall to implement it correctly. I recall having to *beg Jon
Skeet to help me fix some nasty thread bugs that no one else seemed to know
how to fix...Thanks Jon.

--
Regards,
Alvin Bruney
[ASP.NET MVP http://mvp.support.microsoft.com/default.aspx]
Got tidbits? Get it here... http://tinyurl.com/27cok
AlexS said:
Sam,

you definitely will save some time with threads, but do not expect much
savings. You might gain much more if you will do validation before
going
into this loop, which - validation - seems to take most of time. And
you
might gain some more if validation will call one sp instead of 3 - create
a
wrapper sp.

However, hard facts are more preferable here. Did you profile your sps and
inserts? Maybe you need to optimize stored procs before going into
threads?

HTH
Alex

"(e-mail address removed)"
Dear John,

Thanks for your answer, but wouldn?t there be some gain in using
multithreading.

Our code right now is secuential:

It loops for every record, validating data against the database.
Before
the insert we are callling 3 storeprocedures which validate that the data
is
correct, and after that the insert is done.

The overall time for each record is around 4-5 seconds, so if if we have
15 records, it takes 60 seconds for the whole operation to take place.

We think that if we use multithreading we can paralel the execution of
the
method, thus saving some time since we don?t have to wait for each
operation
to take place and finish.

Regards,
Sam

:

"(e-mail address removed)"
Hello,

We have a vb.net dll which secuentially calls a method which inserts
a
record in our database.

Our user fills up a grid with 15 records, we create an in memory xml
which is passed to business tier. The business tier iterates through
this
xml calling a method in another dll which verifies some info and the
inserts
the record in the database.

The overall process takes around 40 seconds. We are trying to
implement a
solution which uses multitheading to execute the inserts in multiple
threads. This way we think we can gain some tieme in the overall
solution.

MultiThreading is unlikely to help you here. Your problem isn't that
you're
using up 100% of a single CPU, so having another thread run on a
different
CPU won't help you. MultiThreading does little for data-bound code.
 
As with all the above points, it seems that the bottleneck is your database
and this is where you would find more predictable performance gains. Multi
threading could cause more problems than it solves.. you have database
locking, thread synchronisation, transactions( just to mention a few) to
worry about. As mentioned before, multithreading where CPU usage is high
won't really help anyway.

If you cannot change the database or its configuration, you might instead
consider scaling out to use multiple application servers. Or you might need
to live with the timing of processing, but make your application
asynchronous.

Asynchronous distributed applications make the most efficient use of
resources. Of course, you would then need to manage the end users
expectations!

Good Luck anyway.

Dirc

Alvin Bruney said:
true.

i'd sit down and think about this or any other performance improvement
solution thoroughly before proceeding. For example, certain databases like
mysql cannot lock at the row level which makes threading pointless. Other
databases lock at the page level instead of the row level etc etc. Going
after performance involves understanding the underlying architecture
intimately. It's easy to put together an application that works, building
performance into an application takes expert knowledge.

--
Regards,
Alvin Bruney
[ASP.NET MVP http://mvp.support.microsoft.com/default.aspx]
Got tidbits? Get it here... http://tinyurl.com/27cok
AlexS said:
Alvin,

here my point was underlying database engine. Biggest delays are caused by
disk access, used queries and underlying implementation of query
processing.
You have some figures for Informix, which seem to be fine or impressive
for
you. However you mention quad - 4 CPUs? And I think you did single-row
inserts and updates, when indexes are very efficient, right?
On standard single-CPU desktop engine like MS SQL / MSDE can do work
faster
with threaded queries, but in certain cases query could be so
processor/disk
heavy that multi-threaded version will behave nearly like single-threaded
one - initiated query must be completed before next one is able to run. It
could be related to locking behavior, non-deterministic deadlocking,
volume
of data referenced, availability of indexes or something else. That's main
reason why I say that gains could be not very significant, while not
negating fact that in some cases you might have impressive results.
Generally speaking it is very application dependent topic. I expect you
would agree that to thread inefficient queries is inefficient practice,
where you cannot expect significant gains. Did I mention profilers in my
original post?

Anyway, after all this reasoning I would go for multi-threaded solution.
Single-threaded DB access is the lest efficient solution which one can
choose.

Alex

Alvin Bruney said:
you definitely will save some time with threads, but do not expect much
savings.

I'm not sure why you think this. A properly implemented thread solution can
cut the time considerably at the expense of other resources ofcourse.
OP
is
right, the parallel execution will help considerably and it will be roughly
equal to the number of inserts per stack of queries plus a delta for
overhead related to context switching.

I have used this successfully where upwards of 4000 queries needed to be
executed. The trick was to set a bound on the max number of threads being
spawned (thru trial and error, my came out to be 7 threads). The stack of
queries was divided by the number of threads with the leftover going to
an
arbitrary thread. Each thread blasted its stack to the database
concurrently. My tests showed that 175 inserts or 300 updates could occur
per second on an informix engine without using stored procs. This was
used
to set the maximum stack of queries a thread could use. In addition, the
application had to run on its own server because this approach is extremely
cpu intensive (quad box). The database had to be tuned to wait on locked
rows - it's not typically the default. A trivial example for 5000 queries
would move 1000 queries per thread on a 5 thread max solution, which
would
take about 4 seconds. Sequentially, the queries would be done in about
1/2 a
minute, using the threading approach it was around 5 seconds - a
considerable improvement.

I wouldn't suggest this approach unless you knew exactly what you were doing
and had the gall to implement it correctly. I recall having to *beg Jon
Skeet to help me fix some nasty thread bugs that no one else seemed to know
how to fix...Thanks Jon.

--
Regards,
Alvin Bruney
[ASP.NET MVP http://mvp.support.microsoft.com/default.aspx]
Got tidbits? Get it here... http://tinyurl.com/27cok
Sam,

you definitely will save some time with threads, but do not expect much
savings. You might gain much more if you will do validation before
going
into this loop, which - validation - seems to take most of time. And
you
might gain some more if validation will call one sp instead of 3 - create
a
wrapper sp.

However, hard facts are more preferable here. Did you profile your
sps
and
inserts? Maybe you need to optimize stored procs before going into
threads?

HTH
Alex

"(e-mail address removed)"
Dear John,

Thanks for your answer, but wouldn?t there be some gain in using
multithreading.

Our code right now is secuential:

It loops for every record, validating data against the database.
Before
the insert we are callling 3 storeprocedures which validate that the data
is
correct, and after that the insert is done.

The overall time for each record is around 4-5 seconds, so if if we have
15 records, it takes 60 seconds for the whole operation to take place.

We think that if we use multithreading we can paralel the execution of
the
method, thus saving some time since we don?t have to wait for each
operation
to take place and finish.

Regards,
Sam

:

"(e-mail address removed)"
Hello,

We have a vb.net dll which secuentially calls a method which inserts
a
record in our database.

Our user fills up a grid with 15 records, we create an in memory xml
which is passed to business tier. The business tier iterates through
this
xml calling a method in another dll which verifies some info and the
inserts
the record in the database.

The overall process takes around 40 seconds. We are trying to
implement a
solution which uses multitheading to execute the inserts in multiple
threads. This way we think we can gain some tieme in the overall
solution.

MultiThreading is unlikely to help you here. Your problem isn't that
you're
using up 100% of a single CPU, so having another thread run on a
different
CPU won't help you. MultiThreading does little for data-bound code.
 
Back
Top