Transaction scope aborting

  • Thread starter: phancey
hi,

I have a .NET program that moves data from tables in one database to
tables in another, then moves the rows in those tables on to further
tables, transforming them in the process. It seems to work fine most of
the time, but when I run it against a big dataset I get "transaction
aborted", and the timing doesn't match up with my coded timeouts, so it
seems like it could be a logging problem or something similar. I have
tried setting the database's update-statistics option to asynchronous
just in case, but that hasn't fixed it. It is not a problem with the
specific data, database, or server, as I have run it against two
different sets of data on two different databases and servers. It might
be an MSDTC issue.
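
One thing I still need to rule out is the machine-wide System.Transactions ceiling: as far as I can tell, whatever timeout you pass to a TransactionScope is capped by TransactionManager.MaximumTimeout (10 minutes by default, configurable in machine.config), so my 3600-second scopes may not actually be getting 3600 seconds. A quick check, just as a sketch:

using System;
using System.Transactions;

class TimeoutCheck
{
    static void Main()
    {
        // The timeout handed to a TransactionScope is clamped to the
        // machine-wide maximum (machine.config, <system.transactions>
        // maxTimeout, 10 minutes by default), so a 3600-second scope may
        // silently be running with a much shorter limit.
        Console.WriteLine("DefaultTimeout: " + TransactionManager.DefaultTimeout);
        Console.WriteLine("MaximumTimeout: " + TransactionManager.MaximumTimeout);
    }
}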


I am using TransactionScope with connections that have MARS
(MultipleActiveResultSets=True) and Asynchronous Processing=true set in
the connection string. The pattern goes something like this:

using (TransactionScope outerTs = new TransactionScope(
           TransactionScopeOption.Required, TimeSpan.FromSeconds(3600)))
using (OldConnection oldConn = new OldConnection())
using (NewConnection newConn = new NewConnection())
{
    newConn.DoInsert();
    newConn.DoAnotherInsert();
    oldConn.DoRead();

    // asynchronously, for the first 4 tables:
    //     extractDelegate = table.Extract;
    //     extractDelegate.BeginInvoke(oldConn, newConn);

    // while there are still tables left to process:
    //     WaitAny();
    //     reset the wait handle;
    //     set the next thread off using the recently vacated WaitHandle;

    // synchronous
    firstTable.Transform(null, newConn);

    // asynchronously, for the first 4 tables:
    //     DependentTransaction dependent = Transaction.Current.DependentClone(
    //         DependentCloneOption.BlockCommitUntilComplete);
    //     transformDelegate = table.Transform;
    //     transformDelegate.BeginInvoke(dependent, newConn);

    // while there are still tables left to process:
    //     WaitAny();
    //     reset the wait handle;
    //     set the next thread off using the recently vacated WaitHandle;

    outerTs.Complete();
}
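
To make the "4 at a time" throttling above concrete, here is roughly the shape of it as a compilable sketch. ITableJob, ExtractAction and ExtractThrottle are illustrative names made up for this post; OldConnection and NewConnection are my own wrapper types from the code above, and the real version also collects results and handles failures properly.

using System;
using System.Collections.Generic;
using System.Threading;

// Illustrative contract for the per-table Extract work.
public interface ITableJob
{
    void Extract(OldConnection oldConn, NewConnection newConn);
}

public delegate void ExtractAction(OldConnection oldConn, NewConnection newConn);

public static class ExtractThrottle
{
    // Runs at most 4 Extract calls at a time; as each one signals its
    // wait handle, the freed slot is reused for the next table.
    public static void Run(Queue<ITableJob> tables,
                           OldConnection oldConn, NewConnection newConn)
    {
        const int maxConcurrent = 4;
        ManualResetEvent[] handles = new ManualResetEvent[maxConcurrent];

        for (int i = 0; i < maxConcurrent; i++)
        {
            handles[i] = new ManualResetEvent(false);
            if (tables.Count > 0)
                StartNext(tables, oldConn, newConn, handles, i);
            else
                handles[i].Set();                    // nothing for this slot
        }

        while (tables.Count > 0)
        {
            int freed = WaitHandle.WaitAny(handles); // wait for any slot to free up
            handles[freed].Reset();                  // reclaim it
            StartNext(tables, oldConn, newConn, handles, freed);
        }

        WaitHandle.WaitAll(handles);                 // drain the last in-flight workers
    }

    static void StartNext(Queue<ITableJob> tables,
                          OldConnection oldConn, NewConnection newConn,
                          ManualResetEvent[] handles, int slot)
    {
        ExtractAction extract = tables.Dequeue().Extract;
        extract.BeginInvoke(oldConn, newConn, ar =>
        {
            try { extract.EndInvoke(ar); }           // observe worker exceptions
            catch (Exception) { /* real code records the failure */ }
            finally { handles[slot].Set(); }         // free the slot
        }, null);
    }
}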


// this is a table-specific proc
void Extract(OldConnection oldConn, NewConnection newConn, DataTable dTable)
{
    // This sets up a number of independent transactions to move the data
    // across from the old database to the staging area of the new one.
    // We're not bothered if it fails halfway and leaves committed data,
    // because these tables are only a temporary area before we do the
    // transform. However, it is possible that lots of these transactions
    // get completed BUT the original outerTs rolls back. Although we are
    // not using the connections passed in as parameters here, I also have
    // a version that doesn't commit each table separately but does them
    // as a batch; that version requires a persistent connection.

    DataTable tmpTable = null;
    try
    {
        using (TransactionScope ts = new TransactionScope(
                   TransactionScopeOption.RequiresNew, TimeSpan.FromSeconds(3600)))
        using (OldConnection oldConn1 = new OldConnection())
        using (NewConnection newConn1 = new NewConnection())
        {
            // truncates the temporary staging area for this table
            newConn1.TruncateTempTable();

            // gets the data from the old database into a DataTable
            tmpTable = GetData(oldConn1, fromTrans, toTrans, idList);

            // commits the truncation
            ts.Complete();
        }

        using (OldConnection oldConn1 = new OldConnection())
        {
            // In most cases this just points the output table at tmpTable,
            // but in some cases it moves the data across to a new DataTable,
            // transforming it in the process. In those cases it will often
            // call InsertIntoTemp during the transform, because in one case
            // it was ending up with too many rows to process at one time.
            TransformData(tmpTable, ref dTable);
        }
    }
    catch
    {
        if (dTable != null)
        {
            dTable.Clear();
            dTable.Dispose();
        }
        throw;
    }
    finally
    {
        if (tmpTable != null)
        {
            tmpTable.Dispose();
            tmpTable = null;
        }
    }

    try
    {
        // independent transaction to SqlBulkCopy the data from the
        // DataTable into the temporary staging area
        InsertIntoReportingTemp(newConn, dTable);

        // independent transaction does an insert from the temporary
        // staging area
        InsertIntoMap(rptConn);
    }
    finally
    {
        if (dTable != null)
        {
            dTable.Clear();
            dTable.Dispose();
        }
    }
}
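
Since the staging loads are deliberately independent of outerTs anyway, one variation I could try (named plainly: swapping the independent RequiresNew transactions used for the staging load for a suppressed scope) is to bulk copy into the staging table with no ambient transaction at all, so that step never has to enlist anywhere. A sketch, with an illustrative connection string and table name:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void LoadStagingTable(DataTable rows, string newDbConnectionString)
{
    // Suppress: inside this block there is no ambient transaction, so the
    // connection does not enlist in outerTs (or the DTC) at all. The staging
    // table is throwaway, so losing atomicity here doesn't matter.
    using (TransactionScope suppress = new TransactionScope(TransactionScopeOption.Suppress))
    using (SqlConnection conn = new SqlConnection(newDbConnectionString))
    {
        conn.Open();

        using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
        {
            bulk.DestinationTableName = "dbo.StagingTemp";  // illustrative name
            bulk.BatchSize = 1000;
            bulk.BulkCopyTimeout = 0;                       // no per-batch timeout
            bulk.WriteToServer(rows);
        }

        suppress.Complete();   // nothing to commit, but closes the scope cleanly
    }
}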


// this is a table-specific proc
void Transform(object trans, NewConnection newConn)
{
    DependentTransaction dependent = (DependentTransaction)trans;

    // When called synchronously, trans is null and the work simply runs
    // inside the caller's ambient transaction.
    if (dependent == null)
    {
        newConn.InsertIntoReporting();
        return;
    }

    try
    {
        using (TransactionScope ts = new TransactionScope(
                   dependent, TimeSpan.FromSeconds(3600)))
        {
            newConn.InsertIntoReporting();
            ts.Complete();
        }
    }
    finally
    {
        dependent.Complete();
        dependent.Dispose();
    }
}

I have tried to simplify it to its basics but of course it is possible
that I've omitted the thing that's making it go wrong.

I have tried a number of variations. As shown, it goes wrong when it
tries to make the asynchronous call to Transform. So I changed that call
to synchronous (minus the DependentTransaction stuff); this works up to
the final outerTs.Complete(), which then fails. If I change the Extract
to synchronous as well it still doesn't work (i.e. no asynchronous code
at all, just nested but independent transactions). Yet it works fine,
asynchronously and/or synchronously, on smaller datasets - so why?

There is one table which, if I exclude it, lets the whole process work
fine - it runs in around 4-5 minutes. If I change the program to ONLY
extract this table, that also works fine. It is the combination of all
of them that fails. This "problem" table consists of 2,000+ rows and is
one of two tables with the TransformData method overridden (called in
the Extract process); it expands those 2,000+ rows into about 30 million
rows. I have modified this to commit in chunks of about 200,000 rows at
a time (see the sketch below), but it still fails right at the end on
outerTs.Complete(), despite the fact that those 30 million rows have
been committed and all of the staging area has been committed. Part of
the problem may be that the 30 million rows are then inserted into the
final destination tables as part of the Transform - maybe it just can't
cope with that volume of data? How can I make it cope? I can't commit
this bit in chunks because it is not just a staging area. Anyway, I
don't really understand why it fails on the first async insert after the
Extract; that doesn't seem to relate to the number of records inserted
and not yet committed.
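
For reference, the chunked commit mentioned above looks roughly like this - a sketch only, with an illustrative destination table name and a hypothetical sequence of pre-built chunks; the real code builds the chunks inside TransformData. Each chunk goes through its own RequiresNew scope so it commits independently of outerTs:

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void BulkCopyInChunks(IEnumerable<DataTable> chunks, string newDbConnectionString)
{
    foreach (DataTable chunk in chunks)              // roughly 200,000 rows each
    {
        // RequiresNew: each chunk commits on its own, regardless of what
        // eventually happens to the outer ambient transaction.
        using (TransactionScope ts = new TransactionScope(TransactionScopeOption.RequiresNew))
        using (SqlConnection conn = new SqlConnection(newDbConnectionString))
        {
            conn.Open();                             // enlists in the new scope only

            using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.ReportingTemp";  // illustrative name
                bulk.BatchSize = 1000;
                bulk.BulkCopyTimeout = 0;
                bulk.WriteToServer(chunk);
            }

            ts.Complete();
        }
    }
}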

thanks
Phil
 
OK, this is definitely just a numbers thing. If I change my transform so
that, in the case of the problem table, it expands to just 5 million
rows, it works fine; 30 million doesn't. And there is no obvious error
message - it just says "Transaction aborted". What is limiting it? The
system is running SQL Server 2005 on a RAID server, with the logs on an
array of 2 disks, data on 3 disks, tempdb on 2 disks, and the operating
system on a single disk - all on one RAID controller.

I am sure the logs are struggling, but I would expect slowness, not an
error.

Any ideas? Memory usage did not seem to be a major problem.

thanks
Phil
 
Could 30 million rows be blowing your log? Do you have your log set to
auto-grow, or is it fixed? What size is it currently?
 

hi there,

the log file is limited, but to 2,097,152 MB (2 TB), which I'm sure is
way over what it needs. The disk itself has 108 GB of free space and the
log file is currently (after a 5 million row import) around 12 GB. If it
WAS the log file, wouldn't there be an error saying so? Just to make
sure, I changed it to unrestricted growth - it still failed, with no
error message apart from "The transaction has aborted" and nothing in
the SQL Server logs. I repeated the 5 million row run and it worked fine
again. I then inverted the transform (producing 25 million rows - the
difference) and that fell over too. I would say it's a problem with
MSDTC somehow; I might try setting up a test that does it without the
need for separate connections and separate transactions, and just do it
all in one big transaction.

Any other ideas are welcome, because it's taken me ages to work out that
it's just dependent on the number of records and not something
inherently wrong in the code.
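
For what it's worth, log usage during the run can be watched with something like this - just a sketch; the connection string is whatever points at the new database:

using System;
using System.Data.SqlClient;

static void ReportLogSpace(string connectionString)
{
    // DBCC SQLPERF(LOGSPACE) returns one row per database:
    // Database Name, Log Size (MB), Log Space Used (%), Status.
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand("DBCC SQLPERF(LOGSPACE)", conn))
    {
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                Console.WriteLine("{0}: {1:F0} MB, {2:F1}% used",
                    reader.GetString(0),
                    Convert.ToDouble(reader[1]),
                    Convert.ToDouble(reader[2]));
            }
        }
    }
}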

thanks
Phil
 

Tested with no asynchronous threads - just a single transaction spanning
2 connections (though one of them only does reads). Got the error "The
Microsoft Distributed Transaction Coordinator (MS DTC) has cancelled the
distributed transaction" after writing 10,290,149 rows successfully,
while trying to write the next batch of 2,155,299 (using SqlBulkCopy
with a batch size of 1000). No other message, which is infuriating!!
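
To squeeze out more detail than "the transaction has aborted", the next thing I'll try is wrapping the scope like this (a sketch): the TransactionAbortedException thrown from Complete/Dispose sometimes carries an inner exception with the real reason, and the TransactionCompleted event reports the final status.

using System;
using System.Transactions;

static void RunWithDiagnostics(Action body)
{
    try
    {
        using (TransactionScope scope = new TransactionScope(
                   TransactionScopeOption.Required, TimeSpan.FromSeconds(3600)))
        {
            // Report the final outcome of the ambient transaction.
            Transaction.Current.TransactionCompleted += (sender, e) =>
                Console.WriteLine("Final status: " +
                    e.Transaction.TransactionInformation.Status);

            body();                // the extract/transform work goes here
            scope.Complete();
        }
    }
    catch (TransactionAbortedException ex)
    {
        // The inner exception (a timeout, a DTC error, etc.) is often more
        // informative than the bare "transaction has aborted" message.
        Console.WriteLine("Aborted: " + ex.Message);
        if (ex.InnerException != null)
            Console.WriteLine("Reason: " + ex.InnerException);
        throw;
    }
}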

Phil
 
OK, I have now used only simple transactions, with no MSDTC involved at
all (roughly the shape sketched below), and this worked on the full 30
million rows. So it could be the way I'm using MSDTC, but then why would
it work on the 5 million row example? Does anyone know of MSDTC bugs
that could be causing the problem?
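
For anyone curious what I mean by "simple transactions", it is roughly the shape below - a sketch with illustrative connection strings and hypothetical DoReads/DoWrites helpers: the reads use their own connection outside any transaction, and all the writes go through a single SqlConnection with an explicit SqlTransaction, so nothing is ever promoted to a distributed transaction.

using System;
using System.Data.SqlClient;

static void RunLocalOnly(string oldDbConnectionString, string newDbConnectionString)
{
    using (SqlConnection oldConn = new SqlConnection(oldDbConnectionString))
    using (SqlConnection newConn = new SqlConnection(newDbConnectionString))
    {
        oldConn.Open();
        newConn.Open();

        using (SqlTransaction tx = newConn.BeginTransaction())
        {
            try
            {
                DoReads(oldConn);            // reads: no transaction at all
                DoWrites(newConn, tx);       // every command on newConn is
                                             // given tx explicitly
                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}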

thanks
 