Bug in SSIS 2005 - or is it a feature?

  • Thread starter: John Kotuby
Hi all,

I was trying to accomplish what I thought would be a simple task as part of a
much larger data migration: import a comma-delimited CSV file into a SQL
table. As it turns out, if there is a bad-data row in the source CSV, the
data-flow task does not simply pass that row via the "error" workflow arrow
into a Row Count task meant to keep track of the error count and then move on
to the next record (that was easy to configure in DTS 2000).

Instead the whole Data Flow task is failed. It appears that I don't even get
a chance to intercede in the event of a bad source record. I read a few
posts about older packages failing after .NET 3.5 SP1 is installed.

Apparently the PrimeOutput() method of the data-flow task simply fails the
task upon encountering a malformed record.

I tried redirecting failed rows to a Row Count transform, which writes the
row count to an Int32 variable. The red line from the mdc_msc.csv source is
even labelled "Flat File Source Error Output".

However, the package is still halted with a fatal error.
It is clear to me after looking at the CSV file in a text editor that on row
1032 (of 155,000) there is a missing comma delimiter. Of course, I could fix
that line only to discover another 20 or so bad records.

This package must be run daily to update a working database. It must run
unattended and merely keep track of failed rows.

I get the "input" files from a 3rd party so cannot fix whatever might be
causing the error in the CSV file used for input.

Below I have typed the error messages.

Error: The column delimiter for column "Column 13" was not found.
Error: An error occurred while processing file "mdc_msc.csv" on data row 2003.
Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on
component "Source - mdc_msc_csv" (1) returned error code 0xC0202092. The
component returned a failure code when the pipeline engine called
PrimeOutput(). The meaning of the failure code is defined by the component,
but the error is fatal and the pipeline stopped executing. There may be error
messages posted before this with more information about the failure.
Error: SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited
with error code 0xC0047038.

Thanks for any comment. I wish I could present this to a Microsoft SSIS
expert.
 
An expert said to direct the error output to a Row Count task. Unfortunately
that does not work. The failure to "prep" the input source row apparently
causes the Data Flow task to abend. This really does appear to be a bug, if
there is no way to capture the error at the source in a Data Flow task.

Just wondered if anybody in this group has run across the same problem. I am
putting this out there so that if someone does have a similar problem, they
will know it's probably not due to a coding error on their part.

A workaround is to run a Bulk Insert task (rather than a Data Flow) with the
help of a format file. However, that method does not produce any error-file
output.
 
Thank you Alexey,
I will try to post in the proper groups next time.

I have scanned the forum posting, and what I get out of it is that I may have
to run a Script task to examine each row for errors (meaning I need to
anticipate what may be wrong with the source data -- a difficult, maybe
impossible task) and then route the rows to a particular pipeline depending
on the results of the Script task.
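The row-examination idea can at least be sketched outside SSIS. Below is a minimal Python sketch (not the SSIS Script task itself; the expected field count of 3 in the usage is an assumption for illustration) that routes each raw line to a "good" or "bad" pile based on its parsed field count, which is essentially what the Script task would have to do:

```python
import csv

def split_rows(lines, expected):
    """Route each raw CSV line to good or bad based on its parsed field count."""
    good, bad = [], []
    for lineno, row in enumerate(csv.reader(lines), start=1):
        # A row with too few or too many fields (e.g. a missing delimiter
        # or a stray quote merging fields) goes to the bad pile.
        (good if len(row) == expected else bad).append((lineno, row))
    return good, bad

good, bad = split_rows([
    "a,b,c",      # well-formed
    "a,b",        # missing delimiter -> short row
    'a,"b,b",c',  # a properly quoted comma is still one field
], expected=3)
```

A real implementation would write the bad rows (with their line numbers) to an error file for the daily unattended run, instead of keeping them in memory.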

In the data I was using, there appeared to be 1 bad row out of 150,000. The
bad data was caused by too many quotes surrounding a character field. That
apparently threw off the field synchronization between the source and the
metadata table created by the package. In my opinion, that occurrence should
simply spit the row out through the error path (the little red arrow) to
whatever garbage collector I wish to use, and not cause the task to fail
completely.

I admit that there is probably a lot that I don't understand about SSIS
2005, however I think my take on the situation is just good common sense.
Let's just say that if I were coding a tool like this, a malformed row would
be spit out and recorded...simple as that.

I have seen numerous posts about slight variations in SQL Service Packs, or
even just patches, that will cause an SSIS package to run on one machine and
fail on another. Whatever happened to backward compatibility within the same
product generation?


Although this is the wrong group, did you already see this?
http://social.msdn.microsoft.com/Fo.../thread/674a1ce3-fe1b-4d14-81cb-ea235fbf8855/
 
I believe there is a data transfer task property to set the allowed number
of failures. It defaults to 0, so any error cancels the task, but I seem to
remember you could increase that limit. Then you'll get the error rows in
the error output, and the good rows should be processed.
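To illustrate the semantics of such an error limit (this is a behavioural sketch in Python, not SSIS itself): a loader that tolerates bad rows up to a configurable maximum, redirecting them as it goes, and only aborts once the limit is exceeded. With the limit at 0, the first bad row kills the whole load, which matches the failure described above.

```python
import csv

class TooManyErrors(Exception):
    pass

def load(lines, expected, max_errors=0):
    """Mimic an error-row limit: collect bad rows until max_errors is exceeded."""
    loaded, errors = [], []
    for lineno, row in enumerate(csv.reader(lines), start=1):
        if len(row) == expected:
            loaded.append(row)
        else:
            errors.append((lineno, row))
            if len(errors) > max_errors:
                raise TooManyErrors(f"row {lineno}: error limit {max_errors} exceeded")
    return loaded, errors

rows = ["a,b,c", "a,b", "d,e,f"]
# Raising the limit lets the good rows through and keeps a tally of the bad ones.
loaded, errors = load(rows, expected=3, max_errors=5)
```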

You could also import the data into a pre-qualifying table with a single
varchar field, if you can check for erroneous rows with SQL.
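That staging-table approach can be mocked up with SQLite (the table name, column name, and expected field count are assumptions for illustration): load every line untouched into a single varchar column, then use SQL to flag rows whose delimiter count is wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (raw_line VARCHAR(4000))")

# Load every line untouched into the single-column staging table.
lines = ["a,b,c", "a,b", "d,e,f"]
conn.executemany("INSERT INTO staging (raw_line) VALUES (?)",
                 [(l,) for l in lines])

# Flag rows whose comma count is wrong (3 expected fields => 2 commas),
# counting commas via the length-of-replace trick.
bad = conn.execute("""
    SELECT rowid, raw_line FROM staging
    WHERE length(raw_line) - length(replace(raw_line, ',', '')) <> 2
""").fetchall()
```

Note that a naive comma count will misfire on legitimately quoted commas inside a field, so the real cleansing query would need to be tailored to the feed.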
 