Huge data needs to be transfer from Fixed width Text File to SQL S

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi Experts,
I have huge fixed width text files around 9GB or 60 TO 75 millions records.
I need to transfer 300GB data from these Text files to SQL Server 2000
monthly bases. I am trying to develop VB.NET application. One problem is I
need to transform this data, like date is 6 char (mmddyy) which needs to be
convert into real Date (mm/dd/ccyy) during the import. I tried to use
datareader to bring data into DataTable and then append data into SQL Server
using SQLBulkInsert but its taking long time and start failing after 10-16
millions records appended into SQL Server 2000 table.

I have following questions:-

1- Which class-method should I use to read the data from text file?
2- How to perform Transformation data ( like converting 6 char date fields
into DATE and other validations)?
3- Which Class-method should I use to append data into SQL Server 2000 to
get the best performance?
4- Is there any better way to done the same job?

I am using VB.NET 2005 Beta.

Thank you so much for your time and help.


Some code what I am using right now to do so

Using rdrDataFile As New TextFieldParser(mstrDataFilePath)


rdrDataFile.TextFieldType = FileIO.FieldType.FixedWidth
rdrDataFile.SetFieldWidths(objDataFile.SplitPositions)

arrNumToExclude.AddRange(objDataFile.ColumnIndicesToExclude)
tblDataFile = objDataFile.StagingTable
Dim currentRow As String()
Dim rowIndex As Integer = 0

destConn = New SqlConnection(strConnectionString)
'Creates Connection
destConn.Open() 'Opens Connection

' intTotalRowCount = 1
While Not rdrDataFile.EndOfData
Try
Dim myRow As DataRow = tblDataFile.NewRow
currentRow = rdrDataFile.ReadFields()
Dim currentField As String
Dim columnIndex As Integer = 0
Dim intCounter As Integer = 0
For Each currentField In currentRow
If Not arrNumToExclude.Contains(intCounter)
Then
myRow(columnIndex) =
objDataFile.FormatData(columnIndex, currentField)
columnIndex += 1
End If
intCounter += 1
Next
tblDataFile.Rows.Add(myRow)

Catch ex As Exception

End Try
rowIndex += 1
If rowIndex = 100000 Or rdrDataFile.EndOfData Then
'Upload data To Database in Batches
BulkCopyToDB(tblDataFile,
objDataFile.DestinationTable) 'Copy Data into DB
tblDataFile.Clear()
rowIndex = 0
End If

'vpbDataFile.Value = IIf(intTotalRowCount <
vpbDataFile.Maximum, intTotalRowCount, vpbDataFile.Value)
'intTotalRowCount += 1
End While

End Using



Public Sub BulkCopyToDB(ByVal vtblDataFile As DataTable, ByVal vstrDestTable
As String)
Dim bcp As SqlBulkCopy = Nothing

Dim strConnectionString As String = ""
Try

bcp = New SqlBulkCopy(destConn)
bcp.DestinationTableName = vstrDestTable
bcp.WriteToServer(vtblDataFile)

Catch ex As Exception
Finally
If Not bcp Is Nothing Then
bcp.Close()
End If
End Try
End Sub

Regards,
-Permood
 
PerMood,

Try this.

Use the SqlBulkCopy but give it a batchsize. That way, it will
transactionally rollback to a little before the point it failed at. (I
assume you don't need the whole copy to be transactional).

And then obviously build some logic to recover from the point it failed at.

- Sahil Malik [MVP]
Upcoming ADO.NET 2.0 book - http://tinyurl.com/9bync
 
Hi Experts,
I have huge fixed width text files around 9GB or 60 TO 75 millions
records.
I need to transfer 300GB data from these Text files to SQL Server 2000
monthly bases. I am trying to develop VB.NET application.

Why are not you using DTS?
What you're trying to accomplish is exactly their main purpose.

Massimo
 
Massimo,
For 2 reasons i am going to develop the Application,
1- Transform the data, because I have to change data into Date and money
from Char and I need to perform certain validation which is very hard to
implement in TQSL as i am using SQL Server 2000.
2- There are 35 scripts, which currently perform this operation, and this is
done manually, I need to make a application where i can schulede them and
make it more rebust and conditional as compare to TSQL functionality.

I am able to do so in VB.Net but the only problem i having now, Import time
increased 2-3 times as compare to bulk Insert in SQL Server.
 
For 2 reasons i am going to develop the Application,
1- Transform the data, because I have to change data into Date and money
from Char and I need to perform certain validation which is very hard to
implement in TQSL as i am using SQL Server 2000.

If you find TSQL is not the right choice for your job, you can use Windows
Scripting (VBScript or else) inside DTS packages; I'm not a VBScript
developer, but I'm quite sure it's powerful enough to convert text strings
into dates ;-)
2- There are 35 scripts, which currently perform this operation, and this
is
done manually, I need to make a application where i can schulede them and
make it more rebust and conditional as compare to TSQL functionality.

You can run a DTS package with any scheduling you like, and you can manage
the package workflow as accurately as you need; i.e., you can do a task
(like a data validation), then do other tasks (like mailing an
administrator, or executing a query on another DB, or beginning a data
transfer from DB A to DB B) based on the outcome of the first one.
Oh, and you can also write store stored procedures and call them from the
package, of course.
I am able to do so in VB.Net but the only problem i having now, Import
time
increased 2-3 times as compare to bulk Insert in SQL Server.

Of course; bulk insert (like DTS) uses its own bulk insertion APIs, which
are *way* faster than any ADO/ADO.NET/ODBC query-based application.

I suggest you have a better look inside SQL Server's DTS. They're exactly
what you're looking for, but lots of people wrongly think they can be used
only to import an Access DB into SQL Server; there's a lot more to them than
this.

By the way, have you set the database recovery mode to "bulk-logged"? This
can *really* save you time (and transaction logs size) on big data import
operations, as it makes the DB log only the whole operation in the
transaction logs, as opposed to logging every single row inserted.


Massimo
 
Back
Top