Fastest way to import large fixed width file into a DataTable

BostonNole

Using Visual Studio 2005, .NET 2.0 and VB.NET:

I am looking for the fastest possible way to import a very large fixed
width file (over 6 million records and over 1.2 GB file size) into a
DataTable.

Any suggestions?
 
BostonNole,

What are you planning to do with the data once it is in a datatable?

Kerry Moorman
 
What is the goal ?

A datatable is supposed to hold only as much data as a user really needs (not a
whole db or table). If you want to import into SQL Server, I would rather use
the SqlBulkInsert capability (external in 1.1, included as part of ADO.NET
in 2.0) than use a datatable as an intermediate step...
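
For completeness, a minimal sketch of that bulk-load route in .NET 2.0 (the
class name is SqlBulkCopy, as corrected below); the connection string, table
name and column layout here are hypothetical:

Imports System.Data
Imports System.Data.SqlClient

Module BulkLoadSketch
    Sub Main()
        ' Hypothetical staging table with two columns.
        Dim table As New DataTable("Records")
        table.Columns.Add("Id", GetType(Integer))
        table.Columns.Add("Name", GetType(String))
        table.Rows.Add(1, "Sample row")

        Using connection As New SqlConnection("Data Source=.;Initial Catalog=Staging;Integrated Security=True")
            connection.Open()
            Using bulk As New SqlBulkCopy(connection)
                bulk.DestinationTableName = "dbo.Records" ' hypothetical target table
                bulk.BatchSize = 10000                    ' commit in batches, not one huge transaction
                bulk.WriteToServer(table)
            End Using
        End Using
    End Sub
End Module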
 
I need to run some validation rules, formatting rules and business
rules against the data, then dump it out to another text file.
 
SqlBulkInsert??? I think you mean the SqlBulkCopy class.

SQL Server or any other relational database is not an option.
 
I saw in the other thread that this is to check the file. What kind of checks?
Basically:

- Either you can check on a row-per-row basis, in which case you could perhaps
read the records using buffered reads and check from there (avoiding loading
the whole thing into memory as a single huge chunk). You could also do what you
can on a row-per-row basis and keep just what is needed for row-against-row
comparisons (for example, if you need to make sure you have unique values, you
only have to keep the pk field; keeping the other fields is not needed). Also,
do you need a datatable, or could it be a data structure of your own? (See the
sketch at the end of this post.)

- My personal preference for the second option would be to load this into a db
so that I can run whatever SQL statements would allow me to check it, in case
you need to check a row against another row (not an option in your case) or run
any other kind of queries.

- My last resort would be to load it into a datatable (and once loaded, you'll
still have to do the checks).

My first step would be to see what I can check while reading and what I really
need to keep for further checks...
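
As a rough illustration of the row-per-row route, here is a sketch under
assumed positions (key in the first 10 characters, an amount field in the next
12); the file names and field layout are made up:

Imports System.IO
Imports System.Text
Imports System.Collections.Generic

Module RowByRowSketch
    Sub Main()
        ' Keep only the key field for the row-against-row (uniqueness) check.
        Dim seenKeys As New Dictionary(Of String, Boolean)

        ' Large buffers, but only one line is held in memory at a time.
        Using reader As New StreamReader("input.txt", Encoding.ASCII, False, 65536)
            Using writer As New StreamWriter("output.txt", False, Encoding.ASCII, 65536)
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    Dim key As String = line.Substring(0, 10)
                    Dim amount As String = line.Substring(10, 12) ' example of a per-row check

                    If Not seenKeys.ContainsKey(key) AndAlso amount.Trim().Length > 0 Then
                        seenKeys.Add(key, True)
                        writer.WriteLine(line) ' or a reformatted version of the record
                    End If

                    line = reader.ReadLine()
                Loop
            End Using
        End Using
    End Sub
End Module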
 
Using Visual Studio 2005, .NET 2.0 and VB.NET:

I am looking for the fastest possible way to import a very large fixed
width file (over 6 million records and over 1.2 GB file size) into a
DataTable.

Any suggestions?

You could possibly write a SQL function that does the data massaging, call
the function from code, and let the SQL server do all the work.

While I don't work with files the size you're dealing with, generally when I
have to work with a flat file I convert each record to an object, and in the
object's constructor I massage my data; then I use the object to do the
insertion. Granted, you're running through the list twice, but the insert
goes really fast once it's out of the file.
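
A minimal sketch of that record-object idea, with made-up field positions and
rules (the class name and layout are hypothetical):

Public Class FixedWidthRecord
    Public ReadOnly Id As String
    Public ReadOnly Name As String
    Public ReadOnly Amount As Decimal

    Public Sub New(ByVal line As String)
        ' Massage the raw fixed-width data while the object is being built.
        Id = line.Substring(0, 10).Trim()
        Name = line.Substring(10, 30).Trim().ToUpper()
        Amount = Decimal.Parse(line.Substring(40, 12).Trim())
    End Sub

    Public Function ToOutputLine() As String
        ' Rebuild a fixed-width line for the output file.
        Return Id.PadRight(10) & Name.PadRight(30) & Amount.ToString("0.00").PadLeft(12)
    End Function
End Class

Each line read from the file becomes a New FixedWidthRecord(line), and the
validated objects can then be written back out or handed to an insert routine.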
 
Hello,
I need to run some validation rules, formatting rules and business
rules against the data, then dump it out to another text file.

Can you give me an idea of the point of this? It sounds to me like going
from here to Brussels via Tokyo. Probably not even a nice trip.

Cor
 
Can't do a SQL call. SQL Server and other relational databases are not an
option.

The goal is simple: find the fastest, most efficient way to import a
very large fixed-width text file into some type of object (preferably
a DataTable) in .NET.
 
I need to run some validation rules, formatting rules and business
rules against the data, then dump it out to another text file.

With the size of file you're talking about, moving it all into memory is
probably not going to be a good option, and that is exactly what you're
talking about by loading it into a DataTable.

So the question becomes: do your validation rules depend on data from
other records in the file, or can each record be treated individually?
Because if they can be treated individually, then I would consider reading
groups of records (you said it was a fixed-width file), processing them,
and writing them out to the new file, repeating until finished. Most
likely, the slow part of this is going to be your reads and writes... So,
by reading and writing them in chunks (say, building your output records
in a StringBuilder), you can minimize the number of disk hits.
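
A sketch of that chunked read/process/write idea, assuming each record can be
validated on its own; the chunk size, file names and ProcessRecord placeholder
are arbitrary:

Imports System.IO
Imports System.Text

Module ChunkedSketch
    Sub Main()
        Const ChunkSize As Integer = 50000
        Dim buffer As New StringBuilder(ChunkSize * 100)
        Dim count As Integer = 0

        Using reader As New StreamReader("input.txt")
            Using writer As New StreamWriter("output.txt")
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    buffer.AppendLine(ProcessRecord(line))
                    count += 1

                    ' Write the built-up output in big chunks to keep disk hits down.
                    If count Mod ChunkSize = 0 Then
                        writer.Write(buffer.ToString())
                        buffer.Length = 0
                    End If

                    line = reader.ReadLine()
                Loop
                writer.Write(buffer.ToString()) ' whatever is left over
            End Using
        End Using
    End Sub

    Private Function ProcessRecord(ByVal line As String) As String
        ' Placeholder for the validation, formatting and business rules.
        Return line.TrimEnd()
    End Function
End Module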
 