Checking a CSV file.

  • Thread starter: Smith

Hello,

I'm writing a program that will take a CSV file and perform a number of checks
on the fields in each line.
The field values can be strings or numbers.
Can someone advise on the best approach to deal with this?

Many thanks in advance

S
 
It's a little difficult unless you actually tell the group what checks you
intend to perform...

Well,

I'm just thinking about how to arrange the logic that does the checks.
The actual check could be a range check on the numbers, or looking
up a name in a database, but I thought this would not really matter. Well,
it is obvious that I could write a method to check each field, but this does
not sound "generic" to me. I think there must be cleverer approaches, maybe
something similar to the way compilers parse lines of code?

No worries about the actual CSV parsing of the file, as I will use LINQ to
do this in the way shown below:

var query = from line in File.ReadAllLines(FileName)
            where !line.StartsWith("#")
            let parts = line.Split(',')
            select new
            {
                ISBN = parts[0],
                Title = parts[1],
                Publisher = parts[2],
            };



Cheers

S
 
If you need to do a distinct check for every field, then you have to
do a distinct check for every field. If it's something sufficiently
complicated, it may be worth refactoring it into a separate method,
and calling that for all fields it applies to (parametrizing as
needed) - that's about as generic as it gets. For simple range checks,
even that is probably superfluous.

You're aware that this reads the entire file into memory at once,
right? It may not be a problem, depending on the size of the file, but
still worth keeping it in mind.

Since you're working with specific fields here anyway (you enumerate
them inside "new {...}"), it would seem that doing the same for
verification is the most reasonable approach.
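
A minimal sketch of what that could look like: a couple of small parametrized check methods, called explicitly for each field. The names FieldChecks and LineValidator, the fourth "Year" column, and the 1450 to 2100 range are all assumptions made up for illustration; only ISBN, Title and Publisher come from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class FieldChecks
{
    // Parametrized range check: true when the text parses as an
    // integer that lies in [min, max].
    public static bool InRange(string text, int min, int max)
    {
        int value;
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value)
            && value >= min && value <= max;
    }

    // True when the field contains something other than whitespace.
    public static bool NotEmpty(string text)
    {
        return text != null && text.Trim().Length > 0;
    }
}

public static class LineValidator
{
    // Assumed layout: ISBN, Title, Publisher, Year.
    // The Year column is hypothetical; it is not in the original query.
    public static List<string> Validate(string[] parts)
    {
        var errors = new List<string>();
        if (parts.Length < 4)
        {
            errors.Add("Expected at least 4 fields");
            return errors;
        }
        if (!FieldChecks.NotEmpty(parts[0])) errors.Add("ISBN is empty");
        if (!FieldChecks.NotEmpty(parts[1])) errors.Add("Title is empty");
        if (!FieldChecks.NotEmpty(parts[2])) errors.Add("Publisher is empty");
        if (!FieldChecks.InRange(parts[3], 1450, 2100)) errors.Add("Year out of range");
        return errors;
    }
}
```

Each field still gets its own line in Validate, but the logic that is shared between fields (range checking, emptiness) lives in one place. A database lookup could be added as another method on the same pattern.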
 
1) Yes, you are correct. I will fix it with the following:

public static class StreamReaderEnumerable
{
    // Yields the reader's lines one at a time instead of loading the whole file.
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        if (source == null) throw new ArgumentNullException("source");
        string line;
        while ((line = source.ReadLine()) != null)
            yield return line;
    }
}



2) Please explain further. I did not get it.
 
Assuming you're also working with a database, you could load the file into a
staging table and use SQL for your testing.
 
Create a class whose constructor takes a string (one line of the file).

Create a property for each field in the CSV file.

I also usually add a bool IsBad property to the class, and use IsBad
to flag records that fail validation.

List<MyRecord> mListOfRecords = new List<MyRecord>();

....
// loop through the records
mListOfRecords.Add(new MyRecord(theLine));
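
As a rough sketch of such a class, assuming the three fields from the query earlier in the thread (ISBN, Title, Publisher): the "every field must be non-empty" rule below is only a placeholder for whatever checks are really needed, and the plain Split(',') does not handle commas inside quoted fields.

```csharp
using System;

public class MyRecord
{
    // One property per CSV field.
    public string ISBN { get; private set; }
    public string Title { get; private set; }
    public string Publisher { get; private set; }

    // Set by the constructor when the line fails validation.
    public bool IsBad { get; private set; }

    public MyRecord(string line)
    {
        string[] parts = (line ?? string.Empty).Split(',');

        // Too few fields: mark the record bad and leave the properties unset.
        if (parts.Length < 3)
        {
            IsBad = true;
            return;
        }

        ISBN = parts[0].Trim();
        Title = parts[1].Trim();
        Publisher = parts[2].Trim();

        // Placeholder rule (an assumption): every field must be non-empty.
        IsBad = ISBN.Length == 0 || Title.Length == 0 || Publisher.Length == 0;
    }
}
```

After loading, the list can be split into good and bad records by testing IsBad on each element.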
 