Parsing

  • Thread starter: Becky

Becky

I have data that I had to get from .odf (I only have
reader) into my Access database.

I have text like the following:

085 041 WEST ALLIS 2601 S. 108TH STREET WEST ALLIS WI
532270000

This needs to be parsed and put in the correct fields:
i.e. DC = 085
Sore = 041
Location = West Allis
Address = 2601 S. 108th Street
City = West Allis
State = WI
Zip = 53227-0000
Opening = in this case unknown

I have around 500 such records. Any idea how this can be
done? I know how to do Field1: Left$([text],3) but this is
complicated.

Thanks,
Becky
 
Are there any consistencies in the records? The data doesn't appear to be
positional, and it looks as though the number of blank-delimited "words"
that go into any particular field may vary. If that's the case, you're not
going to have much success parsing the data programmatically. If you can
state a set of rules to follow in the parsing, those rules can be
implemented in code; but if you can't, how can the computer do it?

One possible approach that may work for many -- but almost certainly not
all -- of the records is to take the first two "words" as fields DC and Sore
(Store?). Then take everything up to the first numeric digit as the
Location, and everything from the first numeric digit to one of the keywords
"street", "road", "avenue", etc. (and their abbreviations) to be the
Address. From the end of the Address to the first 2-character token is the
City, and the 2-character abbreviation is the State, followed by the Zip,
followed by the Opening (whatever that is).

This would be a very imprecise and error-prone parsing, but you're starting
with imprecise data.
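
The rules above could be sketched in VBA roughly as follows. This is only an illustration under stated assumptions, not a tested solution for all 500 records: the names StoreRecord and ParseStoreLine and the list of street keywords are made up for the example, and the code assumes each record arrives as one string with blanks between the words.

```vba
Option Explicit

' Hypothetical record layout; field names follow the original post.
Public Type StoreRecord
    DC As String
    Sore As String
    Location As String
    Address As String
    City As String
    State As String
    Zip As String
    Opening As String
End Type

Public Function ParseStoreLine(ByVal rawText As String) As StoreRecord
    ' Assumed keyword list; extend it to match the real data.
    Const SUFFIXES As String = " STREET ST ROAD RD AVENUE AVE BOULEVARD BLVD DRIVE DR LANE LN HIGHWAY HWY "
    Dim rec As StoreRecord
    Dim tokens() As String
    Dim i As Long, last As Long, stage As Long
    Dim word As String

    ' Turn line breaks into blanks and collapse runs of blanks.
    rawText = Replace(Replace(rawText, vbCr, " "), vbLf, " ")
    Do While InStr(rawText, "  ") > 0
        rawText = Replace(rawText, "  ", " ")
    Loop
    tokens = Split(Trim$(rawText), " ")
    last = UBound(tokens)
    If last < 2 Then Exit Function   ' fewer than three words: nothing to parse

    ' Rule 1: the first two words are DC and Sore.
    rec.DC = tokens(0)
    rec.Sore = tokens(1)

    ' stage: 0 = Location, 1 = Address, 2 = City, 3 = Zip, 4 = Opening
    stage = 0
    For i = 2 To last
        ' Rule 2: the Address starts at the first word that begins with a digit.
        If (stage = 0) And (tokens(i) Like "#*") Then stage = 1
        Select Case stage
            Case 0
                rec.Location = Trim$(rec.Location & " " & tokens(i))
            Case 1
                rec.Address = Trim$(rec.Address & " " & tokens(i))
                ' Rule 3: the Address ends at the first street keyword.
                word = UCase$(Replace(tokens(i), ".", ""))
                If InStr(SUFFIXES, " " & word & " ") > 0 Then stage = 2
            Case 2
                ' Rule 4: the first 2-character word after the Address is the State.
                If Len(tokens(i)) = 2 Then
                    rec.State = tokens(i)
                    stage = 3
                Else
                    rec.City = Trim$(rec.City & " " & tokens(i))
                End If
            Case 3
                rec.Zip = tokens(i)
                ' Show a 9-digit zip as 5 digits, a hyphen, and 4 digits.
                If rec.Zip Like "#########" Then
                    rec.Zip = Left$(rec.Zip, 5) & "-" & Right$(rec.Zip, 4)
                End If
                stage = 4
            Case 4
                ' Anything after the Zip is treated as the Opening.
                rec.Opening = Trim$(rec.Opening & " " & tokens(i))
        End Select
    Next i

    ParseStoreLine = rec
End Function
```

For the sample record, this gives Location "WEST ALLIS", Address "2601 S. 108TH STREET", City "WEST ALLIS", State "WI" and Zip "53227-0000". It will misparse records that break the rules: an address such as "100 ST. PAUL AVE" would be cut off at "ST.", and a city such as "DE PERE" would have "DE" taken as the State. The results should be reviewed by eye. If mixed case is wanted, StrConv with vbProperCase can be applied to the text fields afterwards.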
 
What is ODF? Can you export a tab- or comma-delimited file?
If you can, you can just import the data into Access.

HS
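
If such an export turns out to be possible, the import can also be run from code. A minimal sketch, assuming a comma-delimited file whose first row holds the field names; the table name tblStores and the file path are made up for the example:

```vba
Public Sub ImportStores()
    ' acImportDelim imports a delimited text file.
    ' The second argument (import specification name) is left blank here.
    ' The final True says the first row of the file contains field names.
    DoCmd.TransferText acImportDelim, , "tblStores", "C:\Data\stores.csv", True
End Sub
```

A tab-delimited file would most likely need a saved import specification, passed by name as the second argument.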
 
Becky,
Dirk is certainly correct that strings with no consistent
structure preclude a precise set of parsing rules. If it would
help to analyze each token in the string, you could laboriously
examine them one by one by calling the Split function with a
blank as the delimiter. The resulting array then holds all the
components of the string, indexed from 0 to n.
Good luck.
Bill
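
To see what Split produces for a record, something like the following could be run from the Immediate window. The procedure name ListTokens is made up for the example, and it assumes the record has already been joined into one line with single blanks between the words:

```vba
Public Sub ListTokens(ByVal rawText As String)
    Dim tokens() As String
    Dim i As Long

    ' Split on a blank; the resulting array is zero-based.
    tokens = Split(rawText, " ")

    ' Print each index next to its token.
    For i = LBound(tokens) To UBound(tokens)
        Debug.Print i, tokens(i)
    Next i
End Sub
```

Called with the sample record, it lists twelve tokens, from index 0 ("085") to index 11 ("532270000").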
 