Parsing in Access

  • Thread starter Thread starter Zardoz
  • Start date Start date
Z

Zardoz

Damit! I try this update in a query and it randomly cuts off text!!

UPDATE Pathfindermerge SET Pathfindermerge.Street =
trim(InStr([Address]," ")-1);

sample field looks like this:

MARLEEN HOUSTON

what happens is for example in the case of the city "Houston", I wind
up with something like "ouston" or if its "Pasadena", I wind up with
"adena". The program seems to be electively discounting the consonants.


What I really need is something to parse this field:

1218 FRESA PASADENA, TX 775020000
442 COUNTY ROAD 470 Apt# ETOILE, TX 75944-6408
5128 PINE AVENUE PINE TREE ASSISTED LIVIN PASADENA, TX 775030000
^note the spaces that occur ^I guess
the four spaces denoted a field

and so on... into seperate building number, apartment number, street
info, City, State, Zip and Plus 4 fields. What commands would I use to
identify a character sequence, such TX or CA or Apt and begin capturing
that string until it hits the next occuring "SPACE"? Somwhere, someone
knows how to do it, maybe a god or alien being or bionic man but
certainly not me!!!
 
The Trim function removes preceeding or trailing spaces. That's it.

However, I'm surprised you're not getting an error with that SQL statement,
as InStr([Address], " ") is going to give you a number (the position of the
first space in Address, or 0 if there are no spaces in it), and Trim
shouldn't work on a numeric field.

You've definitely got a non-trivial problem. If there are always 4 spaces
before the city, state and zip, you should be able to get that without too
much problem (Mid([Address], InStr([Address], " ") + 4) should give you
the city, state and zip code. Splitting that at the comma should give you
city and state, and if you know it's always a 2 character state
abbreviation, that should be easy to get out). The rest, though, will be
tricky.

To loop through a string, look at each character in turn, you'd use
something like:

Dim intLoop As Integer
Dim strChar As String

For intLoop = 1 To Len(strAddress)
strChar = Mid(strAddress, intLoop, 1)
' strChar will now be a character from the address

Next intLoop
 
As Doug says, it's a non-trivial problem. It is quite likely that there
will be ambiguities in the data that make it impossible to achieve a
100% solution.

But you can get a long way by using regular expressions. For instance,
this pattern (expanded by adding more states if needed)
"^.+\s{2,4}(\w+.*?),?\s+(AZ|LA|NM|TX)\s+(\d+-?\d*)\s*$"
will extract the city, state and zip code from your samples.

There's a sample function here which uses a regular expression to parse
many US "street" addresses (in your data, the part before the city
name), though it will need modification to handle your samples):
http://www.j.nurick.dial.pipex.com/Code/vbRegex/ParseAddress36.htm

Links on that page take you to more functions and information about
using regular expressions to parse text data.

Damit! I try this update in a query and it randomly cuts off text!!

UPDATE Pathfindermerge SET Pathfindermerge.Street =
trim(InStr([Address]," ")-1);

sample field looks like this:

MARLEEN HOUSTON

what happens is for example in the case of the city "Houston", I wind
up with something like "ouston" or if its "Pasadena", I wind up with
"adena". The program seems to be electively discounting the consonants.


What I really need is something to parse this field:

1218 FRESA PASADENA, TX 775020000
442 COUNTY ROAD 470 Apt# ETOILE, TX 75944-6408
5128 PINE AVENUE PINE TREE ASSISTED LIVIN PASADENA, TX 775030000
^note the spaces that occur ^I guess
the four spaces denoted a field

and so on... into seperate building number, apartment number, street
info, City, State, Zip and Plus 4 fields. What commands would I use to
identify a character sequence, such TX or CA or Apt and begin capturing
that string until it hits the next occuring "SPACE"? Somwhere, someone
knows how to do it, maybe a god or alien being or bionic man but
certainly not me!!!
 
Back
Top