String Parsing Query

  • Thread starter: David Webb

David Webb

Hi,

A client is providing me with a CSV and one of the fields is
concatenated. It contains Suburb, State and Postcode in the one
field, but I need it separated. E.g. I'm being sent

Calamvale Qld 4116

To get the postcode, I can just grab the rightmost 4 digits, but I'm not
sure how I can separate the suburb from the state. I can't split on spaces,
because if I'm given the following

Mt Gravatt Qld 4122

then I'd end up with suburb of Mt, state of Gravatt. Is there any easy
way that I can parse this string that I'm being sent?

Thanks in advance for any assistance.

Kind Regards,

David.
 
I'd use a regular expression. This one should do the trick:

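// Needs: using System.Text.RegularExpressions;
// "address" is assumed to hold the combined field, e.g. "Mt Gravatt Qld 4122".
string suburb, state;
int postcode;

// Lazy suburb, letters-only state, digits-only postcode, single-space separators.
Match match = Regex.Match(address.Trim(),
    @"^(?<suburb>[\w ]+?)[ ]+(?<state>[A-Za-z]+)[ ]+(?<postcode>\d+)$");
if (match.Success)
{
    suburb = match.Groups["suburb"].Value;
    state = match.Groups["state"].Value;
    postcode = int.Parse(match.Groups["postcode"].Value);
}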


Greetings, Christian
 
Hi,
you could do a 'split' using space as the delimiter, something like...

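Dim strItems() As String
Dim iIndex As Integer
strItems = Split(address, " ")
iIndex = UBound(strItems)
PostCode = strItems(iIndex)
State = strItems(iIndex - 1)
For ict As Integer = 0 To iIndex - 2
    ' Put back the space that Split removed between suburb words
    If ict > 0 Then suburb = suburb & " "
    suburb = suburb & strItems(ict)
Next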

Pete


--
Pete Vickers
Microsoft Windows Embedded MVP
HP Business Partner
http://www.gui-innovations.com

 
David,

I suspect you are thinking left to right. In this case, the spaces in the
suburb name are a problem.

However, if you think right to left, the problem becomes simpler.

Postcode = rightmost 4 chars
State = rightmost chars 8 to 6
Suburb = all leftmost chars less 9 of 'em

In VB code, it would look something like this

strx = "Mt Gravatt Qld 4122"
intlenx = len(strx)
strPC = right(strx,4)
strState = mid(strx, intlenx-7, 3)
strSuburb = left(strx, intlenx-9)

I have not tested this, but it looks correct. Might have to watch out for
two char states... ;-)

- Saul
 
Use String.LastIndexOf(char) to find the last space (i.e. the one before the
postcode). Use String.LastIndexOf(char, int) to find the space before that.
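
A minimal sketch of that idea in C#, assuming the field always ends with a state and a postcode and that the parts are separated by single spaces. The AddressParser class and ParseAddress method are illustrative names, not something from this thread. The postcode is kept as a string so a leading zero is not lost.

```csharp
using System;

static class AddressParser
{
    // Hypothetical helper: splits "Suburb State Postcode" working from the right.
    public static (string Suburb, string State, string Postcode) ParseAddress(string field)
    {
        string s = field.Trim();

        // Last space: everything after it is the postcode.
        int lastSpace = s.LastIndexOf(' ');
        if (lastSpace <= 0)
            throw new FormatException("Expected 'Suburb State Postcode': " + field);

        // Space before that: search backwards starting just left of the last space.
        int prevSpace = s.LastIndexOf(' ', lastSpace - 1);
        if (prevSpace <= 0)
            throw new FormatException("Expected 'Suburb State Postcode': " + field);

        string suburb = s.Substring(0, prevSpace);
        string state = s.Substring(prevSpace + 1, lastSpace - prevSpace - 1);
        string postcode = s.Substring(lastSpace + 1);
        return (suburb, state, postcode);
    }
}
```

Calling AddressParser.ParseAddress("Mt Gravatt Qld 4122") gives suburb "Mt Gravatt", state "Qld" and postcode "4122". Because nothing depends on the state being exactly three characters, two-letter states are handled the same way.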
 
All,

I find it interesting that such a simple question can have such varied
responses...

Of course, I like my solution best ;-) but I can see how the others may be
better depending on your perspective.

David, did you sort it out? Which code did you use??

Regards,
Saul
 
Hi Saul,

I agree with you - I liked your solution the best! For no other reason
than it seemed to make sense to me and had me thinking "outside the
square" - as you say, think right to left. One of the other solutions
had me looking for spaces but suburbs with spaces would cause
problems.

There was an example in C# but my feeble brain couldn't decipher it.
Thanks to all who posted though - as always, the help was greatly
appreciated.

Kind Regards,

David.