Ron Rosenfeld said:
I have huge sheet with thousands of non-standard lines of text. Some of the
lines have the following line in a cell:
ProjID=MFG-Company-DA-33 ProjType=BBAA Uplift=0 CostType=Time Mgr=Smith
I would like to understand how to write the VBA that will skip the lines
that don't match this format and then Break out the items that do. This is
what I have so far.
some sort of loop statement here
If activecell.value (left(5))="projID" then
' this is where a need the help. I want to move everything after ProjID= to
Type= to the next column then move everything after Type= to Uplift= to a
column 2 columns away and so on for the rest of the line.
else activecell.offset (-1,0)
loop
End if
Thanks for the help. I would use text to columns, but only few of the
cells follow the format listed above. Thanks
You haven't specifically defined your format, and I'm not clear if you want the
label to be extracted also, but I made these assumptions to attract the various
segments:
Each segment to be extracted is preceded by a specific ?????= and followed by a
<space>.
The ?????= labels all occur in the order given, and there are five of them in
each relevant line.
If the pattern doesn't match, extract nothing.
That being the case, the macro below will extract the relevant data, from lines
that match the format above:
========================================
Option Explicit
Sub SplitSome()
Dim rg As Range
Dim c As Range
Dim i As Long
'set up the range to process
'can be done in a variety of ways
'below assumes the range is contiguous
'and in one column
Set rg = Selection(1, 1)
Set rg = Range(rg.End(xlUp), rg.End(xlDown))
'set up pattern for regex engine
Dim re As Object, mc As Object
Dim Str As String
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern =
"ProjID=(.*?)\s+ProjType=(.*?)\s+Uplift=(.*)\s+CostType=(.*?)\s+Mgr=(.*)\s*"
'check all lines
For Each c In rg
Str = c.Value
'if the pattern doesn't match, will return nothing
If re.test(Str) = True Then
Set mc = re.Execute(Str)
'there must be five segments
For i = 1 To 5
c.Offset(0, i * 2).Value = mc(0).submatches(i - 1)
Next i
End If
Next c
End Sub
=========================================
--ron
I found this very interestin ron and have several uses for it not sure i Full understand waht is happening here if you could send me some linksto decifer this ,[ "ProjID=(.*?)\s+ProjType=(.*?)\s+Uplift=(.*)\s+CostType=(.*?)\s+Mgr=(.*)\s*"], it would be much appreciated
The line split, and the "full line" would be:
re.Pattern = _
"ProjID=(.*?)\s+ProjType=(.*?)\s+Uplift=(.*)\s+CostType=(.*?)\s+Mgr=(.*)\s*"
It is a regular expression pattern.
Here are some references regarding regular expressions (I'm not sure if they
are all still good):
Regular Expressions
http://support.microsoft.com/default.aspx?scid=kb;en-us;818802&Product=vbb
http://msdn2.microsoft.com/en-us/library/6wzad2b2.aspx
http://www.regex-guru.info/
http://www.regular-expressions.info/reference.html
http://msdn2.microsoft.com/en-us/library/ms974619.aspx
Here is a formal explanation of that regex:
ProjID Regex
ProjID=(.*?)\s+ProjType=(.*?)\s+Uplift=(.*)\s+CostType=(.*?)\s+Mgr=(.*)\s*
Match the characters “ProjID=” literally «ProjID=»
Match the regular expression below and capture its match into backreference
number 1 «(.*?)»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as
needed (lazy) «*?»
Match a single character that is a “whitespace character” (spaces, tabs, line
breaks, etc.) «\s+»
Between one and unlimited times, as many times as possible, giving back as
needed (greedy) «+»
Match the characters “ProjType=” literally «ProjType=»
Match the regular expression below and capture its match into backreference
number 2 «(.*?)»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as
needed (lazy) «*?»
Match a single character that is a “whitespace character” (spaces, tabs, line
breaks, etc.) «\s+»
Between one and unlimited times, as many times as possible, giving back as
needed (greedy) «+»
Match the characters “Uplift=” literally «Uplift=»
Match the regular expression below and capture its match into backreference
number 3 «(.*)»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back
as needed (greedy) «*»
Match a single character that is a “whitespace character” (spaces, tabs, line
breaks, etc.) «\s+»
Between one and unlimited times, as many times as possible, giving back as
needed (greedy) «+»
Match the characters “CostType=” literally «CostType=»
Match the regular expression below and capture its match into backreference
number 4 «(.*?)»
Match any single character that is not a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as
needed (lazy) «*?»
Match a single character that is a “whitespace character” (spaces, tabs, line
breaks, etc.) «\s+»
Between one and unlimited times, as many times as possible, giving back as
needed (greedy) «+»
Match the characters “Mgr=” literally «Mgr=»
Match the regular expression below and capture its match into backreference
number 5 «(.*)»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back
as needed (greedy) «*»
Match a single character that is a “whitespace character” (spaces, tabs, line
breaks, etc.) «\s*»
Between zero and unlimited times, as many times as possible, giving back as
needed (greedy) «*»
Created with RegexBuddy
--ron