Word: select text, search, delete, in a macro, how?

  • Thread starter: Oggy

Oggy

I want to record a macro where I search for the beginning of some text
I'm interested in, select a section of text, search for the end of the
text I'm interested in, and then delete the resulting selection. The
trouble is that the section can vary in size. For example, I would
want to delete everything between <div> and </div>:

<div id="productdetails">
<table border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="header" colspan="2">Product Details</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">Dewey</td>
<td class="fieldvalue">813</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">Format</td>
<td class="fieldvalue">Paperback</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">Cover Price</td>
<td class="fieldvalue">$1.99</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">No. of Pages</td>
<td class="fieldvalue">352</td>
</tr>
<tr valign="top">
<td class="fieldlabel"
nowrap="1">Height x Width</td>
<td class="fieldvalue">
<font size="+0">192
x
126
mm
</font>
</td>
</tr>
</table>
</div>


The trouble I find with Word is that I can't seem to mark the beginning
of the text I'm interested in, search for the end, and then delete that
section. I need to do this with keystrokes as I record the steps for
a macro.

Can anyone help?
 
Hi Oggy,

Word can probably do what you want with a wildcard-based Find/Replace operation. For example:
Sub CleanUp()
    With Selection
        .HomeKey Unit:=wdStory
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "(\<div id)(*)(\<\/div\>)"
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            .Execute Replace:=wdReplaceAll
        End With
    End With
End Sub

However, you haven't been clear about what distinguishes the string you posted from similar strings you want to retain. The above macro will delete all such strings in the document.
 
Oggy replied to macropod's post of 8/11/2009 2:30 AM:

Thanks for the help! Thankfully, every section that I would want to
delete has a unique "id=". For instance, in the following...

<div id="productdetails">
<table border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="header" colspan="2">Product Details</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">LoC </td>
<td class="fieldvalue">CPBBoxno.1797 .vol. 19</td>
</tr>
<tr valign="top"> ...
</table>
</div>
</td>
<td valign="top" width="50%">
<div id="personaldetails">
<table border="0" cellspacing="0" cellpadding="0">
<tr>
<td class="header" colspan="2">Personal Details</td>
</tr>
<tr valign="top">
<td class="fieldlabel" nowrap="1">Links</td>
<td class="fieldvalue">
<br/>
<br/>
</td>
</tr>
</table>
</div>

...I want to delete the table data between:

<div id="productdetails"> .. </div>, and

<div id="personaldetails"> .. </div>.


Therefore, I suppose I can replace your text above with:

..Text = "(\<div id="productdetails")(*)(\<\/div\>)" ???


...and repeat for:

..Text = "(\<div id="personaldetails")(*)(\<\/div\>)" ???
 
Hi Oggy,

Yes, you could do something like that. For flexibility, you might use something along the lines of:
Sub CleanUp()
    Dim StrID As String
    StrID = InputBox("Please input the ID string to delete")
    With Selection
        .HomeKey Unit:=wdStory
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "(\<div id=""" & StrID & """)(*)(\<\/div\>?)"
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            .Execute Replace:=wdReplaceAll
        End With
    End With
End Sub

The final '?' in the Find expression causes the macro to delete the character following the closing 'div' expression - this would ordinarily be a paragraph mark or line-feed.
 
Oggy replied to macropod's post of 8/11/2009 10:44 AM:

Hey.. thanks again. That works great. But I don't really need the
prompt. I tried modifying the code to do specific id=strings but the
macro didn't do anything. This is my code:

Sub CleanUp()
With Selection
.HomeKey Unit:=wdStory
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "(\<div id=""" & personalmain & """)(*)(\<\/div\>?)"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
.Execute Replace:=wdReplaceAll
End With
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "(\<div id=""" & productdetails & """)(*)(\<\/div\>?)"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
.Execute Replace:=wdReplaceAll
End With
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "(\<div id=""" & personaldetails & """)(*)(\<\/div\>?)"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
.Execute Replace:=wdReplaceAll
End With
End With
End Sub
 
Hi Oggy,

Try:
Sub CleanUp()
    With Selection
        .HomeKey Unit:=wdStory
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "(\<div id=""personalmain"")(*)(\<\/div\>?)"
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            .Execute Replace:=wdReplaceAll
            .Text = "(\<div id=""productdetails"")(*)(\<\/div\>?)"
            .Execute Replace:=wdReplaceAll
        End With
    End With
End Sub
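
A note on why the earlier attempt most likely did nothing: personalmain, productdetails and personaldetails were written without quotes, so VBA would have treated them as undeclared, empty variables (assuming Option Explicit was not set), and the search text became id="" each time. If the list of IDs grows, one option is to keep the names as quoted strings in an array and loop over them. The following is only a sketch along those lines; the macro name CleanUpByIds, the variables divIds and divId, and the particular list of IDs are illustrative assumptions, not something from the posts above.

```vba
Sub CleanUpByIds()
    ' Hypothetical variant of CleanUp: the ID list below is an assumption,
    ' edit it to match the div ids you actually want removed.
    Dim divIds As Variant
    Dim divId As Variant
    divIds = Array("personalmain", "productdetails", "personaldetails")

    With Selection
        .HomeKey Unit:=wdStory
        With .Find
            ' Settings shared by every replacement
            .ClearFormatting
            .Replacement.ClearFormatting
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False
            .MatchWildcards = True
            ' One wildcard Replace All pass per ID
            For Each divId In divIds
                .Text = "(\<div id=""" & divId & """)(*)(\<\/div\>?)"
                .Execute Replace:=wdReplaceAll
            Next divId
        End With
    End With
End Sub
```

Each ID is a quoted string here, so the Find text is built the same way as in the InputBox version, just without the prompt.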
 
That works really well. Thanks! I learned a lot about macros.

I built the macro call in a DOS batch file since I have to apply it to
over a hundred files every week. Do you know how to automatically save the
file and close the macro so that the DOS loop can proceed to the next
file? Right now, when Word is launched with the macro, the file stays
open after the macro runs. I would like to automatically save the
file, and close it.


macropod said the following on 8/11/2009 6:36 PM:
 
Hi Oggy,

Assuming this is the only open document, you could add:

ActiveDocument.Save
Application.Quit

to the macro, before the 'End Sub' line.
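
To make the placement concrete, here is a minimal sketch, assuming (as above) that the document being cleaned is the only one open. The macro name CleanUpAndQuit is hypothetical, and only one replacement is shown to keep it short.

```vba
Sub CleanUpAndQuit()
    ' Hypothetical name; same Find/Replace approach as CleanUp above
    With Selection
        .HomeKey Unit:=wdStory
        With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "(\<div id=""productdetails"")(*)(\<\/div\>?)"
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = False
            .MatchWildcards = True
            .Execute Replace:=wdReplaceAll
        End With
    End With

    ' Save the cleaned document, then close Word so the calling
    ' batch loop can move on to the next file
    ActiveDocument.Save
    Application.Quit
End Sub
```

Application.Quit also takes an optional SaveChanges argument (for example wdSaveChanges), which may be an alternative to the explicit Save call.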
 
Thanks so much, macropod! Works great. Now my task that would
typically take about an hour runs automagically in less than 4 minutes.
The only thing I had to kludge was the file extension. Word likes to
munge .html files and make them unsuitable for simple text editing.
So, I rename all the *.html files to *.txt and then run the macros.



macropod said the following on 8/12/2009 6:04 AM:
 