How to extract text from text boxes

PT

I scanned an article to a PDF file. Then I used the Nuance program
“PDF Convert” to OCR the PDF and save it to a Word 2003 file.
Everything worked fine – the text transferred accurately.

Here’s the problem:

Each paragraph of the text is now enclosed in a separate text box,
making it impractical to edit.

Is there a simple way to extricate the text from the text boxes and
end up with a normal, God-fearing Word document?
 
If those are really frames rather than text boxes, there is a
RemoveFrames command listed under All Commands. The command can be added to
a toolbar, but if you need further instructions, we need to know which
version of Word you are using.
 
I did some checking. The box has the same borders as a text box.
It’s a crosshatched border with small circular “handles”.

But just in case, I accessed the “Remove Frames” command and put it on
the toolbar. But as soon as I click in a “frame”, the toolbar command
grays out.

So assuming the converter put each paragraph into a separate text box,
how can I remove the boxes, while retaining the enclosed text?
 
Round handles are text boxes; frames have square ones. You can convert the
text boxes to frames and then use Remove Frame. In fact, just pressing
Ctrl+Q will usually remove a frame, since the frame is unlikely to be
defined as part of the paragraph formatting. But you will also lose any
other directly applied paragraph formatting.

--
Suzanne S. Barnhill
Microsoft MVP (Word)
Words into Type
Fairhope, Alabama USA
http://word.mvps.org
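
With one text box per paragraph, doing the convert-then-remove steps above by hand gets tedious, so here is a minimal sketch of how they might be automated from Python. It assumes Windows, an installed copy of Word, and the pywin32 package; the function names and the file path argument are made up for illustration. Shapes, ConvertToFrame and Frame.Delete are members of Word's object model, and deleting a frame should, as far as I know, leave its text behind as ordinary paragraphs. Try it on a copy of the document first.

```python
MSO_TEXT_BOX = 17  # value of msoTextBox in Office's MsoShapeType enumeration


def text_boxes_to_plain_text(doc):
    """Convert every text box in a Word Document object to a frame, then
    delete the frame. Returns the number of text boxes processed."""
    converted = 0
    # Walk the collection backwards: converting a shape removes it from
    # doc.Shapes, which would shift the indexes of any later items.
    for i in range(doc.Shapes.Count, 0, -1):
        shape = doc.Shapes(i)
        if shape.Type == MSO_TEXT_BOX:
            frame = shape.ConvertToFrame()
            frame.Delete()  # assumed to drop the frame but keep its text
            converted += 1
    return converted


def convert_file(path):
    """Hypothetical driver: open the file in Word, convert, save, quit."""
    import win32com.client  # pywin32; needs Windows with Word installed

    word = win32com.client.Dispatch("Word.Application")
    try:
        doc = word.Documents.Open(path)
        try:
            count = text_boxes_to_plain_text(doc)
            doc.Save()
        finally:
            doc.Close()
    finally:
        word.Quit()
    return count
```

Calling convert_file with the full path of a copy of the converted document would run the whole thing. Only plain text boxes are touched; pictures and other shapes are left alone.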

Most OCR software makes a complete hash of converting to Word. Text boxes
and frames are typical examples. OK, you get a Word document, but the document
is not editable without a lot of work. FineReader 9 works better than most,
but perfect it isn't.

You might have been better scanning into Microsoft Office Document Imaging
(included with 2003, though not installed by default). This will not cause
the formatting issues, because there will be no formatting in the resulting
file, but its text reading ability is reasonably good.

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP

My web site www.gmayor.com

<>>< ><<> ><<> <>>< ><<> <>>< <>><<>