Hi All,
I have code that opens a Word document with a FileStream and reads its
contents.
I want to save the text as plain text to a database.
One routine decodes the file as UTF-8, but that doesn't always
work:
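' Needs Imports System.IO and Imports System.Text at the top of the file
Sub readdoc2(ByVal sPathName As String)
    Dim temp As UTF8Encoding = New UTF8Encoding(True)
    Dim fs As FileStream = File.OpenRead(sPathName)
    Dim b(1024) As Byte
    'Read returns the number of bytes actually placed in the buffer
    Dim n As Integer = fs.Read(b, 0, b.Length)
    Do While n > 0
        'decode only the bytes that were read, not the whole buffer
        Me.RichTextBox1.Text &= temp.GetString(b, 0, n)
        n = fs.Read(b, 0, b.Length)
    Loop
    fs.Close()
End Sub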
Some documents need my other code:
' Needs Imports System.IO at the top of the file
Sub readdoc(ByVal sPathName As String)
    Dim fs As FileStream = File.OpenRead(sPathName)
    'create a new StreamReader, passing the FileStream object fs as argument
    Dim d As New StreamReader(fs)
    'Seek moves the cursor to a position in the file; here, to the beginning
    d.BaseStream.Seek(0, SeekOrigin.Begin)
    'Peek returns the next character without consuming it, or -1 when
    'nothing is left to read
    While d.Peek() > -1
        'ReadLine strips the line terminator, so put one back
        Me.RichTextBox1.Text &= d.ReadLine() & Environment.NewLine
    End While
    d.Close()
End Sub
Either way, I end up with some strange characters, which I have to
remove before I can save the text to the database.
Is there a way to get the text from a document without having to
remove these unreadable characters?
Regards
Marco
The Netherlands