How to read a text file with wide characters?

I have a text file that contains wide characters, and I use the following C++
code to read them in. However, the wide characters are not read in properly.
What is wrong?

String* path = "C:\\Documents and Settings\\kst\\BE.dat";

try
{
    FileStream* fs = new FileStream(path, FileMode::Open);
    // No encoding is passed here, so StreamReader uses its default.
    StreamReader* sr = new StreamReader(fs);

    int count = 0;
    while (sr->Peek() >= 0)
    {
        count++;
        Debug::Write(__box(count));
        Debug::WriteLine(__box((Char)sr->Read()), " ");
    }
    sr->Close();
}
catch (Exception* e)
{
    // Report I/O failures instead of leaving the try block unmatched.
    Debug::WriteLine(e->Message);
}
 
StreamReader defaults to UTF-8 encoding. If the file is stored as UTF-16
(which .NET calls Unicode) or in any other encoding, you need to specify it
in the constructor. Example:

using namespace System::Text;
StreamReader* sr = new StreamReader(fs, Encoding::Unicode);

Also, why are you reading it char by char when you have the full
capabilities of the StreamReader at your disposal?
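
To see why the encoding matters, here is a minimal sketch in standard C++ (not .NET) of what a UTF-16 little-endian decoder does with the raw bytes. The function name decodeUtf16Le is hypothetical and only for illustration; it assumes well-formed input and is not how StreamReader is actually implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode little-endian UTF-16 bytes into code points.
// Assumes well-formed input; a trailing odd byte is ignored.
std::u32string decodeUtf16Le(const std::vector<std::uint8_t>& bytes) {
    std::u32string out;
    std::size_t i = 0;
    // Skip the byte order mark (FF FE) if the file starts with one.
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE) {
        i = 2;
    }
    while (i + 1 < bytes.size()) {
        // Each code unit is two bytes, low byte first.
        char32_t unit = static_cast<char32_t>(bytes[i] | (bytes[i + 1] << 8));
        i += 2;
        // A high surrogate followed by a low surrogate forms one code point.
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < bytes.size()) {
            char32_t low = static_cast<char32_t>(bytes[i] | (bytes[i + 1] << 8));
            if (low >= 0xDC00 && low <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        out.push_back(unit);
    }
    return out;
}
```

For the bytes FF FE 41 00 2D 4E this yields the two code points U+0041 and U+4E2D. A UTF-8 decoder looking at the same bytes interprets them by different rules, which is why the characters come out wrong when the encoding is not specified.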
 
What encoding should I use if the text file contains both ASCII (one byte)
and Chinese characters (two bytes)? I tried Unicode, BigEndianUnicode, ASCII,
UTF8, and UTF7 (the only choices I can see among the Encoding class members),
but none of them work.
 
You need to use Encoding::GetEncoding() to find the appropriate encoding.
The ASCII, UTF8, ... members of the Encoding class are just shortcuts for
the most commonly used encodings.

In your case, you probably want something like
Encoding::GetEncoding("big5").
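
A file that mixes one-byte ASCII with two-byte Chinese characters is typical of a double-byte code page such as Big5. As a rough sketch in standard C++ (not .NET), this is how a decoder could split such a byte stream into characters. The function name big5CharLengths is hypothetical, and the lead-byte range 0x81 to 0xFE is an assumption that covers common Big5 variants; trail bytes are not validated.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: return the byte length (1 or 2) of each character
// in a Big5-style byte sequence.
// Assumption: lead bytes lie in 0x81..0xFE; trail bytes are not checked.
std::vector<std::size_t> big5CharLengths(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::size_t> lengths;
    std::size_t i = 0;
    while (i < bytes.size()) {
        bool isLead = bytes[i] >= 0x81 && bytes[i] <= 0xFE;
        if (isLead && i + 1 < bytes.size()) {
            // Lead byte plus the following trail byte form one character.
            lengths.push_back(2);
            i += 2;
        } else {
            // ASCII byte, or a dangling lead byte at the end of the input.
            lengths.push_back(1);
            i += 1;
        }
    }
    return lengths;
}
```

For the bytes 41 A4 A4 42 this reports lengths 1, 2, 1: an ASCII character, one two-byte character, then another ASCII character. None of the Unicode encodings split the bytes this way, which is consistent with none of the built-in shortcuts working.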

-cd
 
Carl:
You are absolutely right. Thank you very much for your and Tomas's help.

Kueishiong Tu
 