ascii

  • Thread starter: Frank

Frank

Hi,
Which IO class or encoding writes an a + umlaut (ä) as a single character to a file?
It looks like StreamWriter doesn't; should I use BinaryWriter?
Thanks in advance
Frank
 
The umlaut character is a single-byte character, but strings in .NET are by default Unicode strings. Maybe try using the ASCII encoder (look at the ASCIIEncoding class) and then writing to the stream.

Rgds,
Anand
VB.NET MVP
http://www.dotnetindia.com
 
No, the ASCII encoding only covers 7 bits; for an a + umlaut you need 8 bits.
Frank
 
Looks like streamwriters don't, should I use binarywriter?

StreamWriter should work fine as long as you initialize it with the
appropriate Encoding.



Mattias
 
No, it doesn't. a + umlaut is written something like Ã".

So which encoding are you using, and which tool are you then checking
the result with?



Mattias
 
Frank said:
No, the ascii encoding only takes 7 bits, for a + umlaut you need 8 bits.
Frank

Depending on where you live, pass 'Encoding.Default' to the
'StreamWriter's constructor.
 
Frank said:
I tried several encoding settings in streamwriter. None gave me a+umlaut.

What application do you use to take a look at the file?
 
Frank,
In addition to the other comments.

Are you certain that you have an a + umlaut (ä) in your VB.NET program?

Remember that characters in .NET are Unicode (not ASCII or ANSI) which means
I would consider using ChrW to get an a + umlaut (ä) in my program,
something like:

Const umlautA As Char = ChrW(&HE4)


Then, using the correct encoding (more than likely Encoding.Default, or one
obtained from Encoding.GetEncoding()), the a + umlaut (ä) will be written to
the file correctly.

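' Requires Imports System.IO at the top of the file.
' Encoding.Default is the ANSI code page from the Windows regional settings.
Dim output As New StreamWriter("umlautA.txt", False, _
    System.Text.Encoding.Default)
output.Write("This is a ")
output.Write(umlautA)
output.Write(" an a + umlaut (ä)")
output.WriteLine()
output.Close()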

Also, as you pointed out, ASCII is 7 bits; you really want a specific ANSI
(8-bit) encoding in your file. Encoding.Default is the ANSI encoding for the
regional settings you have in Windows.

For information on ASCII, ANSI, Unicode and how Encoding works see:

http://www.yoda.arachsys.com/csharp/unicode.html

Hope this helps
Jay

 
Frank,
does it matter?
Yes it does!

If you are writing the file in Windows 1252 encoding, but then attempt to
read the file in the "OEM US encoding" an umlauted a will no longer be an
umlauted a.

Or worse, if you are writing it in UTF8 (the default for the StreamWriter)
and attempting to read it under Windows 1252, again an umlauted a will no
longer be an umlauted a.
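
The mismatch described above can be reproduced outside .NET. The following is only a sketch, written in Python so the bytes are easy to check; the codec names "utf-8", "cp1252" and "cp437" are assumed stand-ins for UTF8Encoding, Windows 1252 and the OEM US code page, and the helper name reinterpret is made up for this illustration.

```python
# Illustrative only: Python codec names stand in for the .NET encodings
# discussed above ("utf-8" for UTF8Encoding, "cp1252" for Windows 1252,
# "cp437" for the OEM US code page).

def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)

# Written as UTF-8 (two bytes), read back as Windows 1252 (two characters).
utf8_read_as_1252 = reinterpret("ä", "utf-8", "cp1252")

# Written as Windows 1252 (one byte), read back as OEM US.
win1252_read_as_oem = reinterpret("ä", "cp1252", "cp437")

# Written and read with the same encoding: the character survives.
round_trip = reinterpret("ä", "cp1252", "cp1252")

print(repr(utf8_read_as_1252))
print(repr(win1252_read_as_oem))
print(repr(round_trip))
```

Only the round trip through a single encoding gives the umlauted a back; the other two cases produce different characters from the same bytes.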

Hope this helps
Jay

 
Hi Frank,

I did not check it at all; the others are already far along in helping you,
so here is some extra information for them.

You know that half of the computers in Holland are installed with code page
850 and the other half with 437 (called US, but exactly what we need in
Holland).

I don't think it has anything to do with your problem, but you never know;
it is not my favorite subject.

Cor
 
Cor,
Those sound like DOS or OEM code pages, which normally are not the same as
the Windows code page...

850 = Western European (DOS) encoding
437 = OEM United States encoding

1252 = Western European (Windows)

You can use System.Text.Encoding.Default to see what your Windows encoding is
going to be; I suspect it will be 1252 - Western European (Windows).

You can use
System.Text.Encoding.GetEncoding(System.Globalization.CultureInfo.CurrentCulture.TextInfo.OEMCodePage)
to get your DOS or OEM encoding, which in Holland I would expect to be 850
or 437 based on what you stated. In the US I would expect 437.

DOS versus Windows code page, in addition to what you stated, is what I
suspect to be the root of Frank's problem...
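
To make the DOS versus Windows difference concrete, here is a small sketch in Python (chosen only because the result is easy to verify; it is not VB.NET). It assumes Python's "cp437", "cp850" and "cp1252" codecs correspond to the three code pages listed above, and the helper name byte_for is made up for this example.

```python
# Illustrative only: look up which single byte each code page uses for "ä".
CODE_PAGES = {
    "cp437": "OEM United States",
    "cp850": "Western European (DOS)",
    "cp1252": "Western European (Windows)",
}

def byte_for(ch: str, codec: str) -> int:
    """Return the single byte that codec uses for ch."""
    encoded = ch.encode(codec)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {codec}")
    return encoded[0]

table = {codec: byte_for("ä", codec) for codec in CODE_PAGES}

for codec, name in CODE_PAGES.items():
    print(f"{codec:7} {name:28} 0x{table[codec]:02X}")

# The Windows byte for "ä", shown by a viewer that assumes a DOS code page.
windows_byte_seen_in_dos = bytes([table["cp1252"]]).decode("cp850")
print(repr(windows_byte_seen_in_dos))
```

Both DOS code pages use one byte value for ä and Windows 1252 uses another, so a file written with one family and viewed with the other shows a different character.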

Just a thought
Jay
 
Doh!
You can use System.Text.Encoding.Default to see what your Windows encoding is
going to be, I suspect it will be 1252 - Western European (Windows).
I would expect it to be 1252 in the US & Holland, as well as most of Europe.

Jay
 
Frank said:
does it matter? I checked with a hexeditor.

Yes, it matters, because what text is displayed depends on how the bytes
containing the data are interpreted. If the hex editor interprets every
byte (...) as an ASCII character, then you won't see any umlauts. Do the
umlauts show up if you load the file in Notepad?
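
As a rough illustration of what a hex editor would show, the sketch below (Python, not VB.NET, purely so it can be run and checked) writes ä to a temporary file with three encodings and prints the raw bytes. The helper name bytes_on_disk is made up for this example, and errors="replace" is an assumption chosen to mimic an ASCII encoder that substitutes "?" for characters it cannot represent.

```python
import os
import tempfile

def bytes_on_disk(text: str, encoding: str) -> str:
    """Write text to a temporary file with the given encoding and
    return the raw file content as space-separated hex bytes."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # errors="replace" substitutes "?" for unencodable characters.
        with open(path, "w", encoding=encoding, errors="replace", newline="") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read().hex(" ")
    finally:
        os.remove(path)

for name in ("ascii", "cp1252", "utf-8"):
    print(f"{name:7} {bytes_on_disk('ä', name)}")
```

Under these assumptions only the 8-bit Windows code page puts the single byte E4 in the file; the ASCII case stores a question mark and the UTF-8 case stores two bytes.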
 
Yes,
Encoding.Default works. I misinterpreted the meaning of 'default', and the
VB documentation doesn't mention it in the remarks of the decoding class.
Thanks all


 
Hi Herfried,

Are you living in Western Europe? I thought that was the Benelux, France,
western Scandinavia, Eire, the UK, Spain and Portugal.

However, things can change. Even for Orson Welles you were living deep in
Central Europe.

And in case you doubt me: it is 1252 here too. However, whenever I see
problems like Frank's, I keep thinking of the problems I had with the 850
and 437 code pages here in the past.

:-)

Cor
 
Frank,
I misinterpreted the meaning of 'default'.
Understandable. Unfortunately it is confusing...

The "default" for StreamReader & StreamWriter is UTF8Encoding, which works
well for ASP.NET apps where you don't know the locale of the requestor.

Encoding.Default, on the other hand, is your Windows code page, which works
"better" for Windows Forms, Windows Services & Console applications,
especially if you open the file in Notepad.

Hope this helps
Jay
 