Hello,
I'm writing a class library that I imagine people from different countries
might be interested in using, so I'm considering what needs to be provided
to support foreign languages, including Asian languages (Chinese, Japanese,
Korean, etc.).
First of all, strings will be passed to my class methods, and depending on
the language (and on the encoding) some of them might contain characters that
require more than a single byte.
Since I have to cycle through each byte of each char of an input string,
how does .NET guarantee that the string is broken up correctly into its
constituent chars based on the string's language? In other words, how does
.NET identify the correct "boundary" for each char (which bytes are part of
each char) based on the string's language? Also, what encoding is used when
strings are initially loaded into memory? Does this encoding depend
on the culture set for the current thread, or does it depend on the
system's current ANSI code page? Is there a way to set
the encoding that .NET should use for strings so that, when cycling
through the characters in a string, bytes are correctly assigned to each
char based on the string's language?
Regards,
Bob Rock