Johannes said:
Is it correct that Unicode characters with code points above 0x10FFFF
are not supported by C#?
Which code points are those? You'll have a harder time supporting
characters over 0xffff in .NET as you need surrogate pairs, etc, but I
*thought* everything was within 0-0x10ffff still. (That does, after
all, give a pretty huge scope.) Has that situation changed?
I have a hard time believing this since it would eliminate some Asian
languages. If it is true, is there a workaround? Do other .NET
languages support code points > 0x10FFFF?
It's not really a language issue - .NET itself represents the character
type as a 16-bit entity, so to display Unicode characters outside plane
0 you need to use surrogates and check that whatever you're using to
display them (etc) supports surrogates properly. C# has the \U (as
opposed to \u) escaping for characters above 0xffff, within strings -
and those are then represented as a surrogate pair. That's the only
specific language support I know of in C# for characters outside plane
0, but I would imagine it's probably enough. Most of the work needs to
be done by .NET itself.
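
As a rough illustration of the \U escape and the surrogate pair it
produces, here is a minimal sketch. It assumes C# 9 or later (for
top-level statements), and the code point U+1D11E (MUSICAL SYMBOL G CLEF)
is just an example chosen for illustration, not one from the question.
The variable names are likewise hypothetical.

```csharp
using System;

// U+1D11E lies outside plane 0, so the \U escape (eight hex digits)
// is stored as two UTF-16 code units: a high and a low surrogate.
string clef = "\U0001D11E";

Console.WriteLine(clef.Length);                     // number of 16-bit chars, not characters
Console.WriteLine(char.IsHighSurrogate(clef[0]));   // first unit is the high surrogate
Console.WriteLine(char.IsLowSurrogate(clef[1]));    // second unit is the low surrogate

// Recover the full code point from the pair starting at index 0.
int codePoint = char.ConvertToUtf32(clef, 0);
Console.WriteLine($"U+{codePoint:X}");

// Go the other way: build the string from the numeric code point.
string roundTrip = char.ConvertFromUtf32(0x1D11E);
Console.WriteLine(roundTrip == clef);

// Values above 0x10FFFF are not valid code points; the framework rejects them.
bool rejected = false;
try
{
    char.ConvertFromUtf32(0x110000);
}
catch (ArgumentOutOfRangeException)
{
    rejected = true;
}
Console.WriteLine(rejected);
```

Note that clef.Length counts 16-bit code units, so any code that indexes
or truncates strings needs to avoid splitting a pair. Whether the
character actually renders still depends on the font and the display
layer, as said above.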