Chris Mullins
I'm implementing RFC 3491 in .NET, and running into a strange issue.
Step 1 of RFC 3491 is performing a set of mappings dictated by tables B.1 and
B.2.
I'm having trouble with the following mappings though, and it seems like a
shortcoming of the .NET framework:
When I see Unicode value 0x10400, I'm supposed to map it to value 0x10428.
This list goes on (the left column is the existing value, the right column
is the replacement value):
(values are in HEX)
10400; 10428; Case map
10401; 10429; Case map
10402; 1042A; Case map
10403; 1042B; Case map
10404; 1042C; Case map
10405; 1042D; Case map
10406; 1042E; Case map
10407; 1042F; Case map
10408; 10430; Case map
(... and on for another few thousand lines...)
I've got the strings loaded into a StringBuilder, and am iterating through
it one character at a time, and comparing the character value to the mapping
values. The problem is that a Char cannot hold a value greater than
0xFFFF, yet both the UTF-8 and UTF-16 encodings of Unicode 3.2 allow for
values larger than 0xFFFF.
Is there a workaround to this approach that I can use, or do I have to
convert everything to Bytes and do this the hard way?
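For reference, here is a minimal sketch of one possible workaround that stays in strings rather than bytes. It assumes the text is held as UTF-16, where a value above 0xFFFF is stored as two Chars (a surrogate pair; 0x10400 becomes 0xD801 0xDC00), and it uses char.ConvertToUtf32 and char.ConvertFromUtf32 from .NET 2.0 or later. The CodePointMapper class, its Map table, and the Apply method are hypothetical names made up for illustration. The table holds only three of the rows listed above, and the sketch handles only one-to-one mappings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the .NET framework.
public static class CodePointMapper
{
    // Illustrative excerpt of the mapping table, keyed by full code point.
    public static readonly Dictionary<int, int> Map = new Dictionary<int, int>
    {
        { 0x10400, 0x10428 },
        { 0x10401, 0x10429 },
        { 0x10402, 0x1042A },
    };

    public static string Apply(string input)
    {
        StringBuilder output = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            int codePoint;
            if (char.IsHighSurrogate(input[i])
                && i + 1 < input.Length
                && char.IsLowSurrogate(input[i + 1]))
            {
                // Combine the two UTF-16 code units into one code point.
                codePoint = char.ConvertToUtf32(input[i], input[i + 1]);
                i++; // the low surrogate has been consumed as well
            }
            else
            {
                // BMP character (or an unpaired surrogate, passed through as-is).
                codePoint = input[i];
            }

            int mapped;
            if (!Map.TryGetValue(codePoint, out mapped))
            {
                mapped = codePoint; // no table entry: keep the original value
            }

            if (mapped > 0xFFFF)
            {
                // ConvertFromUtf32 returns a two-Char surrogate pair here.
                output.Append(char.ConvertFromUtf32(mapped));
            }
            else
            {
                output.Append((char)mapped);
            }
        }
        return output.ToString();
    }
}
```

Under these assumptions, CodePointMapper.Apply("\U00010400") would return a two-Char string holding 0xD801 0xDC28, which is the UTF-16 form of 0x10428.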