Convert text encoded with character references (&#123;) to Unicode or UTF-8

  • Thread starter: Daniel Köster

Daniel Köster

Does anyone have tips on how to convert text encoded with
character references (&#123;) to Unicode or UTF-8 format using VB.NET? Is
there a function or something that can help with the conversion?

A simple replace of "this" with "that" is not an option, since there are
some Asian texts that I need to convert as well (Chinese, Thai and
Japanese; the replace list would be too large to handle).

What I want to do is compare a file encoded with character
references (e.g. &#123;) with a file encoded with normal Unicode characters
(e.g. ö, ä, å).

Best regards
Daniel
 
Just do "normal" parsing to find the &# to start with, then use
Substring (or whatever) to get the xxx bit, parse it as an integer
(Int32.Parse or Convert.ToInt32) and cast the result to a character.
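
A rough sketch of that approach, in case it helps. It assumes the references are decimal only (&#246; style, not &#xF6;) and uses a regular expression in place of hand-written Substring calls. The module and function names (CharRefDecoder, DecodeDecimalRefs, ReplaceMatch) are made up for illustration.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CharRefDecoder
    ' Matches decimal references such as &#246; and captures the digits.
    ' Hexadecimal references (&#xF6;) are deliberately not handled here.
    Private ReadOnly DecimalRef As New Regex("&#([0-9]+);")

    ' Returns the input with every valid decimal reference replaced
    ' by the character it stands for.
    Function DecodeDecimalRefs(ByVal input As String) As String
        Return DecimalRef.Replace(input, AddressOf ReplaceMatch)
    End Function

    Private Function ReplaceMatch(ByVal m As Match) As String
        Dim codePoint As Integer
        ' TryParse fails on overflow; leave such references untouched.
        If Not Integer.TryParse(m.Groups(1).Value, NumberStyles.None,
                                CultureInfo.InvariantCulture, codePoint) Then
            Return m.Value
        End If
        ' Skip surrogate values and anything beyond the Unicode range,
        ' which Char.ConvertFromUtf32 would reject with an exception.
        If codePoint > &H10FFFF OrElse
           (codePoint >= &HD800 AndAlso codePoint <= &HDFFF) Then
            Return m.Value
        End If
        Return Char.ConvertFromUtf32(codePoint)
    End Function

    Sub Main()
        ' Example call; how the result is displayed depends on the console encoding.
        Console.WriteLine(DecodeDecimalRefs("K&#246;ster"))
    End Sub
End Module
```

ChrW(codePoint) is the direct VB.NET way to "cast to a character" and is fine for values up to 65535. Char.ConvertFromUtf32 is used here instead so that code points above U+FFFF, which need two Chars, also come out right. Once both files are plain strings, an ordinal String.Equals comparison should be enough.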
 
HttpUtility.HtmlDecode
HttpUtility.HtmlEncode
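
A minimal usage sketch for the decode direction, assuming the project has a reference to the System.Web assembly (where HttpUtility lives). The module name and the sample string are only for illustration.

```vbnet
Imports System
Imports System.Web

Module HtmlDecodeExample
    Sub Main()
        ' Text as it might appear in the file that uses character references.
        Dim encoded As String = "K&#246;ster"

        ' HtmlDecode turns numeric references (and named entities such as &amp;)
        ' into the characters they represent.
        Dim decoded As String = HttpUtility.HtmlDecode(encoded)

        ' The same text built with a real ö (U+00F6), as it would come
        ' from the file stored in plain Unicode.
        Dim plain As String = "K" & ChrW(246) & "ster"

        ' Ordinal comparison of the decoded text against the plain text.
        Console.WriteLine(String.Equals(decoded, plain, StringComparison.Ordinal))
    End Sub
End Module
```

Note that HtmlDecode also converts named entities such as &amp; and &lt;, so it changes more than numeric references if the file contains those.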
 