Displaying international characters

  • Thread starter: Nightcrawler

Nightcrawler

I have a website that does the following:

1. it accepts a keyword through a textbox in the UI
2. once the submit button is clicked it goes out and spiders a few
websites using the keyword supplied
3. it converts the returned html to xml
4. it uses LINQ to query the html page and stores the results in a
database table
5. it then pulls the results from the database using a LINQ query and
displays them on a webpage

The problem I am facing is that I am not able to display international
characters (characters not in the English alphabet) in step 5.

When I spider the pages in step 2 I use Encoding.UTF7. Using UTF7
successfully allows me to capture international characters. When I
commit the data to the database in step 4 I can see that the data is
stored and displayed correctly when I view the table in Enterprise
Manager.

When I query the database for the results in step 5 and try to display
them on an aspx page, they show up as ù

What am I doing wrong?
 
Hi,

I suspect the UTF-7 text isn't being decoded properly when it is read from
the database or before it is displayed on the web page. As UTF-7 poses a
security risk, you should consider using UTF-8 or another encoding instead.
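
For the spidering step, a rough sketch of what "another encoding" could look like: decode the downloaded bytes with whatever charset the page declares, and fall back to UTF-8 otherwise. The class name PageDecoding and the declaredCharset parameter are made up for illustration. How you obtain the raw bytes and the charset (for example from the response's Content-Type header) depends on your spider, so treat this as an assumption-laden starting point rather than a drop-in fix.

```csharp
using System;
using System.Text;

public static class PageDecoding
{
    // Hypothetical helper: decodes raw response bytes with the charset the
    // server declared, falling back to UTF-8 when none is given or the
    // charset name is not recognised.
    public static string Decode(byte[] body, string declaredCharset)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrEmpty(declaredCharset))
        {
            try
            {
                encoding = Encoding.GetEncoding(declaredCharset.Trim());
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the UTF-8 fallback.
            }
        }
        return encoding.GetString(body);
    }
}
```

Once the text is a .NET string, the encoding it arrived in no longer matters. It only comes into play again when the string is turned back into bytes, for example when it is stored or written to the response.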
 
How do I format text when reading from a database? I use a simple LINQ
query to pull the data through a stored procedure. Can you point me in
the right direction as to how to do that?

Much appreciated.

Thanks
 
You may need to convert the text you get to another encoding. To
demonstrate the problem, the code below writes "2 + 2 = 4" to a file as
UTF-7. The default reading method, however, uses UTF-8, so you end up
with "2 +- 2 +AD0- 4". The last two lines convert it back.

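// Requires: using System.IO; using System.Text;
// Write the text as UTF-7, then read it back with the default (UTF-8) decoding.
File.WriteAllText("C:\\Temp\\utf7.txt", "2 + 2 = 4", Encoding.UTF7);
string utf7Text = File.ReadAllText("C:\\Temp\\utf7.txt"); // "2 +- 2 +AD0- 4"

// Recover the original bytes and decode them as UTF-7.
byte[] data = Encoding.UTF8.GetBytes(utf7Text);
string unicode = Encoding.UTF7.GetString(data); // "2 + 2 = 4"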
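
The same round trip works for a non-ASCII character. As an illustration only, the sketch below uses UTF-8 and Latin-1 (iso-8859-1) rather than UTF-7; whether that pair matches what happens on your page is an assumption, and the MojibakeDemo class and its method names are invented for the example. It shows how decoding bytes with the wrong encoding garbles a character, and how going back to the bytes and decoding them with the right one repairs it.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Simulates the bug: UTF-8 bytes decoded with the wrong (Latin-1) encoding.
    public static string Garble(string text)
    {
        return Latin1.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    public static string Repair(string garbled)
    {
        return Encoding.UTF8.GetString(Latin1.GetBytes(garbled));
    }
}
```

MojibakeDemo.Garble("ù") returns the two characters "Ã¹", and MojibakeDemo.Repair applied to that result gives "ù" back. The repair only works while no bytes have been lost, so it is better to decode correctly in the first place than to patch the text afterwards.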
 