German "Umlaute" in QueryString

  • Thread starter: Uwe Braunholz

Uwe Braunholz

Hello,

while working on an ASP.NET website I ran into a strange problem.

I want to let my users pass a search string via the query string
of a URL.

It works if the user calls the URL like "default.aspx?search=mystring".
But as soon as the query string contains a German umlaut, the character
is not handled correctly ("default.aspx?search=müstring" is
displayed as "m?string" on the page).

I noticed that Firefox automatically converts the URL to:
?search=m%FCstring
The debugger then shows:
Request.QueryString = {search=m%ufffdstring}
and the label displays:
m�string

If I call the page from another ASP.NET page using
Response.Redirect("Default.aspx?search=" +
Server.UrlEncode("müstring"));
the result is correct, but the URL is changed to:
?search=m%c3%bcstring
and the debugger shows:
Request.QueryString = {search=m%u00fcstring}

The problem is that users will call the URL from different
applications (not necessarily .NET), so Server.UrlEncode will never be
called and the first case will occur.

How can I solve this?

Thank you!
Regards,
Uwe
 
The query string needs to be treated as Unicode. Look at requestEncoding,
responseEncoding and the related settings documented on MSDN under the
globalization node of the web.config file. That may shed some light.


 
Hello Sriram,
thank you for your answer. I thought so too, but the default seems to
be UTF-8. If I set
<globalization requestEncoding="UTF-8" responseEncoding="UTF-8" />
in web.config, nothing changes. Something only changes if I set a
different encoding such as "ISO-8859-1". But what if another user enters
Greek, Russian or Chinese characters?

I think tying my application to one specific encoding is not a good
idea. I thought UTF-8 would be the right choice for that?

Regards,
Uwe
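
The behaviour described above is consistent with the byte 0xFC being decoded as UTF-8: %FC is "ü" in ISO-8859-1, but a lone 0xFC is not a valid UTF-8 sequence, so a UTF-8 decoder substitutes U+FFFD (the %ufffd shown in the debugger). A minimal sketch of that effect, using HttpUtility.UrlDecode outside of any request; the names DecodeDemo and Decode are made up for illustration:

```csharp
using System.Text;
using System.Web;

public static class DecodeDemo
{
    // Percent-decodes a query value and interprets the resulting bytes
    // in the charset with the given name (e.g. "UTF-8" or "ISO-8859-1").
    public static string Decode(string encoded, string charsetName)
    {
        Encoding charset = Encoding.GetEncoding(charsetName);
        return HttpUtility.UrlDecode(encoded, charset);
    }
}
```

Under these assumptions, DecodeDemo.Decode("m%FCstring", "ISO-8859-1") and DecodeDemo.Decode("m%C3%BCstring", "UTF-8") both give "müstring", while DecodeDemo.Decode("m%FCstring", "UTF-8") contains the replacement character U+FFFD instead of the umlaut.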
 

Hello Uwe,

in general, all umlauts must be percent-encoded (as UTF-8 in your case).

So, if your search string comes from the search textbox, you have
to encode it like this:

string search = HttpUtility.UrlEncode(SearchTextBox.Text, Encoding.UTF8);
Response.Redirect("default.aspx?search=" + search);

This redirects you to default.aspx?search=m%c3%bcstring

In this case you should have no problem getting the value with the
HttpUtility.UrlDecode() function.

If you want to get the umlauts out of "default.aspx?search=müstring",
use RawUrl:

string search = Request.RawUrl.Split('=')[1];

search should then contain "müstring".

To handle both requests

default.aspx?search=m%c3%bcstring
default.aspx?search=müstring

use

// Take the raw value after '=' and percent-decode it.
string search = Request.RawUrl.Split('=')[1];
string newsearch = HttpUtility.UrlDecode(search);
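
Note that Split('=')[1] only works while search is the only parameter in the URL. A slightly more defensive sketch for pulling one still-encoded value out of a raw URL follows; RawQuery and GetRawValue are hypothetical names, and whether Request.RawUrl still carries the original escapes depends on the server and the client:

```csharp
using System;

public static class RawQuery
{
    // Returns the still-encoded value of the named query parameter,
    // an empty string if the parameter has no value, or null if it is absent.
    public static string GetRawValue(string rawUrl, string name)
    {
        int questionMark = rawUrl.IndexOf('?');
        if (questionMark < 0)
        {
            return null; // no query string at all
        }

        string query = rawUrl.Substring(questionMark + 1);
        foreach (string pair in query.Split('&'))
        {
            int equals = pair.IndexOf('=');
            string key = equals < 0 ? pair : pair.Substring(0, equals);
            if (string.Equals(key, name, StringComparison.OrdinalIgnoreCase))
            {
                return equals < 0 ? string.Empty : pair.Substring(equals + 1);
            }
        }

        return null; // parameter not present
    }
}
```

For example, RawQuery.GetRawValue("/WebSite2/Default.aspx?search=m%FCstring&page=2", "search") returns "m%FCstring", which can then be decoded with whichever charset is appropriate.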
 
Hello Alexey,

thank you for your reply. Unfortunately even RawUrl does not give
me the real (intended) URL string. There is still a ? instead of the
"ü".
The debugger says:
Request.QueryString = {search=m%ufffdstring}
Request.RawUrl = "/WebSite2/Default.aspx?search=m�string"

The problem is that I do not have any control over the source of the
search. It could be any application that puts the URL together
automatically.

Regards,
Uwe


Well, but where did you get the "m%ufffdstring" from?

%ufffd is wrong there. You can check this simply by typing "müstring"
into Google. When UTF-8 is used you will see that the URL is encoded as
"m%C3%BCstring"

http://www.google.com/search?hl=en&q=müstring

and not

http://www.google.com/search?hl=en&q=m%ufffdstring
 
Hello Alexey,

the "m%ufffdstring" appears when I hover over the variable with the
mouse in Visual Studio 2005.

Regards,
Uwe
 
OK, I think the problem is related to this: http://www.captain.at/howto-php-urlencode-javascript-decodeURICompone...

Assuming I will never be able to pass a plain ü to the server by
simply putting it into the URL:
Server.UrlEncode converts ü to %C3%BC,
but my browsers convert ü to %FC.
Why is this? How do I know what I will get on the server side?

Regards,
Uwe
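
The two results differ because the same character becomes different bytes depending on the charset chosen before percent-encoding: "ü" (U+00FC) is the single byte FC in ISO-8859-1 and the two bytes C3 BC in UTF-8. A small sketch that makes this visible; EncodeDemo and PercentEncodeAll are illustrative names, and the helper escapes every byte, which real URL encoders do not do for unreserved ASCII characters:

```csharp
using System.Text;

public static class EncodeDemo
{
    // Converts the text to bytes in the named charset and writes
    // every byte as a %XX escape (upper-case hex).
    public static string PercentEncodeAll(string text, string charsetName)
    {
        byte[] bytes = Encoding.GetEncoding(charsetName).GetBytes(text);
        var result = new StringBuilder();
        foreach (byte b in bytes)
        {
            result.Append('%').Append(b.ToString("X2"));
        }
        return result.ToString();
    }
}
```

EncodeDemo.PercentEncodeAll("\u00FC", "UTF-8") returns "%C3%BC", and EncodeDemo.PercentEncodeAll("\u00FC", "ISO-8859-1") returns "%FC". The server can only recover the umlaut if it decodes with the same charset the client used for encoding.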







Uwe,

you used Firefox 2 to get the "ü" converted into "%FC", right? FF2
sends it in ISO-8859-1 by default, while FF3 and IE send it in UTF-8,
which means that the "ü" fails only in FF2. But in FF2 you
can change this behavior too. Go to about:config and set the
"network.standard-url.encode-utf8" parameter to true.

If you want umlauts to work in all browsers, I guess you would
need to check whether a letter is encoded with 1 or 2 bytes. But the
easiest way is probably to use a function like the one below:

string ConvertLetters(string s)
{
    // Percent-encoded forms and, at the same index, the letter each one stands for.
    string[] oldChars = new string[] { "%FC", "%C3%BC" /* ... */ };
    string[] newChars = new string[] { "ü", "ü", "ä", "ö" /* ... */ };

    // Replace every known encoded form with its letter.
    for (int i = 0; i < oldChars.Length; i++)
    {
        s = s.Replace(oldChars[i], newChars[i]);
    }

    return s;
}
 
Hello Alexey,

thanks for keeping up with this!
I am using FF3, but the setting was false, so I assume the default
behavior did not change. Once I switched it to true, it worked.
On my test machine I also use IE6, and it does not seem to use
UTF-8 either, because the result is the same.

How can I check whether the string is encoded with one or two bytes?

Regards,
uwe



Well, I think you would need to check whether the query string contains
%C3%XX (UTF-8 encoded) or just %XX.

Sample code:

// Example input; the UTF-8 form would be "m%C3%BCstring".
string s = "m%FCstring";

// "%C3" is the first byte of the UTF-8 form of the umlauts; otherwise assume ISO-8859-1.
if (s.IndexOf("%C3", StringComparison.InvariantCultureIgnoreCase) > -1)
    s = HttpUtility.UrlDecode(s, Encoding.UTF8);
else
    s = HttpUtility.UrlDecode(s, Encoding.GetEncoding("ISO-8859-1"));

Hope this helps,

Tch%C3%BCss :-)
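
Looking for "%C3" only covers characters whose UTF-8 form starts with that byte. One alternative, offered as a sketch rather than a verified fix for the problem in this thread: percent-decode to raw bytes, try a strict UTF-8 decode, and fall back to ISO-8859-1 if the bytes are not valid UTF-8. It assumes the value is still percent-encoded when the code sees it; QueryValueDecoder is a hypothetical name:

```csharp
using System.Text;
using System.Web;

public static class QueryValueDecoder
{
    // UTF-8 decoder that throws on invalid byte sequences
    // instead of silently substituting U+FFFD.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    public static string Decode(string percentEncoded)
    {
        if (string.IsNullOrEmpty(percentEncoded))
        {
            return string.Empty;
        }

        // Turn %XX escapes into raw bytes without committing to a charset yet.
        byte[] raw = HttpUtility.UrlDecodeToBytes(percentEncoded, Encoding.UTF8);
        try
        {
            return StrictUtf8.GetString(raw);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: assume the client used ISO-8859-1.
            return Latin1.GetString(raw);
        }
    }
}
```

With this sketch both QueryValueDecoder.Decode("m%C3%BCstring") and QueryValueDecoder.Decode("m%FCstring") give "müstring". The heuristic can still misfire: an ISO-8859-1 byte sequence that happens to be valid UTF-8 is read as UTF-8, and clients using other single-byte charsets are not covered.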
 
Hello Alexey,

thank you once more!
Unfortunately, UrlDecode with ISO-8859-1 does not produce a usable
character either, only the small "character not displayable" box.
I wonder whether this is related to the debug output, which shows
Request.QueryString = {search=m%ufffdstring} instead of something like
the m%FCstring shown in the browser's address bar.

As I seem to be the only one on the web experiencing this, I will
leave it there. I have no idea why such a problem can exist at all.

Regards,
uwe
 