vanderghast said:
I have heard of such characters (Swedish/Finnish come to mind, but I could
be wrong), but fortunately, for now, they are not part of the targeted
"cultures".
I'm not sure you're getting what I mean. For example: the letter 'é' can
be represented as a plain e followed by a combining accent (in UTF-16,
0x0065 followed by 0x0301) or as a single Unicode character (in UTF-16,
0x00e9).
It's not a language-specific issue.
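To make the two representations concrete, here is a minimal sketch using String.Normalize from the .NET Framework. The class and method names (AccentForms, SameCodeUnits, SameAfterNormalization) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
static class AccentForms
{
    // U+00E9: precomposed e-acute, a single UTF-16 code unit.
    public const string Composed = "\u00e9";

    // U+0065 U+0301: plain 'e' followed by a combining acute accent.
    public const string Decomposed = "\u0065\u0301";

    // True when both strings hold exactly the same UTF-16 code units.
    public static bool SameCodeUnits(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // True when both strings are identical once each has been
    // normalized to Unicode normalization form C (composed).
    public static bool SameAfterNormalization(string a, string b)
    {
        return SameCodeUnits(a.Normalize(NormalizationForm.FormC),
                             b.Normalize(NormalizationForm.FormC));
    }
}
```

With these helpers, the two spellings of 'é' differ code unit by code unit but compare equal after normalization, which is why the distinction matters for any ordinal comparison regardless of language or culture.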
That said, I wrote a quick test (see below), and discovered that the
IgnoreNonSpace option actually does more work than the documentation
describes. In particular, it appears to actually handle the situation
you're specifically dealing with, by treating \u00e9 as the same as \u0065
(in addition to ignoring the non-space \u0301 character, had I included
that).
Whether it's safe to rely on this undocumented behavior, I'm not entirely
sure. However, Microsoft has always held backward-compatibility as a high
priority, and even if the current behavior was eventually deemed incorrect
according to their specification, I'd be really surprised if they changed
it due to the potential of breaking lots of existing code. It's probably
more likely they'd update the specification and documentation.
So, in other words, ignore what I wrote about the difference between
combining characters and individual accented characters. I mean, don't
ignore the specific data, but do ignore my conclusion based on the data as
it applies to your string comparison scenario.
Pete
using System;
using System.Globalization;

namespace TestAccentCompare
{
    class Program
    {
        static void Main(string[] args)
        {
            // Precomposed e-acute (U+00E9) versus plain 'e' (U+0065).
            string strAccented = "\u00e9", strPlain = "\u0065";

            // Baseline: culture-sensitive comparison with no options.
            Console.WriteLine("'{0}' == '{1}': {2} (CompareOptions.None)",
                strAccented, strPlain,
                String.Compare(strAccented, strPlain,
                    CultureInfo.CurrentCulture, CompareOptions.None) == 0);

            // Same comparison, ignoring non-spacing (combining) characters.
            Console.WriteLine("'{0}' == '{1}': {2} (CompareOptions.IgnoreNonSpace)",
                strAccented, strPlain,
                String.Compare(strAccented, strPlain,
                    CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace) == 0);

            // Keep the console window open until Enter is pressed.
            Console.ReadLine();
        }
    }
}
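
The test above leaves out the decomposed form (plain e followed by \u0301). A minimal sketch that covers it as well might look like the following. The AccentCompare class and EqualIgnoringAccents method are hypothetical names, and using CultureInfo.InvariantCulture instead of CurrentCulture is an assumption made here so the result does not depend on the machine's settings.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
static class AccentCompare
{
    // Culture-sensitive comparison that ignores non-spacing
    // (combining) characters such as U+0301.
    // Assumption: InvariantCulture is used rather than CurrentCulture.
    public static bool EqualIgnoringAccents(string a, string b)
    {
        return string.Compare(a, b, CultureInfo.InvariantCulture,
                              CompareOptions.IgnoreNonSpace) == 0;
    }
}
```

Passing "\u00e9" and "e" exercises the undocumented behavior described above, while passing "e\u0301" and "e" exercises the documented one. Results may differ on a runtime configured without culture data.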