Not really a waste of memory cells at all. In fact, if you expect to be
dealing with HTML, you owe it to yourself and others to commit this kind
of knowledge to your memory cells.
Do you really think that how a browser interprets a character
entity, and how it decides whether to print it, is something
people need to know every time they touch HTML? I believe in the basic
principle that no one should need to know _everything_ about a tool in
order to use it. This forum really is proof that you _don't_ need to
know everything about C# in order to program in it. If you did need to
know everything, only people like yourself would even stand a chance
of using it. The only thing people like myself require is a place to
find answers to those questions on that rare occasion when things
aren't black and white. Hence documentation and forums like this one.
I think my understanding of HTML is thorough, but far from complete.
Most programmers with a few web applications under their belts should
at least know about HTML and probably CSS. But you also have to
realize that a person programming in ASP.NET for years may be capable
of getting along using Visual Studio's designer and never once worry
about HTML tags. That is a good thing, IMO. Sure, when that
rare occasion comes up, that person may be left in the dark. But why
learn HTML until you need to? Who knows, in 20 years, knowing HTML
will be like knowing system calls in most programs today - most people
never care to look at code at that level.
I don't consider web development as my focus. I work mostly on Windows
Forms applications, database-driven server applications and systems
architecture. Sure, I have made some pretty large web applications
recently, but ASP.NET hid most of the HTML from me. I leave user
interfaces up to other developers who are specialized in user
interface beautification. If every programmer on the team needs to
know how to do every task perfectly, you're not being efficient.
Unfortunately, my lead and I are the only programmers at work, since
my peer left a few months ago. He was the artist; I believe I am best
utilized elsewhere.
This isn't an issue about "how browsers interpret Unicode". It's an issue
about what putting "&nbsp;", or any character entity, into your HTML
_really_ does, as well as what is the definition of "white space" in the
context of HTML.
The named character entities really are just an expression interpreted as a
specific character. Thus the name, "character entities". When you use a
named character entity like that, you are in effect saying "put character
<foo> here", where <foo> is the code point for the entity (regardless
of character encoding). The reason &nbsp; doesn't disappear as white
space isn't because you wrote it "&nbsp;", it's because that specific
character, '\u00A0', isn't among the kinds of characters defined to be
"white space" in HTML. It doesn't matter how you get that character into
the HTML, it's not white space regardless.
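To make that concrete, here is a minimal C# sketch (not anyone's actual code from this thread; the EntityDemo class and IsHtmlWhiteSpace helper are made up for illustration). It assumes the white space set listed in HTML 4 section 9.1, and uses WebUtility.HtmlDecode only as a convenient way to show that the named and numeric forms name the same character:

```csharp
using System;
using System.Net;

public static class EntityDemo
{
    // Hypothetical helper. White space as HTML 4 section 9.1 lists it:
    // space, tab, form feed, zero-width space, plus CR and LF line breaks.
    // U+00A0 is deliberately absent from this set.
    public static bool IsHtmlWhiteSpace(char c)
    {
        return c == '\u0020' || c == '\u0009' || c == '\u000C'
            || c == '\u200B' || c == '\r' || c == '\n';
    }

    public static void Main()
    {
        // Named and numeric references decode to the same character.
        string named = WebUtility.HtmlDecode("&nbsp;");
        string numeric = WebUtility.HtmlDecode("&#160;");

        Console.WriteLine(named == "\u00A0");          // True
        Console.WriteLine(numeric == named);           // True
        Console.WriteLine(IsHtmlWhiteSpace(named[0])); // False
        Console.WriteLine(IsHtmlWhiteSpace(' '));      // True
    }
}
```

Note that .NET's own char.IsWhiteSpace does count U+00A0 as white space, so it is not a stand-in for HTML's definition; that is why the sketch spells the set out by hand.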
And see, that is where I overlooked your response. It was my fault for
doing so. I should have tried it before discarding it. I'm sorry for
doing that. Unfortunately, as early in development as I was, I didn't
want to do a lot of work to test your suggestion. It wouldn't have
been that much work to test, so it was me being lazy at 4pm on a
Friday.
For more information, see:
http://www.w3.org/TR/html4/sgml/ent...pedia.org/wiki/SGML_entity#Character_entities
And next time someone suggests a solution, please just try it before you
state it won't work. Especially if you can't be bothered to post any code
showing what you're actually doing, you owe it to those trying to help you
to test solutions they offer before you deem them unusable.
Sorry, it is hard to gut a multi-file code set and present it as a
short example. This framework I created involved 26 source files. The
heart of the example would have been something like this, though:
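    // Swap ordinary white space for no-break spaces (U+00A0), which
    // HTML does not collapse.
    public static string PrepareText(string text)
    {
        text = text.Replace(' ', '\u00A0');
        // Render each tab as four no-break spaces.
        text = text.Replace("\t", "\u00A0\u00A0\u00A0\u00A0");
        return text;
    }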
Of course, this misses the code that places the text in an XmlText
object, and the code to add that XmlText to the parent tag. And that
code is part of another class, which appears in multiple places
throughout the code. I would have had to pretty much make another,
smaller, code set just to make a concise example.
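For anyone following along, a stripped-down version of the missing part might look roughly like this. It is only a sketch under assumptions: XmlTextDemo and MakeParagraph are hypothetical names, not the real framework's classes, and the "p" element is just a placeholder parent tag. It also shows why writing the entity text into an XmlText does not work, while the raw U+00A0 character does:

```csharp
using System;
using System.Xml;

public static class XmlTextDemo
{
    // Same replacement logic as PrepareText in the post.
    public static string PrepareText(string text)
    {
        text = text.Replace(' ', '\u00A0');
        text = text.Replace("\t", "\u00A0\u00A0\u00A0\u00A0");
        return text;
    }

    // Hypothetical helper: place text in an XmlText node and add it to a
    // parent element, as the surrounding framework code is described doing.
    public static XmlElement MakeParagraph(XmlDocument doc, string text)
    {
        XmlElement p = doc.CreateElement("p");
        XmlText node = doc.CreateTextNode(text);
        p.AppendChild(node);
        return p;
    }

    public static void Main()
    {
        var doc = new XmlDocument();

        // The raw character goes into the text node unchanged.
        XmlElement good = MakeParagraph(doc, PrepareText("a b"));
        Console.WriteLine(good.InnerText == "a\u00A0b"); // True

        // Entity text is treated as literal characters, so the ampersand
        // is escaped when the node is serialized.
        XmlElement bad = MakeParagraph(doc, "a&nbsp;b");
        Console.WriteLine(bad.OuterXml); // <p>a&amp;nbsp;b</p>
    }
}
```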
In fact, I had to change my approach quite a bit to make the example
above that simple. Originally, I was trying to replace
Environment.NewLine with <br />. I ended up creating an HtmlNewLineItem
decorator that I wrapped around IHtmlReportItem, so that whenever I
created the tag for the HTML item, the decorator would follow it with
a br element. Creating a decorator for something like that seems quite
excessive in hindsight. I provide a method called AppendLine, which
handles the creation of new-lines -- I don't even worry about the user
entering in Environment.NewLine manually! Otherwise, I would have to
split the text and replace each new-line with a br element. The amount
of work it takes to make this all work -- it might have been easier to
not use .NET's classes at all!
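The split-and-replace alternative mentioned above could be sketched like this. This is not the framework's actual AppendLine or decorator; LineBreakDemo and AppendWithBreaks are hypothetical names, and the sketch assumes the text only contains "\r\n" or "\n" line endings:

```csharp
using System;
using System.Xml;

public static class LineBreakDemo
{
    // Hypothetical helper: split text on line endings and append each piece
    // as a text node, with a br element between consecutive pieces.
    public static void AppendWithBreaks(XmlElement parent, string text)
    {
        XmlDocument doc = parent.OwnerDocument;

        // Normalize Windows line endings so one Split handles both forms.
        string[] lines = text.Replace("\r\n", "\n").Split('\n');

        for (int i = 0; i < lines.Length; i++)
        {
            if (i > 0)
            {
                parent.AppendChild(doc.CreateElement("br"));
            }
            if (lines[i].Length > 0)
            {
                parent.AppendChild(doc.CreateTextNode(lines[i]));
            }
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        XmlElement p = doc.CreateElement("p");
        AppendWithBreaks(p, "one" + Environment.NewLine + "two");

        // Children are: text "one", element br, text "two".
        Console.WriteLine(p.ChildNodes.Count); // 3
        Console.WriteLine(p.ChildNodes[1].Name); // br
    }
}
```

Doing the split in one helper keeps callers from having to care whether the user typed Environment.NewLine by hand, which is the same job the decorator was doing.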
If it means anything, I am sorry for not taking the time to create an
example. I am sorry that I didn't take the time to try your advice.
You have helped me a lot in the past. Thanks for all your help.