Hi Ludwig,
We're getting closer, but remember that close only counts in horseshoes and
hand-grenades, not in programming!
To use Regular Expressions, you must be *absolutely specific* about your
rules.
Let me explain what is missing here. "Syntax" means nothing to Regular
Expressions, and very little to humans. That is, it can refer to so many
different things (such as the "syntax" I'm using to write this post) that it
identifies nothing in and of itself.
That is what you think you mean, but that is not what you mean. For example,
note the 2 uses of "public" in the following example:
public string Opened()
{
    return "Open to the public";
}
Now, the first instance of "public" is syntax, but the second is part of a
string. In other words, "syntax" is a set of rules. From Dictionary.com,
"syntax" means:
"The rules governing the formation of statements in a programming language."
Obviously, "public" as part of a string is not syntax. How do you expect to
tell the Regular expression the difference? You must know the exact syntax
rules, and be able to express them in Regular Expression syntax.
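To make the point about the two uses of "public" concrete, here is a minimal sketch of one common approach: match string literals and keywords in a single alternation, so that anything inside quotes is consumed as a string before the keyword alternative can see it. It is written in Python only to keep it short, and the four-word keyword list is a made-up assumption, not a real language definition.

```python
import re

# String literals come first in the alternation, so a keyword inside
# quotes is swallowed as part of the string and never matched on its own.
# The keyword list is a hypothetical, deliberately tiny example.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\b(?:public|class|string|return)\b')

def find_keywords(source):
    """Return (start, end, word) for keywords that are not inside a string."""
    hits = []
    for m in TOKEN.finditer(source):
        if not m.group().startswith('"'):
            hits.append((m.start(), m.end(), m.group()))
    return hits

code = 'public string Opened()\n{\nreturn "Open to the public";\n}'
keywords = [word for _, _, word in find_keywords(code)]
print(keywords)
```

The second "public" never shows up in the result because it is inside the quoted string. Comments and character literals would need their own alternatives in the same way.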
Of course, this is unsuitable. The "\b" expression matches the boundary at
the beginning or end of a word, that is, a run of characters composed entirely
of word characters, and as I said before, '<' is not a word character.
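A small Python sketch of why "\b" does not help next to '<' (the sample text is invented for illustration). "\b" only matches where a word character sits on exactly one side of the position, so there is no boundary between a space and '<'.

```python
import re

text = '<?xml version="1.0"?> <note>'

# No boundary exists between ' ' and '<': neither is a word character.
with_boundary = re.findall(r'\b<note>', text)

# Without \b the literal tag is found.
without_boundary = re.findall(r'<note>', text)

# Between '?' and 'x' there is a boundary, so \bxml\b does match.
word_match = re.findall(r'\bxml\b', text)

print(with_boundary, without_boundary, word_match)
```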
Okay, now you've introduced the topic of XML, which was not mentioned in
your earlier message, nor up to this point in your current post.
Yet, you have not stated what you mean by "syntax highlighting," nor what
this "syntax" is for. I could assume that you mean "XML syntax" but you have
not said so, so I cannot logically make that assumption. The string you're
parsing may only *contain* XML, as well as other "syntax."
Not necessarily. See my example (about "public") above. You need to be
*absolutely specific*.
Are you certain of this? What about line breaks? Might any of these "words"
be at the beginning or end of the string? If so, they will either not be
preceded by a space or not be followed by one.
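To illustrate the problem with "a space on each side" as a rule, here is a hedged Python sketch comparing a pattern that demands literal spaces with one that uses lookarounds, so that string edges and line breaks also count as delimiters. The sample text is an assumption made up for the example.

```python
import re

text = "public class Note\nprivate"

# Requiring a literal space on both sides misses words at the start or
# end of the string and words next to a line break.
spaced = re.findall(r' (\w+) ', text)

# The lookarounds assert "no non-whitespace neighbour" without consuming
# anything, so the string edges and the line break are accepted too.
delimited = re.findall(r'(?<!\S)\w+(?!\S)', text)

print(spaced)
print(delimited)
```

Whether whitespace is really the right delimiter is still a question about the rules, not about the pattern.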
Okay, see, now you want to identify the '?' in an XML tag. But that is not a
word character, nor is it delimited from "xml" by a space. Again, the syntax
of the Regular Expression depends upon an *absolutely specific* description
of the rules for matching and grouping.
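As one possible way to state such a rule precisely, the following Python sketch matches an XML declaration and groups its parts. The '?' has to be escaped because it is a quantifier in Regular Expression syntax. The group layout (opening delimiter, name, rest, closing delimiter) is an assumption chosen for illustration, not a complete treatment of the XML grammar.

```python
import re

# '<?' and '?>' are escaped literally; the groups separate the parts
# that a highlighter might want to color differently.
DECL = re.compile(r'(<\?)(xml)\b(.*?)(\?>)')

line = '<?xml version="1.0" encoding="ISO-8859-1"?>'
m = DECL.match(line)

# Map each (hypothetical) part name to its character span in the line.
parts = ("open", "name", "rest", "close")
spans = {name: m.span(i) for i, name in enumerate(parts, start=1)}
print(spans)
```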
Not yet, but I hope to!
Hi Kevin,
thanks again! Okay, it seems I need to explain further what I need.
For my application, I need a .NET textbox control where the user
can type XML, XSLT or HTML. It would also be nice if the control could
do syntax coloring (and IntelliSense, later on), just like Visual
Studio does.
So I did a little test: I inherited from the RichTextBox control,
overrode OnTextChanged() and implemented something that already did
some syntax coloring of separate words like 'public', 'class' etc.
After your replies I now see that I did not define the
rules specifically enough. The word 'public' in a string is not a keyword,
indeed. However, this test allowed me to find a way to completely avoid
flickering of the control, and the next step is to do the syntax
coloring with the rules of XML.
What the user enters into the textbox can be simple xml, like:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Kevin</to>
<from>Ludwig</from>
<heading>Thank you</heading>
<body>Thanks for helping me!!</body>
</note>
or XSLT:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th align="left">Title</th>
<th align="left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template></xsl:stylesheet>
The idea is that when the user types a character, the current caret
position in the textbox is used to analyze the surrounding
words/characters, to see whether a valid XML tag has been formed, like
<body> or <xsl:for-each select="catalog/cd">. If a valid tag has
formed, the tag name (body, xsl:for-each) should be colored in
a specific color (if it's in a list of valid tag names, but maybe we
can skip this for now). The < and > also have to be colored, in another
color. Attributes get yet another color (version, encoding, select).
Literals and other undefined XML content like 'My CD Collection' do
not need coloring.
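The coloring rules described above could be sketched roughly as follows, in Python for brevity. This is only an illustration under simplifying assumptions: attribute values must be in double quotes, the <?xml ...?> declaration is not handled, and the names color_spans, TAG and ATTR are hypothetical. The function returns character ranges only; applying colors to the control is left out.

```python
import re

# A complete tag: '<' or '</', a name (optionally prefixed, as in
# xsl:for-each), zero or more name="value" attributes, then '>' or '/>'.
TAG = re.compile(
    r'(</?)'
    r'([A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)?)'
    r'((?:\s+[\w:.-]+\s*=\s*"[^"]*")*)'
    r'\s*(/?>)'
)
ATTR = re.compile(r'([\w:.-]+)\s*=\s*"[^"]*"')

def color_spans(text):
    """Return (start, end, kind) spans; kind is 'bracket', 'tag' or 'attr'."""
    spans = []
    for m in TAG.finditer(text):
        spans.append((m.start(1), m.end(1), "bracket"))
        spans.append((m.start(2), m.end(2), "tag"))
        # Scan only the attribute part of this tag for attribute names.
        for a in ATTR.finditer(text, m.start(3), m.end(3)):
            spans.append((a.start(1), a.end(1), "attr"))
        spans.append((m.start(4), m.end(4), "bracket"))
    return spans

sample = '<xsl:for-each select="catalog/cd">'
labelled = [(sample[s:e], kind) for s, e, kind in color_spans(sample)]
print(labelled)
```

Text between tags, such as 'My CD Collection', produces no spans, so it stays uncolored.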
If the user deletes a character so that a tag becomes an invalid tag
(for example, deleting the >), then the coloring of the incomplete tag
has to be removed.
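One way the caret-based check might look, again as a hedged Python sketch: search backwards from the caret for the nearest '<' and test whether a complete tag starts there. The function name tag_at_caret and the simplified tag pattern are assumptions for illustration. A None result would mean "remove any coloring here".

```python
import re

# Simplified pattern for one complete tag with double-quoted attributes.
TAG = re.compile(r'</?[A-Za-z_][\w:.-]*(?:\s+[\w:.-]+\s*=\s*"[^"]*")*\s*/?>')

def tag_at_caret(text, caret):
    """Return (start, end) of the complete tag the caret is inside or
    directly after, or None if there is no such tag."""
    start = text.rfind("<", 0, caret)
    if start == -1:
        return None
    m = TAG.match(text, start)
    # The tag must reach the caret; otherwise the caret sits in the
    # plain text that follows the tag.
    if m is None or m.end() < caret:
        return None
    return m.span()

complete = tag_at_caret('<body>Thanks', 6)     # caret right after '>'
broken = tag_at_caret('<bodyThanks', 5)        # the '>' was deleted
in_text = tag_at_caret('<body>Thanks', 12)     # caret in the literal text
print(complete, broken, in_text)
```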
I hope that this time I explained the context better... I didn't realize
that I had to be that specific... so it's all about XML rules.