Regular Expression Syntax

  • Thread starter: AAaron123
I found this on the Internet and tried a few of them and they worked in
VS2008.
Actually it was in a different form but I converted to make a smaller file.
The data is the same as the original.

I'm confused about how regular expressions work in different systems.
I suspect that each system may have some things that do not work in other
systems.
So my question is: Do the things in the table below work on VS2008?
And what are POSIX and machine mode?

Thanks in advance for any help.


BASIC METACHARACTERS
. Match any single character
| Or
[] Match one of a set of characters
[^] Negate a set of characters
- Define a range of characters, e.g. [0-9]
\ Escape the next character
QUANTIFIERS
* Match zero or more of the previous character
*? Lazy version of *
+ Match one or more of the previous character
+? Lazy version of +
? Match zero or one of the previous character
{n} Match exact number of instances
{m,n} Match a range of instances
{n,} Match n or more instances
{n,}? Lazy version of {n,}
ANCHORS
^ Match start of string
\A Match start of string
$ Match end of string
\Z Match end of string
\< Match start of word
\> Match end of word
\b Match a word boundary
\B Opposite of \b
SPECIFIC CHARACTERS
[\b] Backspace
\c Match a control character
\d Match any digit
\D Opposite of \d
\f Form feed
\n Line feed
\r Carriage return
\s Match any white space character
SPECIFIC CHARACTERS (cont'd)
\S Match anything but white space character
\t Tab
\v Vertical tab
\w Match any letter, digit or underscore
\W Opposite of \w
\x Match a character given by its hexadecimal code
\0 Match a character given by its octal code
BACKREFERENCES & LOOKAROUND
() Define subexpression
\n Match nth subexpression
?= Lookahead
?! Negative lookahead
CASE CONVERSION
\E Terminate \L or \U
\l Convert next character to lowercase
\L Convert all characters up to \E to lowercase
\u Convert next character to uppercase
\U Convert all characters up to \E to uppercase
MODIFIERS
(?m) Multiline mode
POSIX
[:alnum:] Any letter or digit
[:alpha:] Any letter
[:blank:] Space or tab
[:cntrl:] ASCII control
[:digit:] Any digit
[:print:] Any printable character
[:graph:] Same as [:print:] but excludes space
[:lower:] Any lower case character
[:punct:] Any character that is not in [:alnum:] or [:cntrl:]
[:space:] Any whitespace character including space
[:upper:] Any uppercase character
[:xdigit:] Any hexadecimal digit
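
A few of the entries above are easier to follow with a worked example. The sketch below assumes Python's `re` module rather than VS2008, so it only shows what these constructs mean in one implementation; the sample strings are made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy * takes as much as possible; lazy *? takes as little as possible.
greedy = re.findall(r"<.*>", text)
lazy = re.findall(r"<.*?>", text)

# \1 matches the same text that the first subexpression () captured,
# so this finds words that are immediately repeated.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

# (?=...) is a lookahead: it checks what follows without consuming it.
before_digit = re.findall(r"[a-z]+(?=\d)", "abc123 def ghi4")

print(greedy)        # one match spanning the whole string
print(lazy)          # one match per tag
print(doubled)       # the repeated words
print(before_digit)  # letter runs that are directly followed by a digit
```

The leading \b in the third pattern keeps the "is" inside "this" from being counted as the first half of a repeated word.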
 

Regular expression syntax varies from implementation to implementation.
So you would probably be better off just starting here:

http://msdn2.microsoft.com/en-us/library/hs600312.aspx

Along with that, you might want to download Expresso:

http://www.ultrapico.com/Expresso.htm

It is a .NET-based regex development tool.
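
To make the point about implementations differing concrete, here is a minimal sketch using Python's `re` module as a stand-in for "some other implementation". It is only an assumption-laden illustration: it says nothing about what VS2008 or .NET accept, and the documentation linked above remains the place to check that.

```python
import re

# In Python's re, \Z matches only at the very end of the string,
# while $ also matches just before a trailing newline.
end_only = re.search(r"cat\Z", "cat\n")
dollar = re.search(r"cat$", "cat\n")

# \< is a start-of-word anchor in some tools, but Python's re treats
# an escaped punctuation character as a literal, so \< means "<".
word_start = re.search(r"\<cat", "a cat")
literal = re.search(r"\<cat", "<cat")

print(end_only)    # no match: the newline sits between "cat" and the end
print(dollar)      # matches "cat"
print(word_start)  # no match: there is no literal "<" in the text
print(literal)     # matches "<cat"
```

So the same table entry can be an anchor in one engine and a plain character in another, which is why testing each pattern in the target tool is worthwhile.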
 