RegEx

  • Thread starter: Mac McMicmac

Mac McMicmac

I need a regular expression that will match a string (retrieved from a
TextBox in the UI). I need to verify that the string represents a full path
to a file.

I know these things can get pretty complex - so I'd be happy to keep it
simple by having the expression simply verify the text entered into the
textbox ends with .doc, .wav, or .pdf.

If it's not too much to ask, I'd also like the expression to ensure that
there is at least one \ followed by at least one character preceding
any of the three extensions listed (.doc, .wav, or .pdf).

Thanks
 
I hope this helps. Also, the O'Reilly book 'Mastering Regular Expressions'
covers .NET.


Regex regexExts = new Regex( @"\b(?<root>.*)(?<ext>\.\w+)\b(?<rest>.*$)" );

Match parts = regexExts.Match( data );
if ( parts.Success )
{
    // Pull the named groups out of the match
    string root = parts.Groups["root"].Value;
    string ext = parts.Groups["ext"].Value;
    string data1 = root + ext;
    // File.Exists( data1 )
    // Directory.Exists( data1 )
}

data = c:\me\mine\yours\how.exe who what
root = c:\me\mine\yours\how
ext = .exe
rest = who what

OR

data = c:\me\mine\yours\how.exe
root = c:\me\mine\yours\how
ext = .exe
rest = <empty string>

You can then test the ext string against .doc, .wav, or .pdf.
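
A minimal sketch of that last step, assuming the pattern above is kept as-is. The class and method names (ExtensionCheck, HasAllowedExtension) are made up for illustration, and the comparison ignores case on the assumption that .DOC should count the same as .doc.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExtensionCheck
{
    // Same pattern as in the snippet above.
    static readonly Regex RegexExts =
        new Regex(@"\b(?<root>.*)(?<ext>\.\w+)\b(?<rest>.*$)");

    // Hypothetical helper: true when the captured extension is .doc, .wav, or .pdf.
    public static bool HasAllowedExtension(string data)
    {
        Match parts = RegexExts.Match(data);
        if (!parts.Success)
        {
            return false;
        }

        string ext = parts.Groups["ext"].Value;
        return ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".wav", StringComparison.OrdinalIgnoreCase)
            || ext.Equals(".pdf", StringComparison.OrdinalIgnoreCase);
    }
}
```

One thing to watch: root is greedy, so if the text contains more than one dot, ext is taken from the last one in the string.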


Robert
 
My regular expression knowledge is rusty, so don't count on this being the
most efficient. Here is one that should help you get started.

^[a-zA-Z]?:\\.+\.([dD][oO][cC]|[wW][aA][vV]|[pP][dD][fF])$

Just to explain:
  ^               the beginning of the string
  [a-zA-Z]        a character class that matches any single character in the
                  specified ranges
  ?               0 or 1 time (so the drive letter is optional as written;
                  remove the ? to require one)
  :               a colon
  \\              the escape sequence for a backslash
  .               any character that is not a newline (usually)
  +               1 or more times
  \.              the escape sequence for a period
  (               the beginning of a group
  [dD][oO][cC]    "doc", regardless of the capitalization
  |               an "or", used between the other caps-independent
                  extensions
  )               the end of a group
  $               the end of the string
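
In .NET the per-letter classes can probably be replaced with RegexOptions.IgnoreCase. Here is a rough sketch of how the check might look in C#. It assumes the drive letter should be required (so the ? is dropped), and the PathCheck and IsDocWavOrPdfPath names are just placeholders.

```csharp
using System.Text.RegularExpressions;

static class PathCheck
{
    // Assumption: a drive letter is required, so the '?' after the class is dropped.
    // IgnoreCase covers both the drive letter and the extension.
    static readonly Regex FilePath = new Regex(
        @"^[a-z]:\\.+\.(doc|wav|pdf)$",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: true for text such as c:\some\local\file.pdf.
    public static bool IsDocWavOrPdfPath(string text)
    {
        return text != null && FilePath.IsMatch(text);
    }
}
```

The .+ still demands at least one character between the backslash and the extension, so c:\.doc is rejected.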

This expression would have to be updated to allow UNC paths if your app
allows them to be entered. Right now, it only accepts filenames in the form
of c:\some\local\file.pdf. Another area that can be tightened in this
expression is the "." after the escaped backslash, as Windows does have some
rules regarding certain characters which aren't allowed in file names.

Hope this helps.
 