Need help on regex.replace

Guest

I'm a newbie to regular expressions. I have a requirement where I need to
search a string and replace a pattern within it.
I want to replace the src attribute of each image tag in the string below:
" <img src=\"c:\\windows\\windows\\desktop\\imagfes.jpg\">gfhsdfjsd<img
src="\c:\\windows\\win23\\images/jpg\\>jfdshfjdsh<img
src=\"c:\\win\\winnt\\images.gif."...................

My output should be
" <img src=\"d:\\images\\imagfes.jpg\">gfhsdfjsd<img
src="\d:\\images\\images/jpg\\>jfdshfjdsh<img
src=\"d:\\images\\images.gif."...................

Can anyone help me out?
 
Most likely the escape on the ":" is the problem. This example works with your
string. You may need to tweak it a little for your final purposes, but it works
based on what you have there:

Regex reg = new Regex(
    @"(c\:\\windows\\windows\\desktop)|(c\:\\windows\\win23)|(c\:\\win\\winnt)");

string output = reg.Replace(source, @"d:\images");
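
For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same fixed-folder replacement. The class name FixedPathDemo and the method name Rewrite are made up for illustration, and it assumes the source string holds single backslashes at run time (that is, the doubled backslashes in the question are C# escapes).

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedPathDemo
{
    // Same three alternatives as in the snippet above. ':' is not a regex
    // metacharacter in .NET, so it is left unescaped here; "\:" matches the same text.
    private static readonly Regex OldFolders = new Regex(
        @"c:\\windows\\windows\\desktop|c:\\windows\\win23|c:\\win\\winnt",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: swaps any of the known folders for d:\images.
    public static string Rewrite(string source)
    {
        // In a replacement pattern only '$' is special, so the backslash is literal.
        return OldFolders.Replace(source, @"d:\images");
    }
}
```

Calling FixedPathDemo.Rewrite on a string with one tag per old folder should leave the file names alone and change only the folder part. The obvious limitation is that every old folder has to be listed by hand.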
--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

***************************
Think Outside the Box!
***************************
 
Here you go:

(?i)(?:<img\s*src=)["]?([^"\s]*)(\.[\w]{0,4})["]?

A little explanation:

The first sequence, (?i), makes the Regular Expression case-insensitive.
I prefer to use inline options like this rather than having to remember what
flags to turn on.

The first group is a non-capturing group which matches "<img src=" (with any
number of spaces or line breaks between "img" and "src="). This identifies
an image tag.

The character sequence which follows, ["]?, indicates that there may or may
not be an opening quotation mark (this is not required in some flavors of HTML).
The second group is the first capturing group (group 1). It captures any
number of characters which are NOT either a white-space or a quotation mark.
This gives you the URL of the image, minus the extension.

The third group (capturing group 2) is the extension of the image: a dot
followed by up to 4 word characters. The final ["]? allows for an optional
closing quotation mark.
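
To see what the two capturing groups actually hold, a small sketch like the following can help. GroupInspector and Describe are hypothetical names used only for this illustration; the pattern is the one given above, with each quotation mark doubled because it sits inside a C# verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupInspector
{
    // The pattern from the explanation above; "" is how a verbatim
    // string literal spells a single quotation mark.
    private static readonly Regex ImgSrc = new Regex(
        @"(?i)(?:<img\s*src=)[""]?([^""\s]*)(\.[\w]{0,4})[""]?");

    // Returns "group1|group2" for the first image tag found,
    // or null when the input contains no match.
    public static string Describe(string html)
    {
        Match m = ImgSrc.Match(html);
        return m.Success ? m.Groups[1].Value + "|" + m.Groups[2].Value : null;
    }
}
```

Group 1 comes back as the path without its extension and group 2 as the extension including the dot, so the two can be rejoined unchanged or with a new folder in between.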

What you need to do is find the end of the URL, which contains the image
name. This can be done using an overload of Regex.Replace which takes a
MatchEvaluator as an argument. A MatchEvaluator is a delegate which receives
a Match instance. It performs whatever processing you want on the Match, and
returns the "transformed" text, which replaces the matched text in the
string. The Regex.Replace method replaces each Match by passing it to the
MatchEvaluator delegate you define. Example:

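// Requires: using System; using System.Text.RegularExpressions;
private static string _NewUrl = "d:\\images\\";

private string ReplaceUrl(Match m)
{
    // Get fileName from URL
    // delimiter could be slash or backslash
    string url = m.Groups[1].Value;
    int backslashIndex = url.LastIndexOf('\\');
    int slashIndex = url.LastIndexOf('/');
    int lastIndex = Math.Max(slashIndex, backslashIndex);
    string fileName = url.Substring(lastIndex + 1);

    // The match also covers "<img src=" and the optional quotes,
    // so keep the text on either side of the two captured groups
    int start = m.Groups[1].Index - m.Index;
    int end = m.Groups[2].Index + m.Groups[2].Length - m.Index;

    // Concatenate with new URL and return
    return m.Value.Substring(0, start) + _NewUrl + fileName
        + m.Groups[2].Value + m.Value.Substring(end);
}

public string ReplaceImageUrl(string Html)
{
    // Inside a verbatim string a quotation mark is written as ""
    Regex r = new Regex(
        @"(?i)(?:<img\s*src=)[""]?([^""\s]*)(\.[\w]{0,4})[""]?");
    return r.Replace(Html, new MatchEvaluator(ReplaceUrl));
}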
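
As a compact variant of the same idea, here is a sketch that uses a lambda as the MatchEvaluator. It is not the code above: the pattern is changed so that the leading "<img src=" text is itself captured (group 1) and put back, which shifts the path and extension to groups 2 and 3. The class name ImageUrlRewriter, the method Rewrite, and the constant NewFolder are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImageUrlRewriter
{
    // Assumed target folder, taken from the desired output in the question.
    private const string NewFolder = @"d:\images\";

    // Group 1: tag text up to the path; group 2: path without extension;
    // group 3: extension. The closing quote is not matched, so it stays put.
    private static readonly Regex ImgSrc = new Regex(
        @"(<img\s*src=[""]?)([^""\s]*)(\.\w{0,4})",
        RegexOptions.IgnoreCase);

    public static string Rewrite(string html)
    {
        // The lambda converts implicitly to a MatchEvaluator delegate.
        return ImgSrc.Replace(html, m =>
        {
            string path = m.Groups[2].Value;
            // The file name starts after the last slash or backslash;
            // LastIndexOfAny returns -1 when there is none, giving index 0.
            int cut = path.LastIndexOfAny(new[] { '/', '\\' });
            string fileName = path.Substring(cut + 1);
            return m.Groups[1].Value + NewFolder + fileName + m.Groups[3].Value;
        });
    }
}
```

Because the folder is found by position rather than by name, this version should not need a list of old folders, but it only touches tags where src directly follows img and the file name contains a dot.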

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
A brute awe as you,
a Metallic hag entity, eat us.
 