find string if not preceded with @ anywhere in the string


We have the HTML text of an email and wish to search it for banned content. We use:
[^@\-\.]\bsomedomain\.com\b

This will pick out somedomain.com but not (e-mail address removed), which is correct.

But it will find (e-mail address removed).

How could you alter the above expression to exclude all words that contain an @, not knowing where it may be in the string?
 
Not quite what I'm looking for. I'm actually looking for specific words in HTML emails, but not if they are part of an email address.

example


----- Original Message -----
From: <[email protected]>
To: <[email protected]>
Sent: Saturday, July 24, 2004 3:31 PM
Subject: Fax from 015616 received (tracking no. 141823)

A new fax has arrived from my.faxservice.com


I want to find the word my.faxservice.com, but not when it's part of an email address.


Thanks, Paul

Jared said:
Paul,
This one should work. I found it on
http://www.regexlib.com/DisplayPatterns.aspx and modified it a little
(very little).
Jared

\b(?<Address>(?:[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])*@(?:[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}))\b
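One way to use an address pattern like this for the banned-word search is a two-pass approach: blank out every email address first, then search what is left. The following is only a sketch in Python, not part of the original reply. It assumes the pattern is ported by changing the .NET-style `(?<Address>...)` group to Python's `(?P<Address>...)`; the function name and the sample address are made up for illustration. Note that the nested quantifiers in the local part can backtrack heavily on long runs of word characters, so treat it as a starting point rather than production code.

```python
import re

# Python port of the pattern above: (?<Address>...) becomes (?P<Address>...).
ADDRESS = re.compile(
    r"\b(?P<Address>(?:[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])*@"
    r"(?:[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9}))\b"
)


def find_outside_addresses(text, word):
    """Return start offsets of `word` that are not inside an email address.

    Hypothetical helper: addresses are overwritten with spaces so the
    offsets still line up with the original text.
    """
    masked = ADDRESS.sub(lambda m: " " * len(m.group()), text)
    word_re = re.compile(r"\b" + re.escape(word) + r"\b")
    return [m.start() for m in word_re.finditer(masked)]


# Made-up sample: the address here is an assumption for the example.
SAMPLE = (
    "From: <someone@my.faxservice.com>\n"
    "A new fax has arrived from my.faxservice.com"
)

if __name__ == "__main__":
    print(find_outside_addresses(SAMPLE, "my.faxservice.com"))
```

On the sample, only the occurrence in the body line should be reported, since the one inside the address has been masked out before the search runs.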

Paul,
I guess I misunderstood the original question; try this one. If it
doesn't fit your needs, do a little research on lookahead and lookbehind.
Both allow you to match a specific pattern without consuming it. Let me
know how this works!
Jared

[^\@\.]\b(?<=[^@])(?<url>(\w+[\w\.][\w\.]?)+\.\w+)+(?=[^@])\b
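For anyone trying the lookaround idea outside .NET, here is a minimal Python sketch. It is a separate illustration, not a port of the pattern above. It assumes a "token" is a run of the characters `[\w.@+-]`; the function names and the sample address are made up for the example. The lookbehind pins the match to the start of a token, and the lookahead rejects the whole token if an @ appears anywhere in it, so the position of the @ does not need to be known in advance.

```python
import re


def banned_word_pattern(word):
    """Build a pattern for `word` that skips tokens containing an @.

    Hypothetical helper; a token is assumed to be a run of [\\w.@+-].
    """
    return re.compile(
        r"(?<![\w.@+-])"    # lookbehind: we are at the start of a token
        r"(?![\w.+-]*@)"    # lookahead: no @ anywhere in this token
        r"[\w.+-]*?"        # optional prefix inside the token, e.g. "www."
        r"\b(" + re.escape(word) + r")\b"
    )


def find_banned(text, word):
    """Return start offsets of `word` where it is not part of an address."""
    return [m.start(1) for m in banned_word_pattern(word).finditer(text)]


# Made-up sample: the address here is an assumption for the example.
SAMPLE = (
    "From: <someone@my.faxservice.com>\n"
    "A new fax has arrived from my.faxservice.com"
)

if __name__ == "__main__":
    print(find_banned(SAMPLE, "my.faxservice.com"))
```

On the sample, the occurrence inside the address is skipped and only the one in the body line is reported. The same check also rejects the word when it sits in the local part, before the @. Python only allows fixed-width lookbehind, which is why the @ test is done with a lookahead from the token start.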

