regular expressions for various date/time strings

  • Thread starter: Keith G Hicks

Keith G Hicks

I'm trying to create some code to find date/time strings in large strings.
The date/times could look like any of these:

11:00 AM, on December 10, 2008
10:00 AM o'clock, on June 20, 2008
Friday, the 27th day of June, A.D. 2008, at 10:00 o'clock A.M.
10:00 a.m. on JUNE 20, 2008
10:00AM on June 20, 2008
20th day of June, A.D., 2008, at 10:00 o'clock
Friday, the 27th day of June, A.D. 2008, at 10:00 o'clock A.M.
10:00 o'clock a.m., on Friday, June 27, 2008
10:00 o'clock A.M., local time, on TUESDAY, JUNE 24, 2008
18th day of June, 2008 at 1:00 o'clock pm Local Time
25TH day of JUNE, A.D. 2008, at 1:00 p.m.
Thursday, June 19, 2008, at 1:00 o'clock in the afternoon
01:00 PM o'clock, on June 18, 2008

So I need to create patterns that match any one of them and then extract the
date/time from the string so it can be posted to a database. Here's my
attempt with the first sample above:

dim FullNoticeText as string = "Let it be known that Fred Smith is to
appear in court at 11:00 AM, on December 10, 2008 in Cook County..."

Dim result As String
Dim rxPattern As String = "\d\d[:]\d\d\ AM|PM[, on]
[Jan|January|Feb|February|Mar|March|Apr|April|May|Jun|June|Jul|July|Aug|August|Sep|September|Oct|October|Nov|November|Dec|December]
\d\d[,] [20]\d\d"
Dim rx As New Regex(rxPattern)
Dim substrings() As String = Regex.Split(FullNoticeText.ToString, rxPattern)


If rx.IsMatch(FullNoticeText) Then
    result = substrings(0)
End If

I'm not at all experienced with the complexities of regular expressions. I
understand them to a degree, but I'm not getting very far with this. When I run
the above, changing the array index from 0 to 1 to 2, I get results I don't
really expect.


string 0 renders this: "Let it be known that Fred Smith is to appear in
court at "
string 1 renders this: ", on December 10, 2008 in Cook County..."
string 2 is invalid.

I was expecting this:

string 0 would render this: "Let it be known that Fred Smith is to appear in
court at "
string 1 would render this: "11:00 AM, on December 10, 2008 "
string 2 would render this: "in Cook County..."
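
For reference, the three pieces expected above are what Regex.Split returns
when the whole pattern sits inside one capturing group, because .NET then
includes the captured text in the result array. The sketch below assumes a
hypothetical pattern that covers only the first sample; the names SplitSketch,
DatePattern and SplitAroundDate are made up for illustration. It uses (?:...)
for the AM/PM and month alternatives, since square brackets match a single
character from a set, and an unparenthesized | ends the first alternative at
"AM", which would explain the pieces actually observed.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SplitSketch
    ' Hypothetical pattern for the first sample only
    ' ("11:00 AM, on December 10, 2008"). (?:...) groups alternatives
    ' without capturing; [...] would match just one character.
    Private Const DatePattern As String = _
        "\d{1,2}:\d\d ?(?:AM|PM), on " & _
        "(?:January|February|March|April|May|June|July|August|" & _
        "September|October|November|December) " & _
        "\d{1,2}, 20\d\d"

    ' Wrapping the pattern in a single capturing group makes Regex.Split
    ' return the matched text between the surrounding pieces.
    Function SplitAroundDate(ByVal source As String) As String()
        Return Regex.Split(source, "(" & DatePattern & ")", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim notice As String = _
            "Let it be known that Fred Smith is to appear in court at " & _
            "11:00 AM, on December 10, 2008 in Cook County..."
        For Each piece As String In SplitAroundDate(notice)
            Console.WriteLine("[" & piece & "]")
        Next
    End Sub
End Module
```

With the sample notice this should give three pieces: the text before the
date/time, the date/time itself, and " in Cook County..." (with its leading
space, since the pattern stops at the year).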

I'm sure that if I can get one or two of these patterns figured out and get the
code right for the substrings, I can figure out the rest.

Thanks for any help,

Keith
 
Check out Expresso. It's a very powerful regex-building tool that I
swear by. It includes a library of common expressions, as well as a GUI
builder for any other expressions you might need.

http://ultrapico.com/Expresso.htm

Thanks,

Seth Rowe [MVP]
http://sethrowe.blogspot.com/
 
Ok. This is sort of working. I discovered that using "split" for what I need
to do is a dumb idea. I needed to use "Match" instead. That's working
better. I'm playing around with some sample date/time regular expressions to
see what I end up with. Any advice on that would still be helpful.
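
As a rough illustration of the Match approach, here is a minimal sketch. It
assumes a hypothetical pattern that only handles the "time, then month day,
year" samples (such as "11:00 AM, on December 10, 2008" or
"10:00 a.m. on JUNE 20, 2008"); the other formats in the list would need
patterns of their own. The names MatchSketch, MonthNames, DateTimePattern and
TryFindDateTime are invented for the example. Named groups pull out each part
so it can be turned into a DateTime for the database.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module MatchSketch
    Private ReadOnly MonthNames As String() = { _
        "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", _
        "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER"}

    ' Hypothetical pattern: time, optional "o'clock", "on", month, day, year.
    ' Named groups (?<name>...) make each part easy to read back.
    Private ReadOnly DateTimePattern As String = _
        "(?<hour>\d{1,2}):(?<minute>\d\d)\s*(?<ampm>[AP])\.?M\.?" & _
        "(?:\s+o'clock)?,?\s+on\s+" & _
        "(?<month>" & String.Join("|", MonthNames) & ")\s+" & _
        "(?<day>\d{1,2}),\s*(?<year>\d{4})"

    ' Returns True and fills "found" when the first date/time is located.
    ' The DateTime constructor throws for impossible dates such as June 31.
    Function TryFindDateTime(ByVal source As String, ByRef found As DateTime) As Boolean
        Dim m As Match = Regex.Match(source, DateTimePattern, RegexOptions.IgnoreCase)
        If Not m.Success Then Return False

        ' Convert the 12-hour clock to 24-hour: 12 AM -> 0, 12 PM -> 12.
        Dim hour24 As Integer = Integer.Parse(m.Groups("hour").Value) Mod 12
        If m.Groups("ampm").Value.ToUpperInvariant() = "P" Then hour24 += 12

        Dim monthNumber As Integer = _
            Array.IndexOf(MonthNames, m.Groups("month").Value.ToUpperInvariant()) + 1

        found = New DateTime(Integer.Parse(m.Groups("year").Value), monthNumber, _
                             Integer.Parse(m.Groups("day").Value), _
                             hour24, Integer.Parse(m.Groups("minute").Value), 0)
        Return True
    End Function

    Sub Main()
        Dim notice As String = _
            "Let it be known that Fred Smith is to appear in court at " & _
            "11:00 AM, on December 10, 2008 in Cook County..."
        Dim found As DateTime
        If TryFindDateTime(notice, found) Then
            Console.WriteLine(found.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture))
        End If
    End Sub
End Module
```

For the sample notice this should print 2008-12-10 11:00. If the matched text
itself is needed, m.Value holds it, and m.Index and m.Length give its position
in the original string.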

Thanks,

Keith
 