Code to check for HTML messages being sent

  • Thread starter: Jarryd

Jarryd

Hi,

I wanted to know if there is a way to check the format of a message in VBA.
I would like to check for outgoing messages that are formatted in HTML. If
they are HTML, I then want to run through each line of HTML code, and when
one is more than 999 characters long it must be broken into two lines.

For example:
<HTML This is the very long line of code... now at 999 characters... and now
we have gone way past that...
2nd line...

I want to change this to:
<HTML This is the very long line of code... now at 999 characters...
and now we have gone way past that...
2nd line...

Any help with this would be greatly appreciated.

TIA,

Jarryd
 
Use item.BodyFormat, where item is the MailItem. That tells you the format
of the email, which is one of the olBodyFormat enum members. For HTML it
would be olFormatHTML.

If you want to do this on sending the item then you can trap the Item.Send
event for that item. In that event save the item and then do your parsing,
then let the send complete.
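
A minimal sketch of the format check described above. The helper name IsHtmlMail is made up for illustration; BodyFormat and olFormatHTML are the Outlook object model members mentioned in the reply. It assumes the code lives in the Outlook VBA project, and it checks the item type first because not every outgoing item is a MailItem.

```vba
' Hypothetical helper: True only for mail items whose body format is HTML.
Public Function IsHtmlMail(ByVal Item As Object) As Boolean
    IsHtmlMail = False
    ' Meeting requests, task requests and so on are not MailItems.
    If TypeOf Item Is Outlook.MailItem Then
        IsHtmlMail = (Item.BodyFormat = olFormatHTML)
    End If
End Function
```

It could be called from whichever send handler ends up being used, for example: If IsHtmlMail(myItem) Then ...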
 
Hi Ken,

This is what I have already tried:
------------------
Public WithEvents myItem As Outlook.MailItem

Private Sub myItem_Send(Cancel As Boolean)
MsgBox "have a great day!"
End Sub
------------------

When I send any email nothing happens. I first tried this and sent an HTML
message:
------------------
Public WithEvents myItem As Outlook.MailItem

Private Sub myItem_Send(Cancel As Boolean)
If myItem.BodyFormat = olFormatHTML Then
MsgBox "have a great day!"
End if
End Sub
------------------

Neither of these is doing anything. What am I missing? Is MsgBox not a
good way to check that I am trapping the event correctly? I have turned off
security in Outlook 2007 by going to Macro > Security > No Security...
I have also checked that the Outlook VBA COM add-in is active.

Also, is it possible to edit the HTML code of the message using VBA? I
would imagine it is quite straightforward, but I have been wrong to
assume so much in the past...

TIA,

Jarryd
 
Hi Slovak,

Ken,

To let you know, I am moving on here, but slowly:
--------------
Public WithEvents myItem As Outlook.MailItem

Private Sub Application_ItemSend(ByVal Item As Object, Cancel As Boolean)
Dim myItem As Outlook.MailItem
If myItem.BodyFormat = olFormatHTML Then
MsgBox "have a great day!"
End If
End Sub

Private Sub myItem_Send(Cancel As Boolean)

End Sub
--------------

I am getting the old "Object variable or With block variable not set" error.

I am guessing that I have to do something like
set myItem = CodeThatSpecifiesTheMailBeingSent

Correct?? Any pointers good sir, please?

TIA,

Jarryd
 
OK,

Now I have the following:
------------------
Public WithEvents myItem As Outlook.MailItem

Private Sub Application_ItemSend(ByVal Item As Object, Cancel As Boolean)
If Item.BodyFormat = olFormatHTML Then
MsgBox "have a great day!"
End If
End Sub

Private Sub myItem_Send(Cancel As Boolean)

End Sub
------------------

This works well, I think anyway... but is it the right thing for what I want
to do? What I want to do is check the email body for any lines of HTML code
of over 999 characters. If any exist, these must be broken up over two
separate lines. I think I have worked out that the Private Sub
myItem_Send ain't doing anything for me (don't laugh, haha :-) !!), whereas
with Private Sub Application_ItemSend I am on to a winner. With regard to
the code I will need to write for editing the HTML, any pointers
would be greatly appreciated.

TIA,

Jarryd
 
You need to instantiate the myItem object, otherwise that event will never
be fired. Just declaring an object doesn't set it to anything.

If this is for an open item then use the Inspectors collection's
NewInspector event to get the item and set it to myItem.

Private WithEvents colInspectors As Outlook.Inspectors

Then assuming the code is in the ThisOutlookSession class module you use
Application.Startup to instantiate the colInspectors object. Your
NewInspector handler would do something like this:

If Inspector.CurrentItem.Class = olMail Then 'test for mail item
Set myItem = Inspector.CurrentItem
End If

That will allow the Send event to fire for the item.

I generally prefer not to use Application.ItemSend: it fires later, and you
can't do some things in that event handler that you can do earlier in
item.Send.
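
Put together, the wiring described above might look like the sketch below. It assumes everything sits in the ThisOutlookSession class module and that only one message is being composed at a time (myItem just tracks the most recently opened mail item). The Debug.Print line is only there to show that the event fires.

```vba
' ThisOutlookSession (sketch)
Private WithEvents colInspectors As Outlook.Inspectors
Private WithEvents myItem As Outlook.MailItem

Private Sub Application_Startup()
    ' Application is the intrinsic Outlook.Application object of the VBA project.
    Set colInspectors = Application.Inspectors
End Sub

Private Sub colInspectors_NewInspector(ByVal Inspector As Outlook.Inspector)
    ' Hook the item only if the window that just opened holds a mail item.
    If Inspector.CurrentItem.Class = olMail Then
        Set myItem = Inspector.CurrentItem
    End If
End Sub

Private Sub myItem_Send(Cancel As Boolean)
    ' Fires for the hooked item only; parsing would go here.
    Debug.Print "Send fired, BodyFormat = " & myItem.BodyFormat
End Sub
```

Application_Startup only runs when Outlook starts, so after pasting the code either restart Outlook or run Application_Startup once by hand.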
 
Roger that Ken. Will do. Any tips on getting into the body of the message
once I've successfully got a hold of it at the right point, i.e. item.send
event?

It would be greatly appreciated as I haven't had a great deal of joy on the
net as of yet.

I am guessing that once I have set the myItem object I will be able to parse
its content, but I am not sure if I can do this directly on myItem or if I
have to take the content of myItem and set it to something else, if that
makes any sense.

TIA,

Jarryd
 
Hi Ken,

What I was thinking I would need to do is put all the lines of HTML data in the
body of the message in an indexed (multiples of 10) array, line by line.
Then I would delete the message body. Then I would run through each line in
the array and check the number of characters it contains. For any line that
contains more than 900, I would insert entries indexed just above
the offending line. So if line 3 was too long, containing 3000 characters, it
would have an index of 30 and I would create three more lines and index
them:
31, containing 900 characters
32, containing 900 characters, and
33, containing 300 characters.

I would then sort the array in ascending order on the index and write it
back to the body of the message (without the index column that is).

That all sounds like a lot of mission. Am I perhaps over-complicating
things? Is there not a better way to get at it?

TIA,

Jarryd
 
Sounds overcomplicated to me. I'd just get HTMLBody as a string and then
break it down by line breaks. First I'd find the starting point of the
<body> tag and then look for the terminating ">" on that. Then I'd extract
all of the string between there and the start of the </body> tag; that's the
actual body content.

From there I'd look for breaks such as <br> and paragraphs and whatever
other breaks are in the text and count characters between the breaks.

Of course you also have to account for non-character tags such as <td>,
<tr>, etc. and for attributes such as font, color, etc.
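
The extraction step described above can be sketched with plain string functions. ExtractBodyContent is a made-up name, and the sketch assumes the opening <body ...> tag contains no ">" inside an attribute value. It does not yet count characters between <br> or paragraph breaks.

```vba
' Hypothetical helper: returns the text between <body ...> and </body>.
Public Function ExtractBodyContent(ByVal html As String) As String
    Dim tagStart As Long, tagEnd As Long, closeStart As Long

    ExtractBodyContent = ""
    ' Case-insensitive search for the opening tag.
    tagStart = InStr(1, html, "<body", vbTextCompare)
    If tagStart = 0 Then Exit Function

    ' First ">" after "<body" is taken as the end of the opening tag.
    tagEnd = InStr(tagStart, html, ">")
    If tagEnd = 0 Then Exit Function

    ' If there is no closing tag, take everything up to the end of the string.
    closeStart = InStr(tagEnd + 1, html, "</body", vbTextCompare)
    If closeStart = 0 Then closeStart = Len(html) + 1

    ExtractBodyContent = Mid$(html, tagEnd + 1, closeStart - tagEnd - 1)
End Function
```

For example, ExtractBodyContent(myItem.HTMLBody) would hand back just the content that the counting logic needs to look at.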
 
Thanks for getting back to me Ken.

Right so what I have got now is:
-------------------------------
Public WithEvents myItem As Outlook.MailItem
Public WithEvents myOlInspectors As Outlook.Inspectors

Private Sub myOlInspectors_NewInspector(ByVal Inspector As Outlook.Inspector)
If Inspector.CurrentItem.Class = Outlook.olMail Then 'test for mail item
Set myItem = Inspector.CurrentItem
End If
End Sub

Private Sub Application_Startup()
Dim myOlApp As New Outlook.Application
Set myOlInspectors = myOlApp.Inspectors
End Sub

Private Sub myItem_Send(Cancel As Boolean)
If myItem.BodyFormat = olFormatHTML Then
Dim msgCode As String
msgCode = myItem.HTMLBody
MsgBox msgCode
End If

End Sub
-------------------------------

This seems to work alright and I have a simple plan of how I will code the
rest. But this event is invoking that blasted Outlook security warning "A
program is trying to access email addresses... ". I have had a quick hunt
on the web and it looks like you can't disable it without hacks or add-ins.
Is there no way to tell Outlook that "Microsoft VBA for Outlook add-in" is
cool and not to mess with it?


TIA,

Jarryd
 
If you use Dim myOlApp As New Outlook.Application to instantiate your
Outlook.Application object, you are not trusted, even in the Outlook VBA
project. If the code uses the trusted Application object and derives all of
your other Outlook objects from that, you should be trusted as long as the
code is running in the Outlook VBA project (if this is Outlook 2003 or higher).

So use this:

Dim myOlApp As Outlook.Application
Set myOlApp = Application

Then derive all of your other Outlook objects from myOlApp.
 
Hi Ken,

OK, I did what you said and the security warning isn't popping up anymore.

I know you have helped me a lot already, but do you have any ideas why this
code is not working?:
------------------------------------------------
Public WithEvents myItem As Outlook.MailItem
Public WithEvents myOlInspectors As Outlook.Inspectors

Private Sub myOlInspectors_NewInspector(ByVal Inspector As Outlook.Inspector)
If Inspector.CurrentItem.Class = Outlook.olMail Then 'test for mail item
Set myItem = Inspector.CurrentItem
End If
End Sub

Private Sub Application_Startup()
Dim myOlApp As Outlook.Application
Set myOlApp = Application
Set myOlInspectors = myOlApp.Inspectors
End Sub

Private Sub myItem_Send(Cancel As Boolean)
If myItem.BodyFormat = olFormatHTML Then
Dim msgCode As String
Dim HTMLStartPos As Long ' InStr returns a Long character position
Dim HTMLEndPos As Long

msgCode = myItem.HTMLBody
HTMLStartPos = InStr(1, msgCode, "<HTML")
HTMLEndPos = InStr(HTMLStartPos + 1, msgCode, ">")
If HTMLEndPos > 999 Then
Dim LinesToMake As Integer
Dim OldTag As String
Dim ProcessTag As String
Dim NewTag As String
Dim NewMsgCode As String

LinesToMake = HTMLEndPos / 999
OldTag = Mid(msgCode, 1, HTMLEndPos)

For i = 1 To LinesToMake
ProcessTag = Mid(OldTag, IIf(i > 1, (i - 1) * 1000, 1), 999)
NewTag = NewTag & vbNewLine & ProcessTag
Next i
NewMsgCode = NewTag & vbNewLine & Mid(msgCode, HTMLEndPos + 1)
myItem.HTMLBody = NewMsgCode
End If

End If
End Sub
------------------------------------------------
If I reduce the Mid length for ProcessTag so I am not dealing with so
many characters, and I put a MsgBox at each (i) in the loop and one for
NewMsgCode, it looks like I should be alright. But if I send a message and
then check the source (View Source), it hasn't done it right. What I want it
to do is check if the HTML tag is bigger than 999 characters and, if so,
rewrite it by starting a new line at every 1000th character.

TIA,

Jarryd
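
One possible way to do the splitting without juggling start and end offsets by hand is sketched below. WrapAtSpaces and WrapHtmlOpenTag are made-up names. The sketch breaks at spaces rather than at a fixed character count, on the assumption that spaces in the tag only separate attributes (as in a run of xmlns declarations), and it falls back to a hard break if a chunk has no space at all. Whether Outlook keeps the result when the message goes out is a separate question.

```vba
' Hypothetical helper: breaks source at spaces so no line exceeds maxLen characters.
Public Function WrapAtSpaces(ByVal source As String, ByVal maxLen As Long) As String
    Dim result As String
    Dim rest As String
    Dim cut As Long

    rest = source
    Do While Len(rest) > maxLen
        ' Last space at or before the limit.
        cut = InStrRev(rest, " ", maxLen)
        ' No usable space: hard break at the limit (may split an attribute).
        If cut <= 1 Then cut = maxLen
        result = result & RTrim$(Left$(rest, cut)) & vbCrLf
        rest = Mid$(rest, cut + 1)
    Loop
    WrapAtSpaces = result & rest
End Function

' Hypothetical helper: wraps only the opening <html ...> tag, leaving the rest as is.
Public Function WrapHtmlOpenTag(ByVal html As String, ByVal maxLen As Long) As String
    Dim tagStart As Long, tagEnd As Long

    WrapHtmlOpenTag = html
    tagStart = InStr(1, html, "<html", vbTextCompare)
    If tagStart = 0 Then Exit Function
    tagEnd = InStr(tagStart, html, ">")
    If tagEnd = 0 Then Exit Function

    WrapHtmlOpenTag = Left$(html, tagStart - 1) & _
        WrapAtSpaces(Mid$(html, tagStart, tagEnd - tagStart + 1), maxLen) & _
        Mid$(html, tagEnd + 1)
End Function
```

In the Send handler this would be used as myItem.HTMLBody = WrapHtmlOpenTag(myItem.HTMLBody, 900).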
 
Hi Ken,

I have checked and seen that I made a few silly mistakes in starting /
ending positions. But I don't think that's my problem. If I reduce the
characters per line to, say, 50 and MsgBox everything out, when I see the
MsgBox NewMsgCode output I can see that it is now perfect. However, it
isn't being written to the HTMLBody property.

What gives?? Please don't tell me that in spite of what I do Outlook 2007
is going to undo it all anyway.

TIA,

Jarryd
 
You need to find the beginning of this "<body" (case insensitive). That
marks the start of the body part. Before that it's all header and comments.
When you find that location ("<body") you then need to move your pointer to
the next ">" that isn't part of a nested tag. So you count all "<" tag
starts from your initial start position. Decrement that count for each
closing ">" you find until you find the matching close for your "<body" tag.

That's where you should start adding whatever it is you want to add. Before
that anything you add is just more header or comment information.
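
The counting approach described above could be sketched like this. FindTagClose is a made-up name; tagStart is assumed to point at the "<" that opens the tag, and the sketch does not treat a ">" inside a quoted attribute value specially.

```vba
' Hypothetical helper: position of the ">" that closes the tag opened at tagStart,
' or 0 if no matching ">" is found.
Public Function FindTagClose(ByVal html As String, ByVal tagStart As Long) As Long
    Dim i As Long, depth As Long

    FindTagClose = 0
    If tagStart < 1 Then Exit Function

    For i = tagStart To Len(html)
        Select Case Mid$(html, i, 1)
            Case "<"
                ' Every tag start, nested or not, goes one level deeper.
                depth = depth + 1
            Case ">"
                depth = depth - 1
                ' Back at level 0: this closes the tag we started on.
                If depth = 0 Then
                    FindTagClose = i
                    Exit Function
                End If
        End Select
    Next i
End Function
```

Typical use would be pos = InStr(1, html, "<body", vbTextCompare) followed by FindTagClose(html, pos); the character after the returned position is where the body content starts.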
 
Hi Ken,

But the header info is the bit that is too long. I don't need to enter the
body, well not from the examples I have seen so far. I only need to edit
the <HTML> tag. It has to be less than 1,000 characters or it risks being
bounced by overzealous firewalls / SMTP proxies. That is what my whole
mission is with this. The body content, <BODY> ... </BODY>, is cool. No
problem. It is the bits in the HTML tag itself that are messing everything
up:
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:p="urn:schemas-microsoft-com:office:powerpoint"
xmlns:a="urn:schemas-microsoft-com:office:access"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema"
xmlns:b="urn:schemas-microsoft-com:office:publisher"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet"
xmlns:odc="urn:schemas-microsoft-com:office:odc"
xmlns:oa="urn:schemas-microsoft-com:office:activation"
xmlns:html="http://www.w3.org/TR/REC-html40"
xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:D="DAV:"
xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml"
xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/"
xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/"
xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp"
xmlns:udc="http://schemas.microsoft.com/data/udc"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/"
xmlns:ec="http://www.w3.org/2001/04/xmlenc#"
xmlns:sp="http://schemas.microsoft.com/sharepoint/"
xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:udcs="http://schemas.microsoft.com/data/udc/soap"
xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile"
xmlns:udcp2p="http://schemas.microsoft.com/data/udc/parttopart"
xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/"
xmlns:dsss="http://schemas.microsoft.com/office/2006/digsig-setup"
xmlns:dssi="http://schemas.microsoft.com/office/2006/digsig"
xmlns:mdssi="http://schemas.openxmlformats.org/package/2006/digital-signature"
xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships"
xmlns:spwp="http://microsoft.com/sharepoint/webpartpages"
xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types"
xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages"
xmlns:pptsl="http://schemas.microsoft.com/sharepoint/soap/SlideLibrary/"
xmlns:spsl="http://microsoft.com/webservices/SharePointPortalServer/PublishedLinksService"
xmlns:Z="urn:schemas-microsoft-com:" xmlns:st=""
xmlns="http://www.w3.org/TR/REC-html40">

That is all one line. It's totally ridiculous and unnecessary. How do I
edit that? The bits in the HTML tag itself, not the body, the <HTML> tag.

TIA,

Jarryd
 
Then find the end of that <html> tag, but I think you're going to have
problems. That stuff is put in there by Word and then massaged by Outlook
when the item goes out. So what you are trying to do may work but it also
may interfere with what Outlook is trying to do. You'll have to experiment.
 
Hi Ken,

I am pretty sure that what I have written does find the HTML tag. As I
said, I MsgBox it all out and it works 100% in the message box. I can see
that it has the beginning of the HTML tag (<HTML) and it ends with >, and
everything has been split over lines the way I want it. But then when I
check the source code of the item in Sent Items it hasn't been adhered to.
I can affect what is in the <Body> </Body> tags with 100% success, but not
the HTML tag.

Could it be that I have to do this at another point in the process, e.g.
ItemSend rather than item.send? I recall you said that ItemSend fired much
later.

I have spent quite a lot of time on this and learnt a lot, but it would feel
like a terrible waste of time if what I am doing turns out to be impossible.

From what I understand the HTML tag can be split over multiple lines without
causing adverse effects. In truth it is rubbish that Word (or whatever
editor Outlook uses) does not comply with RFC standards, or at the very least
give you the ability to configure those parts that might cause
incompatibility with other systems, especially those known to be
non-compliant. For whatever reason OL 2007 wants to create massive <HTML>
tags, the kind that OL 2003 never did. This is causing us problems and I
would only like to make it do something that it really should do already.
MS must know about this standard. Check this out, scroll down to 4.5.3.1,
Text Line:
http://www.faqs.org/rfcs/rfc2821.html

It does state that this can be increased using SMTP server extensions, but I
can't find how much it can be extended to, and even if I could I can't
instruct other admins to do the necessary work to turn this on. Why is it
that MS do not give you a way to configure your mail editor to conform to
basic standards (without having to resort to plain text)? Ag, I'm ranting
now and it's pointless and boring.

I guess what you are saying is that I have to play around with it, but to be
prepared for slamming into a brick wall, maybe permanently.

Thanks again,

Jarryd
 
What WordMail does, and that's the only editor available in Outlook 2007,
may be butt ugly but it does comply with standards. It may use HTML
extensions and comments but it does comply, as ugly as it is.

I think you'd find that you get the same results in Application.ItemSend as
you do in item.Send. The massaging occurs after the email leaves your
control in Outlook.

I can't comment on your HTML, but what I do when I play with things like that
is to make my changes and then put the text into Notepad, save it as an
HTML file and see how it plays in IE. All I can say is that I mentioned you
might not have success with what you want, and that if you do it you have to
end up with valid HTML.
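
The Notepad step can also be done from code. The sketch below is one way to dump a string to a file with VBA's built-in file I/O. SaveHtmlForPreview and the file name are made up, and text written this way uses the system ANSI code page, so characters outside it may not survive.

```vba
' Hypothetical helper: writes html to filePath so it can be opened in a browser.
Public Sub SaveHtmlForPreview(ByVal html As String, ByVal filePath As String)
    Dim fileNum As Integer

    fileNum = FreeFile
    Open filePath For Output As #fileNum
    ' Trailing semicolon: do not append an extra line break.
    Print #fileNum, html;
    Close #fileNum
End Sub
```

For example: SaveHtmlForPreview NewMsgCode, Environ$("TEMP") & "\preview.html", then open that file in IE.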
 
Hi Ken,

Thank you for replying. I don't totally understand though. Of course the
HTML code is not necessarily bad, and it works fine in a browser. But this
HTML is being created by an email editor. And emails are predominantly sent
using SMTP. It is the SMTP compliance that I am talking about. RFC 2821
clearly states that there should be no more than 1,000 characters in a
single line of SMTP data:
   text line
      The maximum total length of a text line including the <CRLF> is 1000
      characters (not counting the leading dot duplicated for
      transparency). This number may be increased by the use of SMTP
      Service Extensions.
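
As a rough check against that limit, the sketch below counts the lines in a string that are longer than 998 characters (1000 minus the CRLF). CountLongLines is a made-up name. It only looks at the string it is given, for example HTMLBody; the lines actually sent over SMTP can differ, since the message may be re-encoded or rewrapped on the way out.

```vba
' Hypothetical helper: number of lines in source longer than maxChars characters.
Public Function CountLongLines(ByVal source As String, _
                               Optional ByVal maxChars As Long = 998) As Long
    Dim lines() As String
    Dim i As Long

    ' Treat both CRLF and bare LF as line ends.
    lines = Split(Replace(source, vbCrLf, vbLf), vbLf)
    For i = LBound(lines) To UBound(lines)
        If Len(lines(i)) > maxChars Then CountLongLines = CountLongLines + 1
    Next i
End Function
```

A result greater than 0 for the message source means at least one line would exceed the RFC 2821 text line limit if sent as is.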

As far as I know, HTML tags split over a number of lines will not make for
invalid HTML. I had already tested it in IE as you suggested in your last
post, and confirmed this point with a professional web developer. Why,
then, is this code not wrapped in lines of up to 1000 characters, no more?
I realise that Word is the only editor available when using OL 2007 (full
feature set if Word 2007 is installed), but if you send emails as OL 2007
creates them, at times they will be rejected by some servers with
SMTP code 5.5.0, 500 Line too long. I know because it is
happening to me. It is perhaps an overly safe limit, but it works for all,
that's its point, so why not produce HTML emails in compliance with it? The
HTML would not be invalidated.

Anyway, it is the way it is and from what you say there is very very little
chance that I can do anything about it. So plain text it is then, as a
follow up to an NDR. I think it is a messy solution but without a touch of
rocket science it looks to be all I have to offer my users.

Thanks again very much for all your help on this. It was very much
appreciated.

All the best,

Jarryd
 
I don't know. The way I read the normal WordMail HTML, the long lines
are all in comments, and I'm not sure if the RFC applies to comments. Besides,
this wouldn't be the first (or last) time that MS interprets RFCs in a
non-standard way or doesn't comply with them.

We just have to live with what MS gives us, whether it's compliant or not.
 