Comparing 2 MSG files

Guest

I am writing an application that generates a message digest (MD5) for MSG
files generated by Outlook. If I generate one MSG file from each of two PST
files, FILEA.PST and COPY_OF_FILEA.PST (COPY_OF_FILEA.PST is a copy of
FILEA.PST), giving A.MSG and A1.MSG, the two digests differ even though the
content of the messages is the same.
What causes this, and is there a workaround?
 
Some properties can and will be different, e.g. PR_SEARCH_KEY. Most likely
neither you nor your users care about these properties, but they will
make the MSG files different.
Secondly, the order in which properties are streamed to the MSG file can be
different, so even if the message is the same, the hash will be different.
You need to use a different algorithm to determine whether two messages are
the same, such as comparing the message body, subject, recipients, etc.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
 
Thanks for the information. Initially I thought the 'Entry ID' might be the
only thing that differed, but you have given me more insight into this
problem. Is there a way to get the 'Message-ID' from the message? Maybe
that would help me solve this problem. Your help in this regard would
be highly appreciated.
 
The entry id is never stored in MSG files since it only makes sense in the
context of the parent store, which MSG files do not have.
Which message id do you mean? The one from the MIME headers? Note that some
messages (e.g. the ones stored in the Sent Items of a PST store) do not even
have MIME headers.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
 
Dmitry, I have a similar problem:

We've written an application that processes e-mails we receive in Outlook
and computes their MD5 hash values. So far it works fine for all file
types except attached .msg files (note that we are not saving a message
in the .msg format, but rather saving an attached .msg file).
To our surprise, MD5 hashes vary for attached .msg files (and only for
these). That is, if I save an attached MSG twice, first as 1.msg and then as
2.msg, their hash values differ. Why is that? I assumed Outlook handled
attached .msg files like any other file type, but it seems it does not
(other file types get identical hash values when saved twice).
Any clues?
 
Again, at the MSG file level, the binary data is irrelevant.
You must decide which set of properties constitutes a message, then
calculate a hash over them. E.g. a hash of the concatenated body, subject,
sender, sent date, etc., separated by 0x0, should do the trick.
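
A minimal sketch of that idea in Python, assuming the property values have
already been read from the message as strings. The function name
`message_digest` and the choice of four fields are illustrative assumptions,
not part of any Outlook or MAPI API; how the values are extracted from the
message is left out.

```python
import hashlib

def message_digest(subject, body, sender, sent_date):
    """Hash selected message fields joined by a NUL (0x0) separator.

    The field list is an assumption; use whichever properties you decide
    identify a message. All values are expected to be strings.
    """
    fields = [subject, body, sender, sent_date]
    # Encode each field and join with 0x0 so that ("ab", "c") and
    # ("a", "bc") do not produce the same byte sequence.
    data = b"\x00".join(field.encode("utf-8") for field in fields)
    return hashlib.md5(data).hexdigest()

# Two copies of the same message give the same digest, whatever the
# layout of the MSG files they came from.
first = message_digest("Hello", "Body text", "alice@example.com", "2005-03-01 10:00")
second = message_digest("Hello", "Body text", "alice@example.com", "2005-03-01 10:00")
print(first == second)
```

Because the fields are always concatenated in the same order, the digest does
not depend on the order in which properties were streamed to the file.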

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
 
Hi Dmitry,

The problem we have is that we treat every attachment as a file.
We copy every attachment to a temporary folder and hash the FILETOSTRING()
of that temp file.
I don't know how to retrieve those properties from a formerly attached .msg
which is now saved to a folder...

I also need some advice about this... We're using Visual FoxPro for the
utility that processes incoming mails, because we are receiving FoxPro
tables, but I'd like to switch to VB. Any recommendations/comments?

Thank you in advance
Diegol
 
You can save each attachment to a temporary folder using
Attachment.SaveAsFile, calculate the file hash (along with the filename),
then delete the file.
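
As a rough sketch of the hashing step only, assuming the attachment has
already been saved to disk (for example via Attachment.SaveAsFile), the file
name and contents could be hashed together like this in Python. The function
name `attachment_digest` is hypothetical. Note that for attached .msg files
the saved bytes can differ between saves, as discussed earlier in the thread,
so this suits ordinary file attachments.

```python
import hashlib
import os

def attachment_digest(path):
    """Return an MD5 hex digest over the file name and the file contents."""
    digest = hashlib.md5()
    # Include the file name, separated from the contents by a 0x0 byte.
    digest.update(os.path.basename(path).encode("utf-8"))
    digest.update(b"\x00")
    # Read in chunks so large attachments are not loaded into memory at once.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The temporary file can be deleted with `os.remove(path)` once the digest has
been computed.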
Can't give you any advice re. VFP vs VB: I don't use either of them, sorry.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
 