what can be trusted in email header?

hba2pd · Aug 2, 2007

Hello,

In my previous posts, I got the following response:

///////////////////

Peter said:
There is information in the headers that you can trust, like the ip
address of the machine that handed off the message to your server,
and anything that happened after that transaction.

I'm not even sure I'd go so far as to say you can always trust the IP
address. While it is true the client can't tell the server that header
and the server is the one that has to put it in there, there are ways
to fake a source IP address.

And, even further, the originator isn't always the originator. Consider
a hacked machine running a bot. It could be an end user box, it could
be a server, but that machine might be the one talking to the SMTP
server. So, they are the 'originator' of the message, but they aren't
the reason the message exists. If that makes any sense.

That said, I'd trust being able to find the SMTP box that *accepted*
the message on the Internet...
////

I think there are two opinions: one says that the IP address can be
trusted, and the other says that it cannot. Could you advise me which
opinion is reasonable?

F. H. Muffman · Aug 2, 2007

As one of the people who posted to that thread, I'd say both are
actually reasonable.

Truly faking an IP address is hard. But it can be done. I wouldn't expect
a fake IP address on a generic spam. If someone was attacking you, stalking
you, whatever, then I might be more concerned.

So, while you can assume (generally) that the IP address is correct in a
spam message, I would *not* assume that that message was *purposefully* sent
from that address. That machine may have been infected with a virus or a
worm that is now sending out spam messages without the user knowing.

hba2pd · Aug 3, 2007

Yes, I suspect that someone might be stalking me. In that case, is
there any way to tell whether the IP address is faked or not?

Vanguard · Aug 3, 2007

The To, Cc, Subject, From, and other headers are all specified by the
*sender* of the e-mail. The e-mail client can enter whatever headers it
wants into an e-mail because those "headers" are NOT added by mail
servers. They are just part of the message content supplied by the
sender. A message consists of the "header" portion at the top, a blank
line as the delimiter, and the "body" of the message. It is no
different than if you opened Notepad, typed whatever headers you
wanted at the top, added a blank line, and then typed the body of your
message. All of this is sent after the DATA command, which the e-mail
client uses to give the sending mail server the content of the
message.
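To make the header/blank line/body layout concrete, here is a minimal
Python sketch using the standard library's email parser. Every address
and header value in it is made up for illustration; the point is only
that the parser (like a mail server) accepts whatever the sender typed.

```python
from email import message_from_string

# Hypothetical text a sender could type by hand; all values are invented.
raw = (
    "From: Someone Important <ceo@example.com>\n"
    "To: victim@example.org\n"
    "Subject: Anything the sender likes\n"
    "X-Made-Up-Header: also fine\n"
    "\n"  # the blank line separates the "header" portion from the body
    "This is the body. Nothing above was checked by anyone.\n"
)

msg = message_from_string(raw)
sender_headers = dict(msg.items())  # everything above the blank line
body = msg.get_payload()            # everything below it

print(sender_headers["From"])
print(body)
```

Nothing here verifies the From line; it is just text that arrived as
part of the message content.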

The e-mail client displays *fields* within its UI where you enter a list
of recipients. The recipients you enter in the To, Cc, and Bcc fields of
your e-mail client are not, by themselves, what tells the mail server
where your e-mail gets delivered. When sending your message, your e-mail
client compiles an aggregate list of all recipients from the To, Cc, and
Bcc fields, and then it sends a RCPT TO command to the mail server, one
for each recipient. So if you have 10 recipients (say 5 in the To field,
3 in the Cc field, and 2 in the Bcc field), the e-mail client sends 10
RCPT TO commands to your mail server followed by a single DATA command
for the content of your message (which, remember, is the "headers",
blank line, and "body" sections of that one document).
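As a rough illustration of the envelope described above, the following
Python sketch lists the commands a client would issue for a given set
of recipients. The helper name smtp_commands and all addresses are
hypothetical, and the sketch leaves out the greeting (HELO/EHLO), the
server's replies, and the message content that follows DATA.

```python
def smtp_commands(sender, to, cc, bcc):
    """Return the envelope commands for one message (hypothetical helper)."""
    commands = [f"MAIL FROM:<{sender}>"]
    # One RCPT TO per recipient, regardless of which field it came from.
    for rcpt in [*to, *cc, *bcc]:
        commands.append(f"RCPT TO:<{rcpt}>")
    # A single DATA command introduces the headers, blank line, and body.
    commands.append("DATA")
    return commands


cmds = smtp_commands(
    "me@example.com",
    to=["a@example.org", "b@example.org"],
    cc=["c@example.org"],
    bcc=["d@example.org"],
)
for line in cmds:
    print(line)
```

The server learns the recipients from the RCPT TO lines, not from the
To or Cc headers, which is why a Bcc recipient can receive a message
whose headers never mention them.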

In fact, listservers that send out bulk mails have the sender supply
the body of the message completely separately from the list of
recipients. The sender first gives the listserver the list of all
recipients, which becomes the mailing list maintained on the
listserver. Later the sender decides what to send as the message and
just sends that. The list of recipients is already up on the mail
server.

I'm not even sure I'd go so far as to say you can always trust the IP
address.

The only headers you can trust are those that were added by your
receiving mail server. If the spammer is operating their own mail
server then they can insert whatever delivery headers they want (headers
by mail servers get prepended to the message above the "header" section
that was already included in the message sent by the DATA command). So
the spammer can obviously insert whatever header they want. They can
also insert bogus headers by including them in the "header" portion of
the message that was sent in the DATA command.

The Received header added by *your* receiving mail server is the only
one that can be trusted. Every host knows the IP address of whatever
host connects to it, because TCP packets could not pass between them
otherwise. The receiving mail host will add its Received header and
show the IP address of the host that connected to it. That is the only
host identification that may be true in the headers that got prepended
to the message.
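A small Python sketch of pulling that one address out of a message
follows. It assumes the receiving server writes the connecting IPv4
address in square brackets, which is common but not guaranteed; the
host names and addresses are invented, and the second Received line
stands in for one that may have been forged.

```python
import re
from email import message_from_string

# Hypothetical message; only the topmost Received line was written
# by "our" server (mx.yourisp.example in this made-up example).
raw = (
    "Received: from mail.sender.example (mail.sender.example [203.0.113.7])\n"
    "\tby mx.yourisp.example with ESMTP id abc123;\n"
    "\tFri, 3 Aug 2007 10:00:00 -0400\n"
    "Received: from trusted.bank.example ([198.51.100.1])\n"
    "\tby mail.sender.example with SMTP;\n"
    "\tFri, 3 Aug 2007 09:59:00 -0400\n"
    "From: someone@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

msg = message_from_string(raw)
topmost = msg.get_all("Received")[0]  # headers are returned top-down

# Assumption: the connecting address appears as [a.b.c.d] (IPv4 only).
match = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", topmost)
connecting_ip = match.group(1) if match else None
print(connecting_ip)
```

Everything below the topmost Received line was supplied by some other
host, so the sketch deliberately ignores it.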

You can trace backwards through the Received headers. Your receiving
mail host's Received header is the topmost one (headers get prepended as
the message passes through each mail server, although relays may strip
out the headers and insert a whole new bunch just for themselves). The
"by" host is your receiving mail host. The "from" host is who connected
to your receiving mail host. The next Received header (top-down) must
specify as its "by" host the same host as the "from" host in the
previous Received header, since they are supposed to chain together.
Tracing can be complicated since some mail servers will bounce mails
within their internal network and not compose full Received headers.
If, at some point, the "by" host in the next Received header doesn't
match the "from" host in the prior Received header, then tracing stops
since you have probably hit a bogus Received header. That is not an
absolute, however. Tracing through headers takes practice and sometimes
they just don't make sense, so the only one you can trust is the last
prepended Received header that was added by your own receiving mail
host.
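The chaining check can be sketched in Python as below. This is only an
illustration under simplifying assumptions: it compares the first
"from" and "by" tokens literally, while real Received headers vary a
lot (HELO names that differ from the real host name, internal hops,
missing clauses), so a mismatch is a hint rather than proof. The
function name trace_received and all host names are hypothetical.

```python
import re
from email import message_from_string

FROM_RE = re.compile(r"\bfrom\s+([^\s;]+)", re.IGNORECASE)
BY_RE = re.compile(r"\bby\s+([^\s;]+)", re.IGNORECASE)


def trace_received(raw_message):
    """Walk Received headers top-down; stop at the first broken link."""
    msg = message_from_string(raw_message)
    hops = []
    expected_by = None
    for index, value in enumerate(msg.get_all("Received", [])):
        frm = FROM_RE.search(value)
        by = BY_RE.search(value)
        from_host = frm.group(1).lower() if frm else None
        by_host = by.group(1).lower() if by else None
        # After the topmost header, "by" must equal the previous "from".
        if index > 0 and by_host != expected_by:
            break
        hops.append((from_host, by_host))
        expected_by = from_host
    return hops


# Made-up example: the third Received line does not chain to the second.
sample = (
    "Received: from relay.isp.example ([203.0.113.7])\n"
    "\tby mx.home.example with ESMTP; Fri, 3 Aug 2007 10:00:00 -0400\n"
    "Received: from bot.dsl.example ([198.51.100.9])\n"
    "\tby relay.isp.example with SMTP; Fri, 3 Aug 2007 09:59:00 -0400\n"
    "Received: from mail.bank.example\n"
    "\tby mx.bank.example with SMTP; Fri, 3 Aug 2007 09:00:00 -0400\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

hops = trace_received(sample)
for from_host, by_host in hops:
    print(f"{by_host} received from {from_host}")
```

Even when the chain holds, only the first hop in the returned list was
written by your own server; the rest is still what other hosts claimed.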

You could try using SpamCop's parser to see how well it traces through
the Received headers but, again, it isn't perfect. Although the RFCs
suggest how Received headers should be composed, there are far too many
"RECOMMENDED" and "SUGGESTED" conditions within the RFCs to establish a
single format that all mail servers would follow (if there were one,
receiving mail servers could reject messages without properly
formatted Received headers).

If you want help on tracing through e-mail headers, try posting over at
the SpamCop newsgroups (you can even use their NNTP server at
news.spamcop.net). They may have FAQs to help you out, or you can
search Google for an article explaining how to trace through e-mail
headers. Even then, it takes practice and sometimes it just isn't worth
your time.

The whole e-mail scheme is antiquated, designed for a very small
community when the RFCs were created some 30 years ago. It was based on
a trusted model (i.e., senders were trusted primarily because the
community of e-mail users was pretty small). The trusted model doesn't
work anymore as it has long been abused. There have been attempts to
band-aid the trusted model with SPF (but spammers put fake SPF results
in the headers to try to fool recipients into thinking the message was
validated using SPF) and DomainKeys (designed by Yahoo and which is
flawed; see http://www.bluebottle.com/domainkeys-is-flawed.php).
Digital signing was an attempt to identify the sender, but then you can
get freemail certs from Thawte (now owned by Verisign) that do not
identify anyone and only identify the e-mail address to which the cert
is registered. Unfortunately there is strong resistance to generating
new RFCs that instead assume a non-trusted model for e-mail.

F. H. Muffman · Aug 3, 2007

hba2pd said:
Yes, I expect that someone might be stalking to me. In that case, is
there any way to tell whether IP address is faked or not?

Not easily. It would mean contacting the owner of the IP address and
verifying that the message did not pass through their system, which
isn't necessarily something that a user would be able to find out. In
addition to what Vanguard said, I'd consider handing the whole message,
including headers, to the appropriate authorities and letting them sort
it out.
