Writing Mails from Rust (1/3): Mail in general
A short view into some of the inner workings of mail.
This is the first part in a three part blog post about (e-)mails and how to
create, encode and send them using the mail crate (a
library).
- In this part I will introduce mails in general, list many of the standards involved in creating/sending a mail and give an introduction to how mails are structured internally.
- The next part will introduce the mail crate, its general structure and what it supports.
- The last part (coming soon) will, step by step, give an example where handlebars templates are used to generate mails based on some user input. It also includes sending the generated mails to an MSA over SMTP (both terms explained below).
How a mail is sent
At a high level there are six steps:
- A user creates the content they want to send.
  - Normally done with the user’s mail program, e.g. using some web rich text editor or Thunderbird.
- The content is converted to a mail.
  - Usually by the aforementioned mail program.
- The mail is sent to a Message Submission Agent (MSA).
  - The MSA is normally identified by a domain name, e.g., smtp.gmail.com.
  - Do not confuse this with the MX entry in the domain name system (DNS).
  - This likely will use the Simple Mail Transfer Protocol (SMTP).
  - The MSA might slightly modify the mail, such as adding some signature to make it verifiable that the mail is actually sent by you.
- The mail is transferred to the receiver's Mail Exchanger (MX) which passes it to a Message Delivery Agent (MDA).
  - This might be done by a separate Mail Transfer Agent (MTA) which gets the mail from the MSA you sent it to.
  - The transfer from MTA to MX might also be done by SMTP, although there are other protocols.
  - Note: when sending a mail between two mailboxes of the same provider, such as from a@1aim.com to b@1aim.com, this step may not happen in the classical sense.
  - In the past the mail might have gone through multiple hops, but that isn't really a thing anymore so it's not covered here.
- The user retrieves the mail from the MDA.
  - This is often done by IMAP or POP3 if you use a mail program.
- The mail program displays the mail.
  - This is anything but simple, as many parts of mail are underspecified as to how exactly to display certain parts or even just how to interpret them semantically.
The mail crate is mainly focused on creating and encoding mails so that they
can then be sent, though it also provides bindings to new-tokio-smtp to make
it simple to send mails to an MSA. In the future it might also support
parsing mails, but functionality such as displaying them or retrieving them
from an MDA using IMAP/POP3 is outside of the scope of the crate.
The mail standard(s)
TL;DR: There are many interconnected standards, standards replacing standards and standards updating standards making it easy to overlook some parts or misinterpret others.
Mail is pretty old and there are a large number of standards which have to be considered when implementing a program to create and send mails. Many of the standards also have one or multiple new standards which "replace"/obsolete the previous standard.
For example, mail was first specified by the IETF in RFC 822, which itself actually replaces RFC 733, a standard for some pre-mail text messages. The problem with RFC 822 (and IMHO many mail related standards) is that many parts were either rather vague or allowed many more possibilities than originally intended. For example, RFC 822 allows using control characters like XON and XOFF in text in (some) headers. Many of the standards which then replaced or obsoleted RFC 822 clarified and further restricted such parts by deprecating them, i.e. the grammars for mail are now split into a "normal" part and an "obsolete" part, which you still have to be able to parse for backward compatibility but should never generate.
While often it's enough to use the latest standard in a chain of standards obsoleting each other, it's often not as simple as there are many other standards extending the existing standards. These standards always refer to the standard which had been the "newest" when they were released, but were not updated when that standard had been obsoleted and replaced by a newer one. Furthermore they still apply to the newer standard, which is always meant to be backwards compatible with the older standard (with respect to mail at least). This might require you to trace some feature or grammatical construct through the standards to find out what it actually means in context of the newest standard.
Below is a list of many relevant standards for creating mails. Note that for most RFCs in the list there are one or more additional standards updating them. This is especially true for the MIME (Multipurpose Internet Mail Extensions) related RFCs:
| RFC | Description |
|---|---|
| 5322 | Internet Message Format (aka. mail) |
| 6532 | Internationalized Mail Headers |
| 2045 | MIME Part One: Format of Internet Message Bodies |
| 2046 | MIME Part Two: Media Types |
| 2047 | MIME Part Three: Message Header Extensions for Non-ASCII Text |
| 4289 | MIME Part Four: Registration Procedures |
| 6838 | Media Type Specifications and Registration Procedures |
| 2049 | MIME Part Five: Conformance Criteria and Examples |
| 2183 | Extends MIME, adds the Content-Disposition header |
| 2231 | Extends MIME, adds encoding and continuations for parameter values (and language tags for encoded words) to support non-US-ASCII text in parameters |
The next table is a list of some RFCs related to sending mails using the Simple
Mail Transfer Protocol (SMTP).
| RFC | Description |
|---|---|
| 5321 | Simple Mail Transfer Protocol |
| 6531 | SMTP Extension for Internationalized Mails |
| 6152 | SMTP extension, adding 8BITMIME |
| 3207 | SMTP extension, adds transport layer encryption with the STARTTLS command |
A simple mail
A mail based only on the mail standard consists of a number of headers followed
by a blank line followed by a single body. Both headers and the body can only
contain US-ASCII characters (i.e., 7-bit ASCII) and have a soft line length
limit of 78 characters and a hard line length limit of 998 characters excluding
the end of line sequence which is specified to be CRLF ("\r\n"). So neither
attachments, HTML mails, embedded images nor non US-ASCII characters are part
of the core standard.
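The constraints above can be checked mechanically. The following is a minimal sketch, assuming the mail is already available as one string with CRLF line endings. The names `LineIssue` and `check_lines` are made up for this illustration and are not part of the mail crate.

```rust
/// Problems a line can have under the core mail format rules.
/// (Illustrative type, not taken from the mail crate.)
#[derive(Debug, PartialEq)]
pub enum LineIssue {
    /// The line contains a character outside of 7-bit US-ASCII.
    NonAscii { line: usize },
    /// The line exceeds the hard limit of 998 characters.
    TooLong { line: usize, len: usize },
    /// The line exceeds the soft limit of 78 characters.
    OverSoftLimit { line: usize, len: usize },
}

pub const SOFT_LIMIT: usize = 78;
pub const HARD_LIMIT: usize = 998;

/// Checks every CRLF-separated line of `mail`. Line numbers start at 1
/// and lengths exclude the CRLF itself.
pub fn check_lines(mail: &str) -> Vec<LineIssue> {
    let mut issues = Vec::new();
    for (idx, line) in mail.split("\r\n").enumerate() {
        let line_no = idx + 1;
        if !line.is_ascii() {
            issues.push(LineIssue::NonAscii { line: line_no });
        }
        let len = line.len();
        if len > HARD_LIMIT {
            issues.push(LineIssue::TooLong { line: line_no, len });
        } else if len > SOFT_LIMIT {
            issues.push(LineIssue::OverSoftLimit { line: line_no, len });
        }
    }
    issues
}
```

A mail like "Subject: hi\r\n\r\nHello\r\n" produces no issues, while a 109 character Subject line is reported as being over the soft limit.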
Most of these features get added through the MIME standard(s), which add
support for having multiple mail bodies (or whole mails) inside the main mail
body. Additionally, MIME adds headers like Content-Type, which allows specifying
what kind of data the body contains (e.g., text/html), and
Content-Transfer-Encoding, which allows encoding the bodies with
base64/quoted-printable so that arbitrary non-US-ASCII
data, e.g. images, can be included. MIME also defines encoded words, which can be
used to have non-US-ASCII characters in mail headers, e.g., in the Subject
header. Lastly, there are the standards around internationalisation, which allow
you to directly use UTF-8 in most places but are not necessarily supported
by the receiver's mail provider.
The problem with this is that you often have many ways how to handle certain
things (like UTF-8 in headers) but all of them tend to have some drawback. For
example with SMTP servers normally supporting the 8BITMIME extension you
can directly use non-US-ASCII bodies without needing transfer encoding, but
as the line length limit still applies and lines are still broken with \r\n
this doesn't really work for binary data and can even be a problem for UTF-8
data if your mail program cannot simply insert \r\n. E.g. if you have UTF-8
encoded JSON data you can only insert \r\n between fields and in some
languages it's not trivial to detect word boundaries. This leaves a situation
where it's almost always better to use base64. The quoted-printable
encoding can also be a good option, but is a bad idea if the sent text is not
mostly US-ASCII characters, as it can triple the length of a mail body
in the worst case, which would be the common case for someone writing in,
for example, Arabic script!
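To put rough numbers on that trade-off, here is a small sketch which only computes the encoded lengths instead of doing the actual encoding. It is a simplification: soft line breaks and the special rules for line endings and trailing whitespace in quoted-printable are ignored, and the function names are made up for this example.

```rust
/// Length of `data` after base64 encoding (with padding, without line breaks).
pub fn base64_len(data: &[u8]) -> usize {
    (data.len() + 2) / 3 * 4
}

/// Approximate length of `data` after quoted-printable encoding.
/// Soft line breaks and the rules for trailing whitespace are ignored.
pub fn quoted_printable_len(data: &[u8]) -> usize {
    data.iter()
        .map(|&byte| match byte {
            // Tab, space and printable US-ASCII except '=' stay as they are.
            b'\t' | b' '..=b'<' | b'>'..=b'~' => 1,
            // Everything else is written as "=XX".
            _ => 3,
        })
        .sum()
}
```

For the ASCII text "hello" quoted-printable needs 5 characters and base64 needs 8. For the Arabic word "مرحبا" (10 bytes in UTF-8) quoted-printable needs 30 characters while base64 needs only 16.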
Below is a simple example mail, followed by an explanation of it:
    MIME-Version: 1.0
    From: <person1@example.com>
    To: Person Two <person2@example.com>
    Subject: Happy New Year =?UTF-8?B?8J+OiQ==?=
    Reply-To: No Reply <no-reply@example.com>
    Date: Tue, 8 Jan 2019 16:26:50 +0000
    Message-Id: <worldunique1@example.com>
    Content-Type: multipart/mixed; boundary="=_^0"

    --=_^0
    Content-Id: <worldunique2@example.com>
    Content-Type: text/plain; charset=utf-8

    Hy there, it's the image.

    --=_^0
    Content-Disposition: attachment; filename=the-image.png;
      modification-date="Mon, 7 Jan 2019 15:14:16 +0000";
      read-date="Tue, 8 Jan 2019 16:26:40 +0000"
    Content-Id: <worldunique3@example.com>
    Content-Transfer-Encoding: base64
    Content-Type: image/png

    TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2NpbmcgZWxpdC4gRn
    [...]
    VzY2UgdmFyaXVzIGxvYm9ydGlzIGludGVyZHVtLiBEb25lYyB1bHRyaWNpZXMgc2VtcGVyIGxlY3R1

    --=_^0--
- MIME-Version determines the MIME version; there is, and will only ever be, 1.0.
- From states from whom the mail is. If there is more than one mailbox in that header, the Sender header must be given, too.
- To states to whom the mail is sent. This can be one or multiple mailboxes.
- Subject is the subject line.
  - =?UTF-8?B?8J+OiQ==?= is an encoded word in the subject header. It uses base64 to encode the UTF-8 encoded Unicode emoji 🎉. The problem with encoded words is that it is never clear whether something is an encoded word or the user just "happened" to type =?UTF-8?B?8J+OiQ==?=, as text which merely looks like an encoded word is not required to be encoded as an encoded word again to prevent confusion. The reason for this is that encoded words were later added "on top" of "normal" text. This also means that if there is an error and, for example, only =?UTF-8?B?8J+OiQ== is generated, clients are specified to treat it as literal text rather than as an invalid encoded word. Encoded words have some additional drawbacks, e.g. they have a maximum length (75 characters), which might require splitting text into multiple encoded words, and they theoretically support "any" standardised encoding. There are also potential problems around handling cases where the split between two encoded words falls between two bytes of the same UTF-8 code point, and some other small stuff. An interesting feature is that you can specify a language in addition to the encoding, which can be a large benefit for screen readers.
- Reply-To states to which mailbox replies should be sent.
- Date is the date when the mail was created/sent out.
- Message-Id is a world unique ID. Any message in the world should have a different ID.
  - Creating world unique IDs can be tricky. The starting point is that they end with @some.domain, where you would normally use a domain you control so you can be sure that only you create IDs for this domain. The rest is often a large random ID, or a hash of a timestamp or similar. The problem is that if two mails (in your inbox) have the same ID, this can cause havoc. It's also used to determine which mail a response belongs to.
- Content-Type states what kind of data it is (see below).
  - multipart/mixed specifies that it contains a number of bodies with different content. Here the first body contains the actual mail and the second one contains an image (but we replaced the base64 encoded content with some text, which is wrong but readable).
  - boundary="=_^0" specifies the boundary (here =_^0) which is used to separate all bodies in the multipart body. A line like --=_^0 indicates the start of a new body, while a line like --=_^0-- indicates the end of the last body. It's important that a body in the multipart body does not contain any sequence like --=_^0. Verifying this can be skipped if only bodies are used which are either quoted-printable or base64 transfer encoded, as neither of these encodings can produce the character sequence =_^. In a real example the boundary would likely be longer than just =_^ followed by a 0.
- Content-Disposition indicates to the mail program how it should display the body. It also provides some additional metadata about the body, like a name (filename) or when the file was read (read-date).
  - As can be seen, it is possible to split a header line into multiple lines. The split can be done at many (but not all) places and the new line must start with either a space or a tab. A continuation line must also not consist of only spaces/tabs.
- Content-Id works like a message ID (it's supposed to be world unique) but is for a specific body in a mail. This can e.g. be used to refer to an image from an HTML mail to directly display it there (e.g. <img src="cid:mycid@example.com"></img>). Images referred to in this way should have a Content-Disposition: inline header and should be in the same multipart/related body as the HTML body (or a body containing the HTML body).
- Content-Transfer-Encoding: as the content of the body might be arbitrary binary data, this allows encoding the body before adding it to the mail. Typical encodings are base64 and quoted-printable.
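To make the encoded word in the Subject header less magical, here is a sketch which produces base64 ("B") encoded words using only the standard library. It assumes UTF-8 as charset and relies on the 75 character limit of RFC 2047, which leaves room for 45 bytes of input per encoded word. It only splits between characters, so no UTF-8 code point is torn apart. The function names are invented for this example and this is not how the mail crate necessarily does it.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Plain base64 with padding and without line breaks.
pub fn base64(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Splits `text` into as many "B" encoded words as needed.
pub fn encoded_words(text: &str) -> Vec<String> {
    // "=?UTF-8?B?" + "?=" take 12 of the 75 allowed characters, which
    // leaves 60 base64 characters, i.e. 45 input bytes, per encoded word.
    const MAX_BYTES: usize = 45;
    let mut words = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        if current.len() + ch.len_utf8() > MAX_BYTES {
            words.push(format!("=?UTF-8?B?{}?=", base64(current.as_bytes())));
            current.clear();
        }
        current.push(ch);
    }
    if !current.is_empty() {
        words.push(format!("=?UTF-8?B?{}?=", base64(current.as_bytes())));
    }
    words
}
```

Calling `encoded_words("🎉")` yields the single encoded word =?UTF-8?B?8J+OiQ==?= from the example mail. When more than one encoded word is returned they have to be separated by whitespace in the header, e.g. by putting each one on its own folded line.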
The mail above is relatively simple and contains no strange syntax. But a few
things should be taken note of: The From/To headers can contain multiple
mailboxes. Each consists of an optional display name and a mail address
surrounded by <>. The display name is a phrase which consists of one or
more words which might be whole quoted strings, or just normal words.
Additionally, if it's not an internationalised mail and the display name
contains a non-US-ASCII character, that character (or the
whole word it's in) needs to be put into an encoded word. Furthermore, like most headers,
mailboxes allow placing comments in many places, and comments can themselves
contain comments.
Both of the following headers are semantically the same:
    From: Max Musterman <maxmusterman@example.com>

    From: "Max"(export("dog")) =?UTF-8?Q?Musterman?= <maxmusterman(foobar)@example.com>
In the second variant Max was quoted (unnecessarily), then a comment was added
which contained another comment, followed by an encoded word which uses the
quoted-printable-like Q encoding for the UTF-8 text Musterman. Lastly, another
comment (foobar) is added into the mail address.
While these parts of the grammar are less relevant for generating mails, as there is normally no need to produce such mails, some parts like quoting and encoding do need to be handled. The user will normally provide the display name as a simple UTF-8 string, and the library has to check whether it needs to quote or encode words and, if so, which ranges of characters from the input it should quote/encode. E.g. encoding a whole word should be preferred over encoding a single character. On the other hand, it should still be displayed as one text in the end, so it doesn't necessarily matter.
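The decision described above can be sketched as follows. This is a simplified illustration and not the algorithm used by the mail crate: it only decides between "write as is", "write as quoted string" and "needs encoded words" for a non-internationalised mail, and the names `DisplayName` and `classify_display_name` are hypothetical.

```rust
/// How a display name can be written in a non-internationalised mail.
#[derive(Debug, PartialEq)]
pub enum DisplayName {
    /// All words are plain atoms, the name can be used as is.
    Plain(String),
    /// The name had to be turned into a quoted string (already escaped).
    Quoted(String),
    /// The name contains non-US-ASCII or control characters and has to be
    /// put into one or more encoded words.
    NeedsEncodedWord,
}

/// `atext` from RFC 5322: characters allowed in an unquoted word.
fn is_atext(c: char) -> bool {
    c.is_ascii_alphanumeric() || "!#$%&'*+-/=?^_`{|}~".contains(c)
}

pub fn classify_display_name(name: &str) -> DisplayName {
    if !name.is_ascii() || name.chars().any(|c| c.is_ascii_control()) {
        return DisplayName::NeedsEncodedWord;
    }
    // A phrase of one or more atoms separated by single spaces needs no quoting.
    if name.split(' ').all(|word| !word.is_empty() && word.chars().all(is_atext)) {
        return DisplayName::Plain(name.to_owned());
    }
    // Otherwise quote the whole name, escaping '"' and '\'.
    let mut quoted = String::from("\"");
    for c in name.chars() {
        if c == '"' || c == '\\' {
            quoted.push('\\');
        }
        quoted.push(c);
    }
    quoted.push('"');
    DisplayName::Quoted(quoted)
}
```

With this, "Max Musterman" stays as it is, "Musterman, Max" becomes a quoted string because of the comma, and "Max Müller" is reported as needing encoded words. Note that a plain word which happens to look like an encoded word is passed through unchanged, which is exactly the ambiguity mentioned earlier.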
Media Types
Media types, also sometimes known as MIME types, are the things appearing as
values in Content-Type headers of a MIME body in a mail, or even in HTTP. It
just happens that while the specifications of media types for mail and HTTP are
mostly the same, they have some small differences.
Common media types include text/plain; charset=utf-8 or image/png. They
specify what kind of data it is and some parameters which need to be known to
display the data (e.g. which encoding (charset) is used for text).
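To show the parts a media type consists of, here is a deliberately naive parser sketch. It assumes well-behaved input: it does not handle semicolons inside quoted parameter values, comments, or any of the parameter encodings discussed further down. `MediaType` and `parse_media_type` are names made up for this example.

```rust
/// A media type split into its parts, e.g. `text/plain; charset=utf-8`.
#[derive(Debug, PartialEq)]
pub struct MediaType {
    pub type_: String,
    pub subtype: String,
    /// Parameters in the order they appeared, names lowercased.
    pub params: Vec<(String, String)>,
}

/// Naive parser: splits on ';' and '=' without understanding quoting rules.
pub fn parse_media_type(input: &str) -> Option<MediaType> {
    let mut parts = input.split(';');
    let (type_, subtype) = parts.next()?.trim().split_once('/')?;
    if type_.is_empty() || subtype.is_empty() {
        return None;
    }
    let mut params = Vec::new();
    for part in parts {
        let part = part.trim();
        if part.is_empty() {
            continue;
        }
        // Only split at the first '=', the value itself may contain more.
        let (name, value) = part.split_once('=')?;
        let value = value.trim().trim_matches('"');
        params.push((name.trim().to_ascii_lowercase(), value.to_owned()));
    }
    Some(MediaType {
        type_: type_.to_ascii_lowercase(),
        subtype: subtype.to_ascii_lowercase(),
        params,
    })
}
```

Parsing "text/plain; charset=utf-8" gives the type text, the subtype plain and one parameter (charset, utf-8). The boundary parameter of the example mail, boundary="=_^0", is parsed to the value =_^0 even though it contains a '='.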
Here it should be noted that these media types just specify what kind of data
it is but not how to handle it. E.g. text/html; charset=utf-8 defines that it
is an HTML document but it doesn't mean that your mail program will display it
as such, or if it does, that it will display it correctly or support all
possible HTML elements. Technically, it's possible to send a complete
website/web-game including non-inline JS and CSS with a mail, but it likely
won't be displayed "correctly" by your mail program. Some of these restrictions
are purely technical, others are there to prevent social engineering, protect from
viruses or protect your privacy.
RFC 6838 does a good job of specifying a fairly constrained grammar for media types and how certain parts of it should be interpreted semantically. Sadly, this RFC is useless when writing your programs, as it only specifies what newly registered media types should comply with. Even registered media types don't necessarily comply with the semantic constraints, and a program used "in the wild" may have to handle non-registered media types. RFC 6838 also doesn't limit the problems/annoyances that some of the things which can be done with media type parameters in mail can cause.
For example, RFC 2231 extends media types to allow UTF-8 text in parameter values outside of internationalised mails (by percent encoding them), but it also adds a way to split any parameter into multiple parts, which can also be encoded but do not all need to be encoded but if ... Let's cut it short here: it's basically a mess made to be able to comply with the line length limit for long parameters like file names, and it became worse through a likely unintended interaction with RFC 6532 (Internationalized Mail Headers).
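As an illustration of what RFC 2231 style parameters look like, here is a sketch which percent encodes a UTF-8 value and, if it is too long, splits it into numbered sections. It is a simplified take under some assumptions: the charset is always utf-8, no language is given, every section is marked as encoded, and a slightly smaller set of characters than allowed is left unencoded. The function names are hypothetical.

```rust
/// Characters which may appear unencoded (a conservative subset of the
/// `attribute-char` rule of RFC 2231).
fn is_attribute_char(byte: u8) -> bool {
    byte.is_ascii_alphanumeric() || b"!#$&+-.^_`|~".contains(&byte)
}

/// Encodes `value` for the parameter `name`, using at most `max_len`
/// characters of encoded value per section.
pub fn encode_param(name: &str, value: &str, max_len: usize) -> Vec<String> {
    // Every byte becomes either itself or a "%XX" triplet. Keeping them as
    // units makes sure a triplet is never split between two sections.
    let units: Vec<String> = value
        .bytes()
        .map(|b| {
            if is_attribute_char(b) {
                (b as char).to_string()
            } else {
                format!("%{:02X}", b)
            }
        })
        .collect();

    let total: usize = units.iter().map(|unit| unit.len()).sum();
    if total <= max_len {
        return vec![format!("{}*=utf-8''{}", name, units.concat())];
    }

    let mut sections = Vec::new();
    let mut current = String::new();
    for unit in units {
        if !current.is_empty() && current.len() + unit.len() > max_len {
            sections.push(std::mem::take(&mut current));
        }
        current.push_str(&unit);
    }
    if !current.is_empty() {
        sections.push(current);
    }

    sections
        .iter()
        .enumerate()
        .map(|(idx, section)| {
            // Charset and (empty) language are only given in the first section.
            let prefix = if idx == 0 { "utf-8''" } else { "" };
            format!("{}*{}*={}{}", name, idx, prefix, section)
        })
        .collect()
}
```

For the file name "€ rates.pdf" this yields filename*=utf-8''%E2%82%AC%20rates.pdf if it fits, and the sections filename*0*=utf-8''%E2%82%AC, filename*1*=%20rates.p and filename*2*=df if at most 10 characters are allowed per section. The sections are then written as separate parameters separated by semicolons.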
Multipart bodies: Attachments & Embeddings
As mentioned before, MIME allows placing multiple bodies inside the body of a
mail. Each of these bodies (including the "container" bodies) has its own
headers. The container bodies mainly differ in that their Content-Type is a
media type which starts with multipart/ and has a boundary parameter.
As a body inside a body can further contain additional bodies in a recursive
manner this creates some form of tree of bodies. By combining different kinds
of multipart media types different effects can be achieved like e.g.
attachments.
Commonly used multipart media types are:
- multipart/mixed: Basically says that it contains "mixed" content. It is mainly used as the outermost body, where the first body in it contains the actual mail and all others contain attachments. In the past it was also used for a setup like a text body followed by an image body followed by a text body to embed images. But this is no longer needed as text/html bodies can refer to images via their content ID (see multipart/related).
- multipart/related: States that all of its bodies are related and are normally displayed as "one" thing. For example, when embedding an image into an HTML text you would put the HTML text as the first body in the multipart/related body and the image as the second one. Then you can give the image a content ID and refer to it from the HTML body through the cid scheme, e.g. src="cid:rvr00rw@1aim.com".
- multipart/alternative: Has multiple bodies which are semantically the same but represented in different ways. A common usage is to have both a text/plain and a text/html representation of the mail, so that if for some reason the text/html representation isn't displayed correctly the user can still access the text/plain variant. Note that the last body in multipart/alternative is the one which should be displayed with the highest priority (e.g. text/html).
So if you want to create a mail which contains an embedded image, HTML text, alternate plain text and a PDF file as attachment you would have a MIME tree roughly like:
    multipart/mixed
    ╠═multipart/alternative
    ║ ╠═text/plain
    ║ ╚═multipart/related
    ║   ╠═text/html
    ║   ╚═image/png
    ╚═application/pdf
Note that you should set the Content-Disposition: attachment header in the
application/pdf body and the Content-Disposition: inline in the image/png
body. Content-Disposition was added with RFC 2183 and helps the mail program to
display the mail in the way it was intended to be displayed. It additionally
supports a number of parameters, including filename and read-date, which are
especially useful for attachments. Ironically, the grammar of the parameters is
defined to be "the same as media type parameters", which means they can have
all the annoyances like inconsistent UTF-8 encoding.
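The tree of bodies maps nicely onto a recursive data type. The following sketch models the example tree from above (using the registered name multipart/alternative) including the Content-Disposition of the leaf bodies and prints it as an indented outline. The `Body` type is invented for this illustration and is not the representation used by the mail crate.

```rust
/// A body is either a leaf with actual content or a multipart container.
pub enum Body {
    Leaf { media_type: String, disposition: Option<String> },
    Multipart { kind: String, parts: Vec<Body> },
}

impl Body {
    pub fn leaf(media_type: &str, disposition: Option<&str>) -> Body {
        Body::Leaf {
            media_type: media_type.to_owned(),
            disposition: disposition.map(str::to_owned),
        }
    }

    pub fn multipart(kind: &str, parts: Vec<Body>) -> Body {
        Body::Multipart { kind: kind.to_owned(), parts }
    }

    /// Renders the tree with two spaces of indentation per level.
    pub fn outline(&self) -> String {
        let mut out = String::new();
        self.write_outline(0, &mut out);
        out
    }

    fn write_outline(&self, depth: usize, out: &mut String) {
        let indent = "  ".repeat(depth);
        match self {
            Body::Leaf { media_type, disposition } => {
                out.push_str(&indent);
                out.push_str(media_type);
                if let Some(disposition) = disposition {
                    out.push_str(&format!(" (Content-Disposition: {})", disposition));
                }
                out.push('\n');
            }
            Body::Multipart { kind, parts } => {
                out.push_str(&format!("{}multipart/{}\n", indent, kind));
                for part in parts {
                    part.write_outline(depth + 1, out);
                }
            }
        }
    }
}

/// The mail with embedded image, HTML text, plain text and PDF attachment.
pub fn example_mail() -> Body {
    Body::multipart("mixed", vec![
        Body::multipart("alternative", vec![
            Body::leaf("text/plain", None),
            Body::multipart("related", vec![
                Body::leaf("text/html", None),
                Body::leaf("image/png", Some("inline")),
            ]),
        ]),
        Body::leaf("application/pdf", Some("attachment")),
    ])
}
```

Calling `example_mail().outline()` gives the same structure as the tree above, one body per line, with the disposition noted behind the image and the PDF.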
While the above tree of bodies is the "common" way to handle embedded content
and attachments, there are few constraints in place about how bodies can be
combined. It is just that the more unusual the structure of the mail is, the
less likely it will be displayed in a "nice" way (there is no correct way
anyway). E.g. you could place the multipart/alternative as the outermost body
and then place the multipart/mixed where the text/plain is placed above, with
the text/plain and application/pdf bodies inside of it. This theoretically
would create a mail which has an attachment if you view it as plain text but
not if you view it as HTML, though it's unlikely to be displayed in this way by
any mail program.
SMTP
The Simple Mail Transfer Protocol (SMTP) is a common way to send a mail to a
server. I will not go deeply into SMTP but there are a few things which should
be noted:
- SMTP servers have a number of capabilities which describe if you can e.g. send internationalised mails or use 8BITMIME, etc.
- SMTP will start with a plain text transmission over TCP. Before doing anything like authenticating or sending mails you should send the STARTTLS command and make sure it did not fail. The new-tokio-smtp crate does handle that for you if you use the SMTP bindings provided with the mail crate.
- The sender and recipient used by SMTP are not coupled to the From/To/Sender fields in the mail! This means that someone could trivially pretend to be someone else, e.g. ceo@apple.com. There are methods like DKIM in place which allow Mail Exchange servers to detect the mismatch. Normally the MSA will add the necessary parts to any mail you send, so as long as you don't try to write an MSA but only send mails through an existing one you don't have to bother with this.
  - If no explicit SMTP sender/recipient are given then the mail crate will try to derive them from the From, Sender and To headers.
- Sending an internationalised mail requires setting some special parameters (SMTPUTF8) when sending the mail, which means the SMTP library has to be told whether it is sending a "classical" or an internationalised mail.
  - This allows the server to use different code for handling internationalised mails.
  - It also affects the mail addresses themselves, as internationalised mail addresses can have a non-US-ASCII UTF-8 local-part (username). While any non-ASCII domain name can be escaped with Punycode, there is no such thing for the local-part. As a side note: non-US-ASCII local-parts of mail addresses are the only thing which can exclusively be done by internationalised mails. All other parts can be achieved without them with some workarounds, e.g. encoded words.
- For SMTP the mail is just a blob of data. While the server you send it to will likely still parse the mail to do things like DKIM verification, spam detection, etc., the SMTP protocol itself doesn't really care.
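To illustrate how the capabilities and the SMTPUTF8 parameter play together, here is a sketch working on plain strings without any network code. It assumes the server answered the EHLO command with the usual multi-line 250 response. Both functions are hypothetical helpers written for this post and are not part of new-tokio-smtp or the mail crate.

```rust
/// Extracts the capability keywords from a multi-line EHLO response like
/// "250-mail.example.com\r\n250-8BITMIME\r\n250 SMTPUTF8\r\n".
pub fn parse_ehlo(response: &str) -> Vec<String> {
    response
        .lines()
        // The first line only contains the server's domain and greeting.
        .skip(1)
        .filter_map(|line| {
            // "250-" marks a line with more to follow, "250 " the last line.
            let rest = line
                .strip_prefix("250-")
                .or_else(|| line.strip_prefix("250 "))?;
            rest.split_whitespace()
                .next()
                .map(|keyword| keyword.to_ascii_uppercase())
        })
        .collect()
}

/// Builds the MAIL FROM command for the given SMTP sender. The sender is
/// passed explicitly as it is independent of the From header of the mail.
pub fn mail_from_command(
    sender: &str,
    capabilities: &[String],
    internationalised: bool,
) -> Result<String, String> {
    let mut command = format!("MAIL FROM:<{}>", sender);
    if internationalised {
        if !capabilities.iter().any(|cap| cap == "SMTPUTF8") {
            return Err("server does not support SMTPUTF8".to_owned());
        }
        // Tells the server that this is an internationalised mail.
        command.push_str(" SMTPUTF8");
    }
    Ok(command)
}
```

If the server lists SMTPUTF8, an internationalised mail from jörg@example.com results in the command MAIL FROM:<jörg@example.com> SMTPUTF8. If it doesn't, the only options are to fail or to fall back to a "classical" mail, which is not possible for addresses with a non-US-ASCII local-part.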