We've all had the old adage that email is insecure instilled in us, but is it really? After all, most websites that use encryption (HTTPS) aren't considered insecure, and most of us are perfectly content to do things like banking and filing taxes online. So, why is email different? Given that email has evolved in many of the same ways that other Internet technologies have over the years, I think this statement is worth picking apart in greater detail.
Everyone "knows" that email is insecure, and while this is certainly a heuristic, it's not necessarily a bad one. I'm guilty of indulging in it myself; whenever I've been consulted on IT policy regarding data transmission, I've vehemently argued against sending confidential data or documents by email, and for using secure file transfer protocols like SFTP or FTPS (yes, some vendors are still using this) instead. Occasionally, I've gotten questions about this: after all, email is convenient, and if it's just within the organization, then where's the insecurity in that? These are not unreasonable questions to ask. Having recently written my own SMTP, POP3, and IMAP servers, and developed a good understanding of all three protocols and their usage today, I've been seeing a disconnect between conventional wisdom and reality, and have been wondering the same kinds of things myself lately.
Email has been around a long time. Modern email dates back to the early 1970s, and many people were using email even before they had access to the Internet, thanks to BBSs and services like FidoNet that were more popular amongst consumers in the 1980s. Certainly, at the time, email was invariably insecure - no authentication, no encryption, and often no privacy - but then again, all Internet protocols at the time were like this. Most common Internet protocols started out this way, insecurely; after all, it was a different time then and security on the Internet has largely been an afterthought, tacked on later. HTTP, FTP, Telnet, SMTP, etc. are all good examples of this. Today, these protocols can all be encrypted (or there are other variants that are), but originally, that wasn't there.
Email in particular has some aspects that are unique to it. Email is designed to be a store and forward system, and back in the day, this was actually very common - emails would traverse many mail servers to get to their destination. Open relays were a particular challenge, as they allowed for mail to be relayed via third-party mail servers (which sounds like a disaster for authentication, and it was; fortunately, open relays are all but gone nowadays). But nowadays, email much less resembles the old store-and-forward model and much more a point-to-point model — most relaying that gets done is done either by the sender or the recipient's mail system, but the relaying is generally done within a single entity.
Let's quickly review the typical route that an email might take today. A user will send an email via SMTP from his or her mail client to the message submission agent on an outgoing email server (e.g. smtp.example.com). The message submission agent handles all the outgoing emails for a user. In the old days, it was more common to just connect to an SMTP server that was local to you at the time, but nowadays, you pretty much always connect to the SMTP server of your mail provider. This requires authentication and is usually encrypted either explicitly with STARTTLS (port 587) or implicitly using TLS (port 465) when the connection is set up; therefore, this is technically ESMTPSA (secure and authenticated enhanced SMTP).
(As an interesting aside, both explicit and implicit TLS remain in widespread use for message submission agents, and many mail providers support both. Implicit TLS on port 465 went without a formal standard for many years, but RFC 8314 now recommends it for message submission. It may be slightly more secure, since the connection is encrypted from the get-go and so isn't vulnerable to the connection downgrade attacks that are possible when a client on port 587 isn't configured to insist on STARTTLS and simply carries on unencrypted if it's not offered.)
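To make the two submission modes concrete, here is a minimal sketch using Python's standard smtplib. The helper names (tls_mode_for_port, open_submission) and the host smtp.example.com are my own illustrations, not part of any particular mail setup. The point is that on port 587 the client has to decide for itself what to do when STARTTLS is missing; this sketch assumes the strict policy of giving up rather than continuing in cleartext.

```python
import smtplib
import ssl


def tls_mode_for_port(port):
    """Map a submission port to the way TLS is negotiated on it."""
    if port == 465:
        return "implicit"   # TLS handshake happens before any SMTP is spoken
    if port == 587:
        return "starttls"   # plaintext greeting, then an upgrade via STARTTLS
    raise ValueError(f"not a submission port: {port}")


def open_submission(host, port, timeout=10):
    """Return an encrypted smtplib client, or raise if TLS cannot be set up."""
    context = ssl.create_default_context()  # verifies the server certificate
    if tls_mode_for_port(port) == "implicit":
        return smtplib.SMTP_SSL(host, port, timeout=timeout, context=context)

    client = smtplib.SMTP(host, port, timeout=timeout)
    client.ehlo()
    if not client.has_extn("starttls"):
        # Assumed policy: never fall back to cleartext. Refusing here is
        # what protects against someone stripping STARTTLS from the reply.
        client.close()
        raise RuntimeError("server did not offer STARTTLS; refusing to continue")
    client.starttls(context=context)
    client.ehlo()  # re-issue EHLO now that the channel is encrypted
    return client
```

With a real account, usage would look something like open_submission("smtp.example.com", 587), followed by login() and send_message() on the returned client.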
From there, the message submission agent will, after checking that you're authorized to send that message, hand it off to a mail transport agent (MTA) that will actually do the work of getting the message where it needs to go. If it's within the mail system, it might not even leave the server, but let's assume it's going to a different organization. The MTA will look up the MX records for the recipient's domain and try to connect to an SMTP server handling the incoming email for that domain. This is the leg of the journey that uses port 25 on the destination server. While this connection, like connections to port 587 of a message submission agent, is not initially encrypted, virtually all SMTP servers support encryption these days, and so assuming the receiving SMTP server offers encryption (indicated by STARTTLS in the EHLO response), the sending mail server will typically use STARTTLS to encrypt the connection before proceeding.
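The two decisions the sending MTA makes on this leg can be sketched as a pair of small Python functions. This is only an illustration under my own assumptions: the function names and hostnames are made up, the MX records are passed in as (preference, hostname) pairs rather than looked up in DNS, and the EHLO reply is passed in as text rather than read from a socket.

```python
def pick_mx(records):
    """Order (preference, hostname) MX records; lowest preference is tried first."""
    return [host for _, host in sorted(records)]


def offers_starttls(ehlo_reply):
    """True if a multi-line EHLO reply advertises the STARTTLS extension."""
    for line in ehlo_reply.splitlines():
        # Each reply line looks like "250-KEYWORD args" or "250 KEYWORD args".
        keyword = line[4:].split(" ", 1)[0].strip().upper()
        if line.startswith("250") and keyword == "STARTTLS":
            return True
    return False


# Hypothetical data for a recipient domain.
MX_RECORDS = [(20, "mx2.recipient.example"), (10, "mx1.recipient.example")]
EHLO_REPLY = (
    "250-mx1.recipient.example Hello\r\n"
    "250-SIZE 52428800\r\n"
    "250-STARTTLS\r\n"
    "250 8BITMIME"
)

print(pick_mx(MX_RECORDS))          # servers in the order they would be tried
print(offers_starttls(EHLO_REPLY))  # whether the upgrade to TLS is possible
```

What the sender does when offers_starttls comes back false is a policy choice that this sketch deliberately leaves out.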
From here, the recipient's mail transport agent hands off the message to a local delivery agent, which may potentially use yet another variant of SMTP known as LMTP (used within a provider only). There may even be many other steps that happen here, such as spam filtering, scanning for malicious attachments, etc. But eventually, the local delivery agent will hand off the message to the actual email store. From here, the recipient can access the message using either the older POP3 protocol or the more modern IMAP protocol. Again, these are virtually always encrypted these days, so the message is really downloaded using POP3S or IMAPS.
Two comments about this. First of all, the above typically happens in just a few seconds today — that is, email is delivered end to end relatively quickly. This isn't guaranteed, of course; the SMTP protocol itself is inherently asynchronous, which is why messages can sometimes show up much later than they were sent. But in a good majority of cases, assuming no time-intensive scanning is being done on the receiver's side, then this process may very well play out in realtime.
Secondly, in the above example, every leg of the connection was encrypted. There are really three parts to the route the message took: from the sender to the sender's SMTP server, from the sender's SMTP server to the recipient's SMTP server, and from the recipient's POP3/IMAP server to the recipient's mail client. These connections are almost always encrypted these days. So, in that case, how is this insecure?
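If you want to check this for a message you've actually received, the Received headers usually record which protocol variant each server-to-server hop used. Here is a rough sketch in Python; the function name, the sample message, and all hostnames in it are made up for illustration. It assumes each server wrote an honest "with PROTOCOL" clause, and it can't say anything about the final POP3/IMAP download, which leaves no Received header.

```python
import re
from email import message_from_string

# "with" keywords that indicate the hop ran over TLS (the S in ESMTPS).
TLS_PROTOCOLS = {"ESMTPS", "ESMTPSA", "LMTPS", "LMTPSA"}

# Hypothetical message; hosts, IDs, and dates are invented for illustration.
SAMPLE = """\
Received: from mx.recipient.example by store.recipient.example with LMTP id 3;
\tMon, 1 Jan 2024 00:00:03 +0000
Received: from out.sender.example by mx.recipient.example with ESMTPS id 2;
\tMon, 1 Jan 2024 00:00:02 +0000
Received: from laptop.local by out.sender.example with ESMTPSA id 1;
\tMon, 1 Jan 2024 00:00:01 +0000
From: alice@sender.example
To: bob@recipient.example
Subject: hi

Hello
"""


def trace_hops(raw_message):
    """Return (protocol, used_tls) per hop, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    # Servers prepend their header, so the list is newest-first; reverse it.
    for header in reversed(msg.get_all("Received", [])):
        # Drop parenthesised comments such as "(using TLSv1.3 with cipher ...)".
        header = re.sub(r"\([^()]*\)", " ", header)
        match = re.search(r"\bwith\s+([A-Za-z0-9]+)", header)
        protocol = match.group(1).upper() if match else "UNKNOWN"
        hops.append((protocol, protocol in TLS_PROTOCOLS))
    return hops


for protocol, used_tls in trace_hops(SAMPLE):
    print(protocol, "encrypted" if used_tls else "not marked as encrypted")
```

In the sample, the only hop not marked as encrypted is the internal LMTP handoff, which never leaves the recipient's provider.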
One thing to consider is what is actually encrypted here. The messages themselves are not encrypted; only the transport used to route messages is encrypted. The messages may or may not be encrypted at rest on the mail servers. This is why technologies like PGP exist, to allow people to encrypt the message body itself. But PGP is arguably inconvenient and comes with many problems, which is why basically nobody uses it. S/MIME is the other common protocol encountered in email encryption, but free S/MIME certificates for individuals are hard to come by, which is part of why it is virtually unused outside of large corporate organizations. While I will acknowledge that these protocols do exist, and that some people do use them, since they have, for all practical purposes, gained almost no traction in email (and likely never will) we will not concern ourselves with them much here.
It is useful to make comparisons to other Internet protocols when considering the security of email; for example, websites using HTTPS are a good comparison. When we say a website is encrypted, we mean the connection (the transport) to it is encrypted using TLS. Applying that to email, it's pretty clear that email is, in fact, encrypted in transport, almost universally, so it can't be inherently insecure for that reason. This alone provides some merit to suspicion of claims that email is insecure. Either such claims rest on an obsolete understanding of how email transport works, or there must be other factors to consider. In many cases, it is probably both, but let us consider the latter here.
Even when the transport is encrypted, email may very well still be insecure. It will help to be more specific about what we mean when we refer to security here. There are several different aspects to security in email, and most Internet protocols in general. Broadly, we will consider three of them: encryption, authentication, and privacy.
Encryption is what people usually mean when talking about the security of Internet protocols. (Email may be an exception to that, as everyone insists it's insecure, even though email is almost always encrypted in transit.) Specifically, we'll consider the encryption of data in transit across the Internet here, since that's what's relevant when discussing encryption in the context of Internet protocols. Nowadays, most Internet traffic is encrypted; indeed, RFC 8314 goes so far as to carry the title "Cleartext Considered Obsolete". That may not be entirely true in general, but cleartext email sent over the Internet is basically obsolete.
Encryption protects the confidentiality of a message in transit. It ensures that anyone monitoring the wire traffic won't be able to decipher the actual message being sent. Broadly speaking, it prevents third parties to the connection from reading the message. I say third party to the connection to mean someone that isn't the sender or receiver of the TLS connection that's been set up on top of the TCP connection; third parties could potentially still gain access to the message by other means, but we'll get to that in a moment.
Authentication, on the other hand, is about ensuring that the message is from who it really purports to be from. If somebody spoofs a message from the President and sends it to you, that message may be encrypted, but it is not authenticated. Today, we have several mail technologies, such as SPF, DKIM, and DMARC, all designed to help address exactly this problem. They're not perfect, but they work pretty well, particularly for domains that have deployed all three. (To be clear, these protocols don't prevent spoofing in any way, but they allow the receiver to reliably detect that a message has been spoofed.)
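On the receiving side, the outcome of these checks is commonly recorded in an Authentication-Results header, which a client or filter can inspect. The following Python sketch shows the idea; the function names and the sample header values are hypothetical. It assumes the header was added by your own receiving server, since a copy injected by the sender proves nothing.

```python
import re


def auth_verdicts(header_value):
    """Extract spf/dkim/dmarc results from an Authentication-Results value."""
    verdicts = {}
    pairs = re.findall(r"\b(spf|dkim|dmarc)=([a-z]+)", header_value.lower())
    for method, result in pairs:
        verdicts.setdefault(method, result)  # keep the first result per method
    return verdicts


def passes_dmarc(header_value):
    """True only if the receiving server recorded dmarc=pass."""
    return auth_verdicts(header_value).get("dmarc") == "pass"


# Hypothetical header values, as a receiving server might record them.
GENUINE = (
    "mx.recipient.example; spf=pass smtp.mailfrom=sender.example; "
    "dkim=pass header.d=sender.example; dmarc=pass header.from=sender.example"
)
SPOOFED = (
    "mx.recipient.example; spf=fail smtp.mailfrom=evil.example; "
    "dkim=none; dmarc=fail header.from=president.example"
)

print(auth_verdicts(GENUINE))
print(passes_dmarc(SPOOFED))
```

Note that the spoofed message still arrived; the checks only gave the receiver grounds to reject or flag it.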
Finally, we have privacy. A message may be encrypted and authenticated (and most legitimate emails these days are both, in fact), but it may not be fully private. There are actually several reasons why this could be the case. Let's consider a couple.
Emails may not be private because access to the messages themselves is not restricted to the parties of the message. For example, most companies will have policies stating that all company email is the property of the company, that employees should not have any expectation of privacy in email communication, and that employers may monitor employee email at will for their own purposes. As an extreme example, I once worked at a small company with its own mail system, where everyone had access to everyone's mailbox, with no ACL system to speak of whatsoever. (Same thing with home directories, so the culture was one of complete and total openness.) The actual email may well have been secure, but it certainly was not private, given any employee could read it. But there's a key distinction here. It may have been reasonable to send confidential data to the company's email system, even from outside the company. The email would have been immune to interception by a third party in transit, but it would not be private to the recipient once delivered.
Many people use the "Big Tech" providers for their email these days — companies like Microsoft, Google, Yahoo, etc. If you do not run your own mail server, your email is inherently not private. This isn't to say that Google's employees are reading your email during their lunch breaks, but you don't have any visibility into what may or may not be done with it, and you have no guarantee that it's really truly private. For example, Google may index the emails of its Gmail users to allow them to search their emails quickly, which is probably a reasonable thing for them to do. However, you may not want Google indexing, say, Social Security numbers. Such actions may feel as if they have invaded a user's privacy. Contrast this with indexing done on a private mail server; you can control the indexing, and you know where the index lives, if you want to remove it.
Finally, email has a tendency to, shall we say, make a mess by getting "stuck" everywhere. Most people have multiple clients, so even if emails are received securely, they may be downloaded to a variety of devices which may or may not be adequately protected. People may archive their emails to external hard drives and then forget about them. After all, emails are basically just files, so they suffer from all the same data protection issues that regular files do. Thus, compared to other protocols, the number of attack vectors for the email messages themselves is huge — you don't need to compromise SMTP to gain access to messages; you don't even need to compromise POP3 or IMAP, you only need to compromise any part of the entire system that has access to messages.
There are other tangential issues we could consider as well: email is often subject to data retention requirements, for example, which further increases its stickiness if an organization is obligated to make backups of everyone's messages and retain them for long periods of time. Emails can be forwarded to anyone, including parties the sender may never have intended the message to reach. Even though email is a somewhat malleable medium (you can detach or delete attachments from your messages, for instance), many people now use webmail, which is quite primitive and doesn't support this, so sensitive attachments may linger in mailboxes forever. The list goes on, but I think these touch on several of the major reasons that email may not be as "private" as we would always like it to be. I'm sure you can think of many other reasons as well.
All this said, is email "secure"? I think the right answer is twofold: it depends on your notion of security (what's included in it and what's not) as well as your exact email architecture. There are obviously innumerable outside factors that could undermine the security of a system, so it's impossible to take them all into account in any analysis. But given that email encryption and authentication are standard nowadays, these seem like two strikes against the argument that email is insecure. Whether it is or not will depend on the other factors. For example, I have email accounts on a variety of servers, ranging from my own private mail server (with the emphasis on private), to the Big Tech providers that most people use for their email, to a foreign provider that probably mines and uses my messages in ways I'd rather not contemplate. Which accounts I use for sending or receiving a message strongly influences the level of security that I ascribe to it. Indeed, I would consider emails processed through the last of these systems to be "insecure", since I really have no guarantee that any of that data is truly private. However, in the case of my own mail server, I feel quite strongly that at least my part of the equation is secure. But email messages are a tale of two servers and (at least) two people, the sender and the recipient, so if the other party is using a sketchy mail server, then it doesn't matter how private my mail server is: messages can always leak from the other server, not to mention from any of the users themselves.
Like most other systems, email is only as secure as the weakest link in the entire chain, and perhaps the reason that email is commonly considered "insecure" is that the chain has so many links in it, of varying quality, and rarely can a single entity ever have full control over the quality of every link. For this reason, unless you truly understand all of the servers that may be involved in an email communication, it's best to err on the side of caution and assume that emails are not guaranteed to be private, and so may not be secure enough for certain communications needs.
Returning to the original question, is email insecure? I think the best answer is "not necessarily", as email can be secure, at least according to the metrics reviewed above. A good example is two individuals, say Alice and Bob, who both have private mail servers that they control. Suppose that Alice sends a message from her client to her mail server, which relays it to Bob's mail server, and then Bob retrieves the message. Assume all the connections are encrypted using TLS (which is a very reasonable assumption). Furthermore, assume that Alice's mail server authenticates all submitted messages properly (which is also reasonable) and that Alice has SPF, DKIM, and DMARC records set up so that Bob can properly verify messages he receives from her domain. The message was fully encrypted in transit and was adequately authenticated. The remaining questions are: does Alice trust her mail server, and does Bob trust his? If the answer to both of these questions is "yes" (which is reasonable, since they both own their own mail servers), then to me it feels like this is about as secure as you can expect an email message to be, without actually encrypting the email plaintext itself.
Unfortunately, the example we have considered here, while once more common, is less common these days than ever before. Most people do not (and probably should not) manage their own email servers. But for those who are technically inclined to do so, this approach may offer a method of email communication that can be used for exchanging messages that the participants would not have been comfortable emailing in a different environment.
Thus, the heuristic "email is insecure" is perhaps not so much wrong as incomplete. Better would be "Assume that email is insecure, unless you can prove otherwise". I suspect most people will find they are not able to do so, and for this reason, it's still probably a good idea to avoid email for exchanging data that would be catastrophic if leaked.