We all use the internet on a daily basis, be it for checking email, visiting websites, or just plain surfing. But how many of us know that the internet is inherently insecure? The original intent of the engineers who collaborated to develop it was to ensure that the internet was both distributed and highly adaptive so that in case of a nuclear war (remember, this was the 1960s), if one route on the internet pathway got knocked off, the overall system would still be able to adapt itself around the other available routes.
In short, the internet had to be both decentralized and resilient enough to keep functioning in case a part of it got wiped off. To this end, it could afford to be slow (if several routes went kaput following a nuclear attack); and it could be insecure (the priority was carrying information, not necessarily hiding it); but it just could not afford to be dependent on a central route for pushing messages across the network of routes. In far fewer words, the internet had to be engineered to be an incredibly flexible system of routes with the intelligence to adapt itself to whatever resources it might have at its disposal. And there just couldn’t be a single point of failure.
To eliminate the problem of lack of information security inherent in it, the bright folks¹ who standardized how the internet works also collaborated to create several layers of security standards to ensure that this insecure internet was also able to carry information loads that had been secured by other means. In that sense, while the internet itself was one layer of transport, the security layer was built on top of this layer. This security layer that works atop the basic internet layer is based on the principles of cryptography. Broadly speaking, cryptography is the science of hiding information in plain sight.
Cryptography itself is based on three fundamental principles upon which the entire edifice of internet security rests, two of which we talk about in this article. To this end, we employ a real-world analogy to better aid our understanding of the concepts described in this article.
Imagine for a moment having a padlock that works with not one but two different keys; a lock that can be locked with one key (let’s call it Key A); and once locked with that key, it can be unlocked only and only with Key A’s counterpart, Key B. And it works the other way round as well: once our lock is locked with Key B, it can be unlocked only and only with Key A.
Sounds weird, I know. But bear with me for a while; this proposition is not as outlandish as it seems. Set your skepticism aside for a while and humor me if you will. For a moment imagine the possibilities of having such a bizarre padlock. What kind of possibilities would a padlock of this sort create? Would you lock your home with such a lock while you were away?
Let’s take this in another direction: imagine for a moment that you have an attache case that comes equipped with a lock precisely of this sort. Let’s also imagine for a moment that I have the facility to have several copies of a key for such a lock manufactured at will.
I would ask you to think about the various possibilities this would open. Write down on a piece of paper the possible uses that you can think of for a curious lock of this kind. Go on, I’ll wait. (Don’t read on until you have done so.)
Moving on, let me offer you a couple of scenarios… scenarios in which we need to securely and safely exchange documents without any risk of tampering, and also ensuring that the document’s secrecy is maintained. We should be able to exchange documents that are either important or confidential, or both. There is a difference between the two. Let me explain.
By important, within the context of this article, I mean a document that needs to have provenance of ownership or origin. This means that a document’s recipient needs to be absolutely sure that the document in fact originated from where it claims to have originated. For instance, if I send you a document by courier, there should be some way for you to determine beyond any shadow of doubt that the document indeed originated from me, and not from some impostor pretending to be me. Imagine receiving a message from your best friend asking you to transfer money to their bank account number specified in the message. Would you transfer the money without first verifying if that message indeed originated from your friend?
On the other hand, we all know what confidential means. A confidential document is meant to be read only by its recipient. Think of your credit card details as a message of this kind.
As it turns out, a locking system of the sort mentioned above would actually be a good fit for each of the scenarios described above, that is, where the recipient needs to have foolproof provenance of origin, and the sender needs to ensure confidentiality.
Assume that I live in London and you live in Manchester, and we regularly exchange documents that are either of an important or a confidential nature, or both. We both own attache cases with the peculiar lock just described. Let’s call my lock, Lock L (as in London), and your lock, Lock M (as in Manchester), with each lock having its own set of both kinds of keys.
Furthering The Idea
Taking the idea a little further, I retain Key A of my lock with myself, storing it in a secret and secure location within my home, and generally ensuring that no one other than me has access to it. On the other hand, I make several copies of its counterpart key, Key B, and these I distribute to all, including you, my friend in Manchester. Therefore, I can say that Key A is my private key, while key B is my public key.
Simultaneously at the Manchester end, you do a similar exercise: you retain Key A of your lock as your private key, and distribute copies of your lock’s Key B, your public key, to all and sundry. Consequently, we both have copies of each other’s public keys. On the other hand, each of us jealously guards our respective private keys, not sharing them even with our spouses, or even for that matter with our dogs.
I hope this is making sense. Reread the previous paragraphs in case you missed this point about public and private keys, because understanding this is crucial to grasping what follows.
Going further, imagine that I need to send some important paper documents across to you by courier. Now these documents, although they may not be confidential, are still important. (Think of a legal document like a will or a deed: while a will is not necessarily a secret document, its origin must be known for proper provenance in a court of law.) Ordinarily, there’s no way for you in Manchester to know if it is indeed I who sent the document to you. There is a very real fear that the document may get tampered with on its way from London to Manchester. In short, when you receive the document in Manchester, you must be absolutely sure that the document originated from me and no one else. How do we ensure this?
If you think a little harder, our lock is the solution to this problem. How? The answer is simple, really: I lock the attache case containing the documents with my private key. This way, I can be sure that my attache case can be opened only with my lock’s corresponding public key (and no other key). Consequently, when you receive the attache case, you try opening it with my public key, a copy of which I had mailed to you earlier. If it does unlock with my public key, there’s no doubt left that the case was locked with my private key, so the document in fact originated from me in London and has not been tampered with during transit. In far fewer words, I have established the evidence of the document’s point of origin. Or to put it another way, I signed the document with my private key, and you verified my signature with my public key.
That’s fine, you may say, but what if the document sent by me had been opened along the way by an interloper who also owns a copy of my public key? (It’s a public key, after all.)
That too is fine. Should an undesirable eventuality like this occur, the interloper would have succeeded in opening the attache case addressed to you (which he had no business doing in the first place) and may possibly have altered the document. The fact remains, however, that he will not be able to lock the tampered document back into the attache case in a way that my public key can open, since, remember, my private key is still with me.
There are three possibilities in an undesirable situation of this sort: a) you do not receive the attache case at all (it went missing); b) you receive the attache case but it is unlocked; or c) you receive the locked attache case but you are unable to open it with my public key.
In the first case, I can re-send the document to you. In the latter two cases, you refuse to accept the document as bona fide, simply because its veracity has been compromised. Bear in mind that this document, while important, is not necessarily confidential. What is critical here is that we are able to establish its origin.
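For the programmatically inclined, here is a minimal sketch of the same signing idea in Python. It is an illustration under loud assumptions, not how real software does it: the key pair is a toy RSA pair built from tiny textbook primes, the names sign, verify and digest are made up for this example, and a hash is used only to shrink the document to a number small enough for the toy key. Real signatures use far larger keys and padding schemes.

```python
import hashlib

# Toy RSA key pair (tiny textbook numbers, for illustration only; NOT secure).
P, Q = 61, 53
N = P * Q                            # public modulus
PHI = (P - 1) * (Q - 1)
PUBLIC_E = 17                        # "Key B": copies handed out to everyone
PRIVATE_D = pow(PUBLIC_E, -1, PHI)   # "Key A": kept secret (needs Python 3.8+)


def digest(document: bytes) -> int:
    """Shrink the document to a number below N so the toy key can handle it."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Lock the case: only the holder of the private key can compute this."""
    return pow(digest(document), PRIVATE_D, N)


def verify(document: bytes, signature: int) -> bool:
    """Try the public key: it 'opens' only if signature and document match."""
    return pow(signature, PUBLIC_E, N) == digest(document)


will = b"I leave my attache case to my friend in Manchester."
signature = sign(will)
print(verify(will, signature))            # True: genuine document and signature
print(verify(will, (signature + 1) % N))  # False: the signature was altered
```

A document altered in transit would, with a real key size, fail verification in the same way. With numbers this small there is a slim chance that two different documents shrink to the same value, which is one more reason the sketch is a toy.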
Let’s take another scenario that is different from the one just described. What if you need to send me, from Manchester to London, a document that is both important and confidential? In this case, you would take my attache case that you just received, and after putting the confidential document into it, lock it with my public key, thereby ensuring that the attache case can be opened only and only with my private key. You then dispatch the attache case to me in London. Now, no matter how hard the interloper (who by the looks of it appears to be in cahoots with the courier boy) tries, he will be unable to open the attache case, simply because it can be opened with my private key only. And so when I receive the attache case and I am also able to successfully open it with my private key, I can be sure beyond any shadow of doubt that no one else could have opened the case on its way to London.
But then, another question arises: while I know for sure that the document could not have been read by any meddler while it was in transit, there is still no way for me to find out if the document in fact originated from you. Again, the solution for this is simple, really. You first pack the confidential document into another attache case owned by you and lock it with your private key. You then place your attache case inside my attache case, and lock mine with my public key. At my end, when I receive my attache case and unlock it with my private key, I have the assurance that no one could have opened the case along the way. I open my attache case and find your attache case inside, with the claim that it’s yours. Figuring out if that is indeed so is easy, really: I try opening the inner attache case with your public key. If it opens, then I have verified at my end that the document has indeed originated from you.
Ergo, this mechanism establishes two vital facts: the first is that the document originated from you and only you; and the second is that the document could not possibly have been opened by anyone along the way.
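The case-within-a-case arrangement can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two key pairs are toy RSA pairs built from tiny primes, the document is represented by a small number, and the names make_keys, send and receive are hypothetical. The London key is deliberately the larger one so that the inner "case" always fits inside the outer one.

```python
def make_keys(p: int, q: int, e: int) -> dict:
    """Build a toy key pair from two small primes and a public exponent."""
    n, phi = p * q, (p - 1) * (q - 1)
    return {"n": n, "public": e, "private": pow(e, -1, phi)}  # Python 3.8+


# Toy key pairs (illustration only; NOT secure).
manchester = make_keys(61, 53, 17)  # the sender
london = make_keys(89, 97, 5)       # the recipient, with the larger modulus


def send(document: int) -> int:
    """Manchester: lock the inner case with my private key,
    then lock the outer case with London's public key."""
    inner = pow(document, manchester["private"], manchester["n"])
    return pow(inner, london["public"], london["n"])


def receive(package: int) -> int:
    """London: open the outer case with my private key,
    then open the inner case with Manchester's public key."""
    inner = pow(package, london["private"], london["n"])
    return pow(inner, manchester["public"], manchester["n"])


document = 1234  # must be smaller than Manchester's modulus in this toy
package = send(document)
print(receive(package) == document)  # True
```

Only London's private key opens the outer layer (confidentiality), and the inner layer opens with Manchester's public key only if Manchester's private key locked it (origin).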
The attache case with our lock, you will have realized by now, is a pretty useful device in two quite different situations: one, when the origin of the package needs to be established; and two, when we need to ensure that the package cannot be opened and read by anyone else during transit.
Our padlock described in this article is radically different from the more conventional locks that we are used to, the ones which are both locked and unlocked with the same key. In that sense, conventional locks can be said to have symmetric keys (since we use the same key for both locking and unlocking). On the other hand, the peculiar lock here can be said to have asymmetric keys, because it works with different keys. I hope this distinction between symmetric and asymmetric keys is beginning to make sense.
The PKI, or the Public Key Infrastructure, works in a pretty similar manner to what has been described here, except that it does not employ physical locks and keys, but programmatic, or software-based ones. In the internet world, we call these locks algorithms.
Security algorithms are ways to “lock” information in a way that it becomes unreadable by everyone except the intended recipients, and this process is called encryption. On the other hand, the process of unlocking this information to make it human-readable once again is called decryption.
In computer science, an algorithm is a step-by-step procedure to achieve a particular programmatic objective.
Imagine that I have some secret text, which should be kept confidential, but which I would still like to send across to someone. At the same time, I would also need to ensure that it is non-readable by others. To this end, I would need to encrypt this secret text by applying an algorithm to it.
Let’s call our secret text plain text. And once our encryption algorithm is applied to the plain text, it is jumbled up into some gibberish; let’s call this unintelligible mumbo-jumbo cipher text.
Encryption is the procedure employed to turn plain text into cipher text.
Plain Text -> Encryption Algorithm -> Cipher Text
At the other end, when this cipher text is received by the intended recipient, they will be able to recover the original plain text by employing the same algorithm, this time for decryption.
Decryption is the procedure employed to turn unintelligible cipher text back to the original plain text.
Cipher Text -> Decryption Algorithm -> Plain Text
It naturally follows that the algorithm in both cases is the same, that is, the algorithm that “knows” how to encrypt plain text to cipher text should also “know” how to decrypt the cipher text back to the original plain text. Otherwise the algorithm would effectively be useless.
One important factor in this scheme of things that we missed here, however, is that the same algorithm is available to everybody. (Cryptographic algorithms for all practical purposes are in the public domain; one of the rules of cryptography, known as Kerckhoffs’s principle, states that knowing how the algorithm works should not allow an intruder to compromise content encrypted with it.) Therefore, it naturally follows that the algorithm must also have the facility to employ secret keys (“passwords”) that allow the user to encrypt the plain text uniquely and securely.
A real-world analogy should be helpful here. Think of your email account. Let’s say that I have an email account with Gmail. To securely log in to my account, I would need to visit the gmail.com website. But then the same email website can also be used by Jane, who has a Gmail account of her own. The question is, what prevents Jane from logging into my account? The answer is pretty obvious: my username and password, of course. That is, my login information (“credentials”).
Think of the algorithm as the email website, and my keys as the unique credentials that allow me to securely enter my account, while also allowing Jane to securely enter hers, without either of us being able to compromise the other’s account.
Similarly, we can convert plain text to cipher text using a unique key or password known only to ourselves like so.
Plain Text -> Encryption Algorithm (My Key) -> Cipher Text
And similarly, cipher text can be converted back to the original plain text using the same unique key or password like so.
Cipher Text -> Decryption Algorithm (My Key) -> Plain Text
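To make the two diagrams above concrete, here is a minimal sketch in Python, assuming a deliberately simple algorithm: each byte of the text is combined (XORed) with a byte of the key, repeating the key as needed. The function name xor_with_key and the sample keys are made up for this example, and this toy cipher is easy to break; it only shows how one public algorithm plus a secret key turns plain text into cipher text and back.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Toy symmetric algorithm: XOR every byte with the repeating key.
    Applying it a second time with the same key undoes the first pass."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plain_text = b"Meet me at noon"
my_key = b"open sesame"

cipher_text = xor_with_key(plain_text, my_key)        # encryption
recovered = xor_with_key(cipher_text, my_key)         # decryption, same key
janes_try = xor_with_key(cipher_text, b"janes key")   # someone else's key

print(recovered == plain_text)   # True
print(janes_try == plain_text)   # False
```

The algorithm is known to everyone, Jane included, yet only the holder of the right key gets the plain text back.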
A system of this sort works well when one employs the same key for both encryption (“locking”) and decryption (“unlocking”). In other words, the algorithm is symmetric.
On the other hand, when two different but related keys are employed for encryption and decryption, we have an asymmetric algorithm.
Plain Text -> Encryption Algorithm (Key A) -> Cipher Text
Cipher Text -> Decryption Algorithm (Key B) -> Plain Text
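Here is a small Python sketch of such a pair of related keys, assuming textbook RSA with tiny primes (far too small to be secure, and without the padding real systems add). The names KEY_A, KEY_B and apply_key are chosen for this example to match the padlock analogy, and the plain text is a small number rather than actual text.

```python
# Toy RSA numbers (illustration only; NOT secure).
P, Q = 61, 53
N = P * Q                                   # shared by both keys
KEY_B = 17                                  # public key
KEY_A = pow(KEY_B, -1, (P - 1) * (Q - 1))   # private key (needs Python 3.8+)


def apply_key(number: int, key: int) -> int:
    """Lock or unlock: the same operation, only the key differs."""
    return pow(number, key, N)


plain = 65
cipher = apply_key(plain, KEY_B)             # lock with Key B
print(apply_key(cipher, KEY_A) == plain)     # True: Key A unlocks it
print(apply_key(cipher, KEY_B) == plain)     # False: Key B cannot undo itself

# It works the other way round as well: lock with Key A, unlock with Key B.
print(apply_key(apply_key(plain, KEY_A), KEY_B) == plain)  # True
```

Just like the padlock, whatever one key locks, only its counterpart can unlock.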
Hopefully, by now we are not just familiar but also comfortable with terms like symmetric and asymmetric keys.
There are several symmetric cryptographic algorithms like AES, Blowfish, DES, and 3DES, among others. You can read more about them here, if you are so inclined: https://en.wikipedia.org/wiki/Symmetric-key_algorithm#Implementations
On the other hand, we have what is known as the RSA public key cryptosystem, which is one of the best known asymmetric public key cryptosystems used across the internet.
By now we are familiar with two of the three fundamental pillars of cryptography: symmetric and asymmetric algorithms. The third is another class of algorithms called hashing; that, however, is deferred for a later article.
The internet itself uses these algorithms for authenticating and securing websites so that visitors can be sure about the veracity of the website. The entire infrastructure used for generation of public/private keys, their distribution, and website authentication, among others, is known as the Public Key Infrastructure (PKI).
Of course, cryptography has many and varied usages. Think of cryptocurrencies like Bitcoin, which also employ the principles of cryptography extensively.
¹ Actually the internet wasn't designed by a committee; it evolved over several decades through the use of peer-reviewed proposals. The "bright folks" mentioned here do not mean a singular person or a group of such, but a set of different people, often computer scientists and professors, who were involved across several decades in the internet's evolution.