Introduction
Everyone knows that encryption is essential for the internet to run in any kind of usable capacity. Without it, there is no privacy, and so on, and so on. But how does it actually work? Well, in 1977, three MIT mathematicians (Ron Rivest, Adi Shamir, and Leonard Adelman) created the first public example of what is known as asymmetric encryption (symmetric encryption has been around for millenia, and was all the rage when Caesar was in power). I say public because a few years prior a guy named Clifford Cocks discovered the same system but, as he was working for the UK government, kept it secret. Luckily for us though, these three colleagues published their findings in A Method for Obtaining Digital Signatures and Public-Key Cryptosystems, and now we can play online slots. Yes, technically you can use symmetric encryption and just do a key exchange, but a key exchange itself is asymmetric, so there's no real purpose in doing this.
For the uninitiated, asymmetric encryption means that two parties do not have to share a predetermined private key in order to share a private message. This article is specifically about RSA encryption, the etymology you can likely figure out. However, before we can fully understand how the RSA cryptosystem works, there are a few basic ideas we must master.
Basics
The first of these is the modulus operator, generally shortened to mod. Simply put, this operator returns the remainder after division.
Next is the idea of being coprime(sometimes called relatively prime or mutually prime). Two numbers are coprime if they do not share a factor other than 1.
The final important idea is Euler’s totient function(also known as the Phi function), denoted as ϕ(n). This function returns the number of integers that are (1) less than n, and (2) coprime to n. Calculating ϕ(8) would look like this:
An important aspect of the totient function is that for a prime p, ϕ(p) = p - 1. Also, this function is multiplicative, meaning that for two integers x and y, ϕ(xy) = ϕ(x)ϕ(y).
Encryption
Now that we have all of the basics understood, we can begin the actual RSA algorithm. The first step is to choose two primes, p and q. We will choose small numbers to make it easier for ourselves, but generally these are huge. Later, we will discuss how these are actually chosen, but for now, p = 23, and q = 41. Now, we multiply p and q to get N. In our case, N = 943. The next step involves taking the Euler Totient of N, which, because the Totient function is multiplicative, is very easy. Now, we find a number e that is coprime with the result of this calculation.
As long as p and q are kept private, we can publish both N and e anywhere. Now that we have N and e, we can successfully encrypt a message! To do this, we will first turn our message into a number, and call it M. Let’s say that number is 35. Now, raise M to the e, and mod N.
We now have successfully encrypted a message using RSA encryption!
Decryption
Now that we have an encrypted message, we want to decrypt it. Currently, we have three public numbers, N, e, and C, which are 943, 7, and 545, respectively. We now want to find a d such that (ed)mod((p-1)(q-1)) = 1. Notice that p and q are not public numbers, which means finding d is very difficult, and would require some brute force program. However, since we know p and q, we can solve the equation and get an answer of d = 503.
Now that we know d, we can finally decrypt the message by raising C to d and mod N. This will result in our original message, M.
Raising 545 to the 503rd power results in a massive number, but luckily there are tricks that make this much faster. That is the entire process of encryption and decryption! Now I will go over how RSA is actually used, and some problems to watch out for.
Actual values
Because we have to turn our message into a number, encrypting a long message quickly requires massive amounts of calculation, which is why actual messages are generally not encrypted directly with RSA. Instead, we use RSA encryption to encrypt a private key that will then allow us to decrypt some symmetric cryptosystem, such as AES or a salt. Also, our practice simulation used very small numbers so it would be easier to tell what was going on, but in reality, these would be cracked instantly. In the real world, N is commonly 1024 bits, which means that p and q are both around 512 bits long each. Generally, the number 65537 is used as e. This is because it is both a large prime and easy to compute in binary, as it is 2^16 + 1.
Why RSA works
RSA encryption is difficult to break because it requires factoring huge numbers. If a bad actor is able to factor N into p and q, they can calculate everything else. Even on today’s fastest computers, this would still take somewhere in the ballpark of 300 trillion years if we are using standard length keys. However, the emergence of quantum computing is challenging this, and it is likely that in the near future quantum computers will be able to easily crack these cryptosystems.
Problems with generating primes
When generating p and q, there are a few things to keep in mind. The first is their size, as previously mentioned, and the second is how close they are to each other. For example, p and q should not be close to the square root of N. This is because of a clever method used to break RSA that uses Fermat’s factorization method, in which an integer is represented as the difference of two squares.
Here, p = a+b, and q = a-b. We start by assigning a to root N, and then incrementing a each time a solution is not found. If p and q are close together, we will be able to find them quite quickly. If they are not close together, though, then this strategy is barely any better than brute force, which is practically impossible to break.
Conclusion
If you'd like to learn more about encryption and the origins of RSA, I encourage you to read the original paper, which has more details about the specifics of picking certain values. Thanks for reading, and have a nice day!