A Stroll Through Encryption and Quantum Stuff
Posted on April 21, 2025
Recently, I watched a video explaining how a quantum computer could theoretically significantly speed up the process of breaking RSA cryptography. That left me with some lingering questions: first of all, why do we encrypt things? How does RSA work? What does it have to do with prime numbers? How does a quantum computer actually work? And so on. Unfortunately, we won’t answer all of these questions right now, but I’ll do my best to explore most of them.
In the spirit of “starting from the beginning”, let’s say you want to send a message to a friend, but you don’t want anyone else to read it. You can’t guarantee the message won’t be intercepted, so you and your friend come up with a cipher — basically, a way to scramble the message so that, if intercepted, it appears as gibberish and the original content remains protected.
The "simplest" kind of cipher you could try is called Caesar cipher (and yes, he used it), which shifts letters by some number — Caesar used 3, so A became D, B became E, and so on. The recipient only needs to know your "number" and perform the inverse operation to recover the original message. One can easily see that this method is not very secure.
Tangent — Simple Improvements to Encrypt Messages
A straightforward improvement to shift-based algorithms is to use a character mapping — effectively giving each letter its own unique substitution. For example, A → M
, B → D
, and so on. This makes the cipher more robust, since it’s harder to guess individual mappings than a single overall shift.
However, this method is still easily breakable using a frequency analysis approach. In any language, letters appear with specific frequencies. In English, for instance, the most common letter is “e,” followed by “t” and “a.” In Brazilian Portuguese, the order is “e,” “a,” and “o.” By analyzing how often each character appears in the encrypted message, one can often deduce the underlying mapping with ease.
A method developed in the mid-1500s by the Italian cryptologist Giovan Battista Bellaso — later (somewhat unfairly) attributed to a Frenchman and known as the Vigenère cipher — was the first to introduce the concept of using a key to encrypt a message. The idea is simple: the sender and receiver agree on a keyword, for example, PINEAPPLEKRYPTONITE
, and each letter of the message is shifted by the corresponding letter's index in the key. That means the first character of the message would be shifted by 15 (the index of 'P'), the second by 9 ('I'), and so on.
This method is considerably more robust against frequency analysis attacks. However, since the key must be repeated to encrypt longer messages, repeating patterns can reveal the key length. Once the key length is known, frequency analysis can be applied to each fixed-length segment, making the message vulnerable again.
A clever variant that begins to introduce the concept of prime numbers is the running key version of the Vigenère cipher. In this approach, two (or more) keys of different lengths are used to encrypt the message — effectively encrypting it multiple times. If the keys are relatively prime (i.e., they share no common factors), the effective key length is the least common multiple (LCM) of the individual key lengths. For example, using keys of lengths 10, 12, and 15 results in a combined key length of 60. But if you use 8, 11, and 15, the effective key length jumps to 1,320. It becomes clear that keys with shared factors produce more predictable, repeating patterns, making the encryption easier to break.
I’d love to go on a tangent about the Enigma machine and how the Allies managed to break it (it’s a fascinating story of math, codebreaking, and early computing), but let’s try to stay on track — for now.
RSA Cryptography
Let’s revisit some earlier concepts and give them proper names. When we used shifting methods and keys, we assumed that once we reached the last letter (Z, for example), we would simply wrap around to the beginning — starting again from A. This implies that, if you have 26 letters, the letter A could be represented not just by 1, but also by 27, 53, and so on. If we divide these numbers by the alphabet size, the remainder (or modulus) of the division gives us the actual letter index.
This concept is called modular arithmetic, and it plays a central role in modern cryptography. We express this kind of equivalence using the following notation:
$$ a \equiv b \pmod{n} $$
This reads as “$a$ is congruent to $b$ modulo $n$,” meaning a and b leave the same remainder when divided by n.
Let's go through the algorithm for generating an RSA key first, try encrypting/decrypting some messages, and then work on understanding why it all works. The principle behind the cryptography is to find three positive integers $e$, $d$, and $n$ (and yes, $e$ will be used to encrypt and $d$ to decrypt) such that:
$$ (m^{e})^{d} \equiv m \pmod{n} $$
This relation shows that the operation is reversible: if you perform modular exponentiation on a message $m$ using $e$ and then again using $d$, you get back the original message. These numbers are chosen so that, given only $e$ and $n$ (which make up the public key), it's really hard to determine $d$ (the private key).
- Choose two prime numbers $p$ and $q$
- Compute $n = pq$
- $n$ will be used for the modular operations and is part of the public key
- Compute $\Phi(n) = (p-1)(q-1)$
- Choose an integer $e$ such that $1 < e < \Phi(n)$, and $e$ is coprime with $\Phi(n)$ (i.e., they share no common factors)
- $e$ will be part of the public key
- Determine $d$ such that $ed = 1 \pmod{\Phi(n)}$
- $d$ will be kept secret as the private key
Let's try it with an example:
- Let's choose $p = 7$ and $q = 13$
- This gives us $n = 7 \times 13 = 91$
- Then, $\Phi(n) = (7 - 1)(13 - 1) = 6 \times 12 = 72$
- We need to pick an $e$ that is coprime with 72
- Exclude multiples of 2 and 3 (prime factors of 72)
- Some valid choices are 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 37, 41, 43, 47, 53, 55, 59, 61, 65, 67, and 71
- Let's pick $e = 7$
- Now we need to find $d$ such that $7 \cdot d \equiv 1 \pmod{72}$
- There are efficient algorithms (like the Extended Euclidean Algorithm) to find this, but let's brute-force a bit
- Trying values, we find that $d = 31$ satisfies the equation
- Let’s encrypt the message “B”, and assume $B \equiv 2$
- The encrypted message is $2^{7} \pmod{91} = 37$
- The decrypted message is $37^{31} \pmod{91} = 2 \equiv B$
- It worked!
The numbers $e$ and $n$ are released as part of the public key. The only known relationship is that:
$$ ed \equiv 1 \pmod{\Phi(n)} $$
This implies that, in order to compute $d$, one must know $\Phi(n)$, but $\Phi(n)$ depends on the original prime numbers $p$ and $q$ used to generate $n$. In other words, an attacker would need to factor $n$ into its prime components. And here’s the catch: there’s no known fast way to do this. Even with clever algorithms that are considerably faster than brute-forcing every number up to $\sqrt{n}$ to see if it divides evenly, the time it takes to factor $n$ is still far away from polynomial time.
How Long Would It Actually Take to Factor a 2048-bit RSA Key?
Factoring large RSA keys, such as a 2048-bit modulus (approximately 617 decimal digits), typically involves using the General Number Field Sieve (GNFS), currently the fastest known classical factoring algorithm. The complexity of GNFS is given by the formula:
$$ \exp\left[1.923 \cdot (\ln n)^{1/3} \cdot (\ln\ln n)^{2/3}\right] $$
where $n$ is the number to be factored.
To illustrate, factoring RSA-768, a 768-bit (232-digit) number, required about two thousand CPU-years of computation when successfully achieved in 2009. Scaling this computation to a 2048-bit RSA modulus dramatically increases complexity, making it practically infeasible with today's classical computational capabilities.
For a concrete estimation, assuming a computational capacity of $10^{18}$ operations per second, factoring a 2048-bit RSA number would require roughly $10^{17}$ seconds, or approximately 3.17 billion years, significantly exceeding any feasible timeframe.
Consequently, 2048-bit RSA keys are widely considered secure against classical factoring methods available today, underscoring their continued use in secure digital communications.
Shor's Algorithm
In 1994 Peter Shor, a computer scientist, showed that a quantum algorithm — now called Shor’s algorithm — can factor an $N$ bit integer in time polynomial in $N$, thereby threatening RSA encryption.
The first part of Shor's Algorithm consists of changing the problem of "finding factors" to the problem of "finding the period" of a function. First, let's try to get a grasp of what this period means. Let's take a sequence of numbers:
$$ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024... $$
Now, let's see what happens to this sequence when we use $\bmod{n}$ — for example, let's assume that $n = 15$:
$$ 1, 2, 4, 8, 1, 2, 4, 8, 1, 2... $$
The sequence repeats itself with period $4$. In fact, every integer less than $n$ and coprime with $n$ forms a finite group under multiplication $\bmod{n}$ — for example, let's take $n = 15$. The elements of the group are the ones smaller than $n$ and coprime with $n$: ${1, 2, 4, 7, 8, 11, 13, 14}$. The product of any two numbers $\bmod{n}$ is an element of the group itself:
× | 1 | 2 | 4 | 7 | 8 | 11 | 13 | 14 |
---|---|---|---|---|---|---|---|---|
1 | 1 | 2 | 4 | 7 | 8 | 11 | 13 | 14 |
2 | 2 | 4 | 8 | 14 | 1 | 7 | 11 | 13 |
4 | 4 | 8 | 1 | 13 | 2 | 14 | 7 | 11 |
7 | 7 | 14 | 13 | 4 | 11 | 2 | 1 | 8 |
8 | 8 | 1 | 2 | 11 | 4 | 13 | 14 | 7 |
11 | 11 | 7 | 14 | 2 | 13 | 1 | 8 | 4 |
13 | 13 | 11 | 7 | 1 | 14 | 8 | 4 | 2 |
14 | 14 | 13 | 11 | 8 | 7 | 4 | 2 | 1 |
Note that because of this fact, any $x$ within the group can be multiplied by itself any number of times (that is, it can be exponentiated), and the result will stay in the group. Also, Euler's Theorem tells us that for any element $x$ of this group $\text{mod } n$ where $n = pq$, the relationship below holds:
$$ x^{(p-1)(q-1)} \equiv 1 \pmod{n} $$
Remember that when we defined the RSA keys we calculated the number $\Phi(n) = (p-1)(q-1)$. This number is an upper bound on the period $s$ of the modular exponentiation function — the true period always divides $\Phi(n)$ but can be (and usually is) smaller:
$$ f(r) = x^{r} \pmod{n} $$
So, if we can pick a random $x$ coprime with $n$, and find the period $s$, we have that $x^{s} \equiv 1 \pmod{n}$ — now, assuming we got lucky and the period $s$ is even, we can rewrite this equation as:
$$ x^{s} - 1 \equiv 0 \pmod{n} \ \Rightarrow (x^{\frac{s}{2}}-1)(x^{\frac{s}{2}}+1) \equiv 0 \pmod{n} $$
Now, if neither $x^{\frac{s}{2}}-1$ nor $x^{\frac{s}{2}}+1$ are multiples of $n$, this means we successfully found a factor of $n$ — let's take a moment to notice that this happens because when something $\equiv 0 \pmod{n}$, this means the rest of the division of something by $n$ is zero — that is, that something needs to be either a factor or a multiple of $n$. And if $n = pq$, we have just found either $p$ or $q$, and thus we can break the encryption.
Now, for our example of $n = 15$, choose $x = 7$, an element in the group (i.e., $1 < x < n$ and coprime with $n$). To find the period $s$, we need the smallest $r$ satisfying:
$$ 7^{r} \equiv 1 \pmod{15} $$
Let's try $r = 2$:
$$ 7^{2} \pmod{15} = 4 $$
which is not $1$. Trying $r = 3$:
$$ 7^{3} \pmod{15} = 13 $$
which is also not $1$. Then, for $r = 4$:
$$ 7^{4} \pmod{15} = 1 $$
So, the period is $s = 4$. Fortunately, since $4$ is even, we can calculate:
$$ 7^{\frac{4}{2}} - 1 = 48 \pmod{15} = 3 $$
and
$$ 7^{\frac{4}{2}} + 1 = 50 \pmod{15} = 5 $$
which yields the factors of $n$, $p$ and $q$.
So, Where the Hell is the Quantum Part?
You may have noticed something in the previous section: in order to find the period $s$, we had to guess several candidate numbers until one satisfied the required congruence. So in the end it is still a trial‑and‑error process involving many guesses.
But what if we could guess a whole bunch of numbers "at the same time", and then do something to extract useful information from this guess? Let's introduce some concepts that will allow us to work through this:
Fourier Transform
Previously, we noted that we are working with period functions that repeat with a period $s$. In physics, there is a useful concept called the Fourier Transform, which, given a periodic (or more generally, any) signal, decomposes it into its frequency components — and, conversely, reconstructs the signal from those components.
For example, if we have a signal such as this:
We could apply a Fourier Transform to find out which frequencies compose this signal:
This means we can use a Fourier Transform to find the period of a repeating function. The quantum equivalent of this is just called a "Quantum Fourier Transform" — let's not go into details, but the concept is the same.
Superposition and Measurements
One of the most mentioned properties of quantum systems is that they can be in many states simultaneously — this is called a superposition of states. For example, let's assume we have 4 qubits (quantum bits) — we could prepare a quantum state such as:
$$ |\psi\rangle = \frac{1}{4}\big(|0000\rangle + |0001\rangle + |0010\rangle + \dots + |1111\rangle\big) $$
This is the equal‑amplitude superposition of the binary numbers 0 through 15 (i.e., the integers 0 – 15). The prefactor $1/4$ ensures normalisation, because $(1/4)^{2} \times 16 = 1$.
Quantum physics does not allow us to access this state with all the information directly though. In order to extract information, we need to do a "measurement" — this will yield only one value (e.g. $|0110\rangle$), destroying all others. Also, because quantum systems can become "entangled", measuring one collapses the joint state and instantly determines the outcome in the other — a fact we will soon exploit.
Trying to Wrap Up
Finally, the (quantum part of the) algorithm runs like this. We have two "registers" (classically called the period register and the computational register), which are just different sets of qubits. First, we need to prepare in the input register a superposition of all possible exponents $r$:
$$ \dfrac{1}{\sqrt{Q}}\sum_{r=0}^{Q-1}|r\rangle $$
Let's break down what this equation means. The $1/\sqrt{Q}$ part is a "normalization factor" — in quantum physics, states must sum up to a total probability of 1. Probabilities are derived by "squaring" a state (technically, multiplying the state by its complex conjugate, since quantum mechanics involves complex numbers). The big $\sum$ symbol indicates that the full quantum state is represented as a sum of different "basis" states. Finally, the notation $|r\rangle$ represents a quantum state vector, with $r$ serving just as a human friendly label for the vector. In quantum computation, these states correspond to "qubits," which are the quantum equivalent of classical bits used to represent numbers in binary.
In quantum physics, since states are vectors, we can apply "transformations" to these vectors, like modular exponentiation - mathematically, we're multiplying a matrix representing the transformation to the state vector - after we do this we take the vector to another state - we simple way to think this is to apply a rotation matrix $H_{\theta}$ to a vector $|x\rangle$ - the result is the rotated vector - if $\theta = 90^\circ \Rightarrow H_{\theta}|x\rangle = |y\rangle$.
Then, we apply the modular exponentiation function to this system and write it to the output register (in practice, Shor's Algorithm implements this as "repeated squaring" — think of it as squaring a number multiple times — squaring is just multiplication, which can be easily implemented in a classical or quantum computer). Then, the two registers are in the state:
$$ \dfrac{1}{\sqrt{Q}}\sum_{r=0}^{Q-1}|r\rangle_{i}|a^{r}\pmod{n}\rangle_{o} $$
Let's try again to observe this with a set of 4 qubits — the input register $|r\rangle_{i}$ would be initialized with (let's switch from binary to integer notation and not worry about normalisation for now):
$$ |1\rangle_{i} + |2\rangle_{i} + |3\rangle_{i} + |4\rangle_{i} + |5\rangle_{i} + |6\rangle_{i} + |7\rangle_{i} + \cdots $$
Let's again use $7$ as our guess, so the modular exponentiation applied will be $|7^{r}\pmod{15}\rangle$ — the resulting state will have the remainders of the modular exponentiation for each $r$:
$$ |1\rangle_{i}|7\rangle_{o} + |2\rangle_{i}|4\rangle_{o} + |3\rangle_{i}|13\rangle_{o} + |4\rangle_{i}|1\rangle_{o} + |5\rangle_{i}|7\rangle_{o} + \cdots $$
If we measure, for example, $|7\rangle_{o}$ on the output register, this measurement will destroy every other state in the input register that does not yield $|7\rangle_{o}$ as a remainder. Remember the "entanglement" - the input register has a superposition of all numbers while the output register has a superposition of all remainders of the modular exponentiation, and they're entangled. So, if we measure the remainder $|7\rangle_{o}$, there cannot be any chance that there is a number in the input register that generates a remainder different than $|7\rangle_{o}$. What is left is the state:
$$ |1\rangle_{i}|7\rangle_{o} + |5\rangle_{i}|7\rangle_{o} + |9\rangle_{i}|7\rangle_{o} + |13\rangle_{i}|7\rangle_{o} + \cdots $$
See that the input register has a period of $4$ now (1, 5, 9, 13, ...). Going back to the general case, what is left in the input register can be written as:
$$ |r\rangle + |r + s\rangle + |r + 2s\rangle + |r + 3s\rangle + \cdots $$
If we could measure this state multiple times (remember that a measurement will always return a single state), we could rapidly infer the period $s$ — but we cannot, because the previous measurement we did on $|a^{r}\pmod{n}\rangle$ would yield a different remainder every time we did the experiment (its random), so we wouldn't be able to reach any conclusion.
Now, taking a huge leap and simplifying a lot, we apply the "Quantum Fourier Transform" to the system above — this won't return the period $1/s$ directly as we would expect from a classical system, but we'll have a superposition of states that interfere constructively at integer multiples of $1/s$:
$$ |1/s\rangle + |2/s\rangle + |3/s\rangle + |4/s\rangle + |5/s\rangle + \cdots $$
Now, if you measure this system, you'll find a multiple of the period $1/s$ (that is a fraction). Then, one can use a classical algorithm such as the continued‑fraction algorithm to find out $s$, or one could run the experiment multiple times to get a direct estimate of the probability distribution of the values.
By finding $s$, the last step is just to do the same checks as described in the classical part above (check if it is even, if it is not a multiple of $n$), and if they all pass, the algorithm successfully factored $n$.
Wrapping Up — What Should We Take Away?
We started with Caesar’s three‑letter shuffle and ended up knee‑deep in quantum Fourier transforms. Along the way we saw:
- Why we encrypt at all – to turn plain messages into harmless gibberish whenever an eavesdropper is in the middle.
- How classical ciphers evolved – from simple shifts to keyed substitutions, then to mathematically grounded schemes like RSA that lean on prime numbers and modular arithmetic.
- Why RSA is still safe today – factoring a 2048‑bit modulus with the best classical algorithm (GNFS) would outlast the Sun, so your online banking is not in imminent danger.
- Where the quantum threat comes from – Shor’s algorithm repackages factoring as a period‑finding task and solves that exponentially faster by exploiting superposition, interference, and the Quantum Fourier Transform.
- Reality check – practical, fault‑tolerant quantum machines with millions of logical qubits remain engineering moonshots, but the math shows the destination once we get there.
So, should you panic? Not really — but you should stay curious. Standards bodies are already rolling out post‑quantum cryptography (PQ‑C) algorithms designed to withstand both classical and quantum attacks. Over the next decade you’ll see web browsers, VPNs, and operating systems migrate to PQ‑C the same way they once moved from 1024‑ to 2048‑bit RSA.
Until then, RSA and its prime‑factoring friends keep humming along — elegant proof that some of the oldest ideas in number theory can guard terabytes of modern data. And if you ever feel lost in the alphabet soup (LCM, QFT, GNFS…), remember: every secure message you send is basically a very fancy Caesar cipher, stretched across twenty‑one centuries of mathematical ingenuity.
Happy encrypting — and, who knows, maybe one day happy quantum‑decrypting too.