Part 1: Elliptic curves, Shor's algorithm, and 90 million Toffoli gates
On March 30, 2026, five researchers from Google Quantum AI, UC Berkeley, the Ethereum Foundation, and Stanford — Ryan Babbush and Craig Gidney at Google Quantum AI, alongside Thiago Bergamaschi (UC Berkeley), Justin Drake (Ethereum Foundation), and Dan Boneh (Stanford) — published a paper with a simple title and an extraordinary claim: a quantum computer with roughly 500,000 physical qubits could break 256-bit elliptic curve cryptography — specifically, solve the Elliptic Curve Discrete Logarithm Problem (ECDLP) — in about 9 minutes.
That number matters because of what 256-bit elliptic curve cryptography protects. Every Bitcoin transaction is signed with it. Every Ethereum wallet depends on it. When your browser shows a padlock icon, the key exchange that secures that connection almost certainly uses it. SSH, Signal, WhatsApp, FIDO2 security keys — all of them rely on the same mathematical problem being hard to solve.
The paper doesn't announce a working attack. No quantum computer today has anywhere near 500,000 qubits — the largest machines have around 1,200 physical qubits (coincidentally the same number as the paper's logical qubit count, but these are noisy physical qubits — not the error-corrected logical qubits the paper requires). But previous estimates for breaking this cryptography required millions of qubits and hours of runtime. This paper brings the requirement down to half a million qubits and single-digit minutes. That's a qualitative shift — from "far future" to "possibly this decade."
In this article, we'll trace the paper's claims as far as we can from first principles — building the cryptography from scratch, deriving the naive gate costs, and identifying exactly where the paper's unpublished optimizations close the gap. By the end, you'll understand the chain from elliptic curves to 90 million Toffoli gates, and what remains for Part 2: physical qubits, runtime, and the 41% attack probability.
The paper's full title is "Securing Elliptic Curve Cryptocurrencies against Quantum Vulnerabilities: Resource Estimates and Mitigations." The authors span Google Quantum AI, UC Berkeley, the Ethereum Foundation, and Stanford — a collaboration between quantum hardware researchers and leading cryptographers. Here's what they claim:
THE GOOGLE ECDLP PAPER — KEY NUMBERS:
══════════════════════════════════════
Target: 256-bit elliptic curve (secp256k1, P-256)
Algorithm: Shor's algorithm, adapted for elliptic curves
Logical qubits: ~1,200
Physical qubits: ~500,000 (using error correction)
Toffoli gates: ~90 million
Runtime: ~9 minutes (superconducting qubits)
We'll trace each of these numbers as far as we can. But first, let's understand what they mean at a high level.
The target is the specific mathematical problem that protects Bitcoin and most of the internet's cryptography. The algorithm is a quantum algorithm discovered by Peter Shor in 1994, adapted here for elliptic curves. Logical qubits are perfect, error-free qubits — the idealized units of quantum computation. Physical qubits are the real, noisy hardware qubits needed to implement those logical qubits using error correction. The gap between 1,200 and 500,000 is the cost of making imperfect hardware behave perfectly — and understanding that gap is a major part of this article.
Toffoli gates are the expensive quantum operations in the circuit. T-state factories are dedicated hardware modules that produce the resources those gates consume. And the runtime is how long the whole computation takes on a fast superconducting quantum computer.
This is a resource estimate, not a demonstration. No quantum computer was used. The authors verified their circuit using a classical simulation combined with a zero-knowledge proof — a cryptographic technique that proves the circuit works without revealing how it works. We'll explain that mechanism in Part 2, Section 6 (coming soon).
The cryptography this paper targets isn't obscure. It's the most widely deployed public-key cryptography in the world.
| System | Curve | What it protects |
|---|---|---|
| Bitcoin | secp256k1 | Transaction signatures ($600B+ market cap) |
| Ethereum | secp256k1 | Wallet signatures ($200B+ market cap) |
| TLS 1.3 (HTTPS) | P-256 / X25519 | Every secure web connection |
| SSH | Ed25519 | Remote server access |
| Signal / WhatsApp | X25519 | End-to-end encrypted messages |
| FIDO2 / WebAuthn | P-256 | Hardware security keys |
All of these curves are 256-bit. All of them are vulnerable to Shor's algorithm for ECDLP. The paper's specific 500,000-qubit estimate targets secp256k1, but the same approach applies to other 256-bit curves with similar resource requirements.
The most concrete threat is to Bitcoin. When you send a Bitcoin transaction, your public key is broadcast to the entire network. It sits in the mempool — the waiting area for unconfirmed transactions — for roughly 10 minutes until a miner includes it in a block. During those 10 minutes, anyone who can see the public key and solve the underlying math problem can steal the funds.
But there's a subtler, larger threat. Any Bitcoin address that has ever sent a transaction already has its public key permanently recorded on the blockchain. An attacker doesn't need to race the 10-minute clock for those addresses — they can work on them at leisure, with no time pressure at all.
BITCOIN EXPOSURE:
═════════════════
Addresses that have NEVER sent a transaction:
Public key = Hash(actual public key)
Attacker sees only the hash → cannot extract the public key
SAFE (until you spend from the address)
Addresses that HAVE sent at least one transaction:
Public key is on the blockchain, permanently, in the clear
Attacker has unlimited time to compute the private key
Estimated: ~6.9 million BTC in such addresses
(per on-chain analyses as of early 2026 — this number
shifts as coins move between address types)
At current prices: hundreds of billions of dollars at risk, with NO time pressure
Ethereum faces a similar exposure. Unlike Bitcoin's UTXO model where each address can theoretically be used once, Ethereum uses an account model — the same address (and therefore the same public key) is reused for every transaction. Once you've sent a single transaction from an Ethereum account, your public key is permanently exposed.
And then there's the "harvest now, decrypt later" problem. Even for systems like TLS where keys are ephemeral (used once and discarded), a well-resourced adversary can record encrypted traffic today and decrypt it years later when a quantum computer becomes available. If your data needs to stay secret for more than a decade — diplomatic communications, medical records, trade secrets — the quantum threat is already relevant.
Here's the plan. We'll work through the paper's claims in the order you'd naturally ask about them:
One thing we will not do: derive Shor's algorithm. Shor's is the quantum algorithm that makes this attack possible, and it deserves its own deep treatment. We're covering it in our Zero to Quantum series. In this article, we treat Shor's as a black box: it needs a certain number of elliptic curve operations performed in quantum superposition, and we trace everything from that point forward.
With that contract established, let's start with the cryptography itself.
Everything in this section is classical mathematics — no quantum computing, no physics. If you can follow arithmetic with a calculator, you can follow this. We're building the lock before we show how to pick it.
An elliptic curve is defined by an equation of the form:
y² = x³ + ax + b
If you've seen the equation for an ellipse before — x²/a² + y²/b² = 1 — this looks different, and it is. Despite the name, elliptic curves are not ellipses. The name is historical, coming from the study of elliptic integrals (which arise when computing the arc length of an ellipse). The curve itself is a cubic — the x³ term gives it a distinctive shape with a smooth loop or a single swooping branch, quite unlike the closed oval of an ellipse.
For different values of a and b, you get different curves. Bitcoin uses a curve called secp256k1, which has a = 0 and b = 7:
BITCOIN'S CURVE (secp256k1):
════════════════════════════
y² = x³ + 7
If you plot this over the real numbers, you get a smooth curve — symmetric about the x-axis (because if y satisfies the equation, so does -y). It has one connected branch that extends infinitely to the right and curves through the y-axis.
But cryptography doesn't use real numbers. Real numbers have infinite precision, which is impractical for computers. Instead, we work in a finite field — we do all arithmetic modulo a large prime number p. This means every calculation wraps around: if a result exceeds p, we take the remainder after dividing by p.
MODULAR ARITHMETIC — THE CLOCK ANALOGY:
═══════════════════════════════════════
Regular arithmetic: 15 + 20 = 35
Modular (mod 23): 15 + 20 = 35 mod 23 = 12
It's like a clock with 23 hours instead of 12.
When you go past 23, you wrap around to the beginning.
Regular arithmetic: 7 × 8 = 56
Modular (mod 23): 7 × 8 = 56 mod 23 = 10
Every operation — addition, subtraction, multiplication —
wraps around. The result is always between 0 and 22.
A natural question: does this wrapping break anything? In regular arithmetic, addition has certain properties we rely on — you can add in any order (commutativity), group additions however you like (associativity), adding zero changes nothing (identity), and every number has a negative (inverse). Modular arithmetic preserves all of these:
WHY MODULAR ARITHMETIC WORKS AS A "NUMBER SYSTEM":
═══════════════════════════════════════════════════
All properties of regular addition still hold (mod 23):
Commutativity: 15 + 20 = 20 + 15 = 12 (mod 23) ✓
Associativity: (5 + 8) + 10 = 5 + (8 + 10) = 0 (mod 23) ✓
Identity: x + 0 = x for any x ✓
Inverse: 15 + 8 = 23 mod 23 = 0 (8 is the "negative" of 15) ✓
In mod 23, there are only 23 numbers: 0 through 22.
-15 doesn't exist separately — it IS 8 (because -15 + 23 = 8).
31 is also 8 (because 31 - 23 = 8). They're all the same number.
Every number has exactly one additive inverse, just like in
regular arithmetic — it's just that the inverse of 15 is
written as 8 instead of -15.
Same for multiplication:
Commutativity: 3 × 7 = 7 × 3 = 21 mod 23 = 21 ✓
Associativity: (2 × 3) × 4 = 2 × (3 × 4) = 24 mod 23 = 1 ✓
Identity: x × 1 = x for any x ✓
Inverse: Every non-zero number has a multiplicative inverse
(we'll use this in point addition) ✓
Mathematicians call this a "field" — a number system where
addition, subtraction, multiplication, and division all work.
The integers mod a prime p form a finite field with p elements.
This is important but subtle: the finite field is the number system we use to do arithmetic. It's the toolkit — it tells us that addition, multiplication, and division of numbers mod p all behave correctly. We need this toolkit because the elliptic curve formulas involve all three operations.
But the finite field is not the cryptographic structure itself. The cryptography is built on a different group — the set of points on the curve, with a completely different "addition" operation (a geometric rule we'll define in Section 2.2). The finite field is the language the formulas are written in; the curve points are the objects we actually work with. Think of it this way: the finite field is like the rules of arithmetic you learned in school, and the elliptic curve group is a game built using those rules.
So Bitcoin's curve is actually:
y² ≡ x³ + 7 (mod p)
where p = 2²⁵⁶ - 2³² - 977
That's a 256-bit prime number:
p = 115792089237316195423570985008687907853
269984665640564039457584007908834671663
Over the real numbers, the curve is a smooth line. Over a finite field, it becomes a scatter of discrete points — every (x, y) pair where x and y are integers between 0 and p-1 and the equation holds. The animation above shows this transition: the smooth curve dissolves into a cloud of seemingly random dots. There are roughly p such points (about 2²⁵⁶).
The set of all these points, together with a rule for "adding" two points, forms a group — a second group, distinct from the finite field of numbers. The finite field is the toolkit for arithmetic. This new group of curve points is the structure that cryptography is built on.
The rule for "adding" two points on an elliptic curve is geometric. Given two points P and Q on the curve, draw a straight line through them. That line will intersect the curve at exactly one other point. Reflect that point across the x-axis. The result is P + Q.
In the animation, notice how the three intersection points (P, Q, and R') are spread across the curve — P on the upper-left branch, Q on the lower-middle, and R' on the lower-right. The line cuts through the curve at exactly three places.
Why exactly three? Because of the algebra. The curve equation is y² = x³ + 7, and a line is y = mx + c. Substituting the line into the curve gives (mx + c)² = x³ + 7, which rearranges to x³ - m²x² - 2mcx - c² + 7 = 0. That's a cubic equation in x — degree 3. Since we started with two real points P and Q on the curve, two of the three roots are real, and for a cubic with real coefficients that forces the third root to be real too (complex roots always come in pairs). So the third intersection point always exists. We don't even need to solve the cubic: Vieta's formulas tell us the sum of all three roots equals m² (the coefficient of x²), so the third root is simply x_R' = m² - x_P - x_Q. That's exactly the formula we used in the worked example.
Now, why does this strange geometric rule deserve to be called "addition"? We claimed it satisfies commutativity (P + Q = Q + P), associativity ((P + Q) + R = P + (Q + R)), has an identity (∞), and every point has an inverse. Let's check each one:
Let's make this concrete with actual numbers. We'll use a tiny curve — y² = x³ + 7 (mod 23) — so we can verify every step by hand. The real Bitcoin curve uses a 256-bit prime, but the arithmetic is identical.
First, we need valid points. A point (x, y) is on the curve if y² mod 23 equals x³ + 7 mod 23. Let's find some:
FINDING POINTS ON y² = x³ + 7 (mod 23):
════════════════════════════════════════
Squares mod 23: {0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18}
(These are the values y² can take. If x³+7 mod 23 isn't
in this set, there's no point at that x.)
x = 1: 1³ + 7 = 8 mod 23 = 8. 8 is a square (10² = 100 mod 23 = 8).
y = 10 or y = 23-10 = 13.
Points: (1, 10) and (1, 13) ✓
x = 10: 10³ + 7 = 1007 mod 23 = 18. 18 is a square (8² = 64 mod 23 = 18).
y = 8 or y = 23-8 = 15.
Points: (10, 8) and (10, 15) ✓
Now let's add P = (1, 10) and Q = (10, 8). But first, where do the formulas come from? They follow directly from the geometry we just described:
DERIVING THE POINT ADDITION FORMULAS:
═════════════════════════════════════
The line through P = (x_P, y_P) and Q = (x_Q, y_Q) has slope:
s = (y_Q - y_P) / (x_Q - x_P)
The line equation is: y = s×(x - x_P) + y_P
Substituting into the curve y² = x³ + 7:
[s×(x - x_P) + y_P]² = x³ + 7
Expanding and rearranging gives a cubic in x:
x³ - s²×x² + ... = 0
By Vieta's formulas, the sum of the three roots equals s²:
x_P + x_Q + x_R' = s²
So the third intersection point has x-coordinate:
x_R' = s² - x_P - x_Q
After reflecting across the x-axis (negating y):
x_R = x_R' (x doesn't change)
y_R = s×(x_P - x_R) - y_P (from the line equation, negated)
Every formula in the worked example below comes from this derivation. Let's trace it with actual numbers:
POINT ADDITION: P = (1, 10) + Q = (10, 8) on y² = x³ + 7 (mod 23)
═══════════════════════════════════════════════════════════════════
STEP 1: Compute the slope of the line through P and Q.
s = (y_Q - y_P) × (x_Q - x_P)⁻¹ mod 23
Numerator: y_Q - y_P = 8 - 10 = -2 mod 23 = 21
Denominator: x_Q - x_P = 10 - 1 = 9
We need 9⁻¹ mod 23 — the modular inverse of 9.
This is the number that, multiplied by 9, gives 1 mod 23.
9 × 18 = 162 = 7 × 23 + 1 = 161 + 1 → 162 mod 23 = 1 ✓
So 9⁻¹ mod 23 = 18.
Slope: s = 21 × 18 mod 23 = 378 mod 23 = 10
(378 = 16 × 23 + 10)
STEP 2: Compute x_R (the x-coordinate of the result).
x_R = s² - x_P - x_Q mod 23
x_R = 100 - 1 - 10 mod 23
x_R = 89 mod 23 = 20
(89 = 3 × 23 + 20)
STEP 3: Compute y_R (the y-coordinate of the result).
y_R = s × (x_P - x_R) - y_P mod 23
y_R = 10 × (1 - 20) - 10 mod 23
y_R = 10 × (-19) - 10 mod 23
y_R = -200 mod 23 = 7
(-200 + 9 × 23 = -200 + 207 = 7)
RESULT: P + Q = (20, 7)
VERIFY: Is (20, 7) on the curve y² = x³ + 7 (mod 23)?
Left side: 7² = 49 mod 23 = 3
Right side: 20³ + 7 = 8007 mod 23 = 3
(8007 = 348 × 23 + 3)
3 = 3 ✓ The point is on the curve.
Every step is arithmetic you can check on a calculator. But the modular inverse deserves a closer look, because it's the key operation that makes "division" work in modular arithmetic.
In regular arithmetic, dividing by 9 means multiplying by 1/9. The number 1/9 is the multiplicative inverse of 9 — the number that, when multiplied by 9, gives 1. In modular arithmetic, we need the same thing: a number that, when multiplied by 9, gives 1 mod 23.
MODULAR INVERSE — "DIVISION" IN MOD ARITHMETIC:
════════════════════════════════════════════════
In regular arithmetic:
9 × (1/9) = 1
So 1/9 is the inverse of 9.
In mod 23 arithmetic:
We need x such that 9 × x ≡ 1 (mod 23).
Try x = 18: 9 × 18 = 162 = 7 × 23 + 1 → 162 mod 23 = 1 ✓
So 18 is the inverse of 9 (mod 23).
"Dividing by 9" in mod 23 means "multiplying by 18."
Why does every non-zero number have an inverse mod a prime?
Because p = 23 is prime. For any a not divisible by p,
Fermat's little theorem guarantees:
a^(p-1) ≡ 1 (mod p)
So:
a × a^(p-2) ≡ 1 (mod p)
The inverse of a is a^(p-2) mod p. It always exists.
For 9 mod 23: 9^21 mod 23 = 18 ✓ (same answer)
This is why we need a prime modulus. Let's see what goes wrong if p isn't prime. Say we try mod 12 instead of mod 23:
WHY THE MODULUS MUST BE PRIME:
══════════════════════════════
Mod 23 (prime): every non-zero number has an inverse.
9⁻¹ mod 23 = 18 (because 9 × 18 = 162 = 7×23 + 1) ✓
Mod 12 (not prime, 12 = 2 × 2 × 3): some numbers have NO inverse.
Try to find 6⁻¹ mod 12 — a number x where 6 × x ≡ 1 (mod 12):
6 × 1 = 6 6 × 2 = 12 ≡ 0 6 × 3 = 18 ≡ 6
6 × 4 = 24 ≡ 0 6 × 5 = 30 ≡ 6 6 × 6 = 36 ≡ 0
...every result is 0 or 6. Never 1.
6 has no inverse mod 12. "Dividing by 6" is impossible.
Why? Because 6 and 12 share a common factor (6 = 2×3, 12 = 2²×3).
When a and p share a factor, a × x cycles through a subset of
values mod p and never hits 1. With a prime p, no number 1..p-1
shares a factor with p, so every number has an inverse.
The point addition formula requires dividing by (x_Q - x_P). If the modulus isn't prime, that difference could be a number with no inverse, and the formula would break. A prime modulus guarantees this never happens.
There's one special case: point doubling, where P = Q. You can't draw a line through a single point, so instead you use the tangent line to the curve at P. The slope of the tangent is dy/dx, which we get by implicitly differentiating the curve equation y² = x³ + ax + b:
DERIVING THE DOUBLING SLOPE:
════════════════════════════
Start with: y² = x³ + ax + b
Differentiate: 2y × (dy/dx) = 3x² + a
Solve for slope: dy/dx = (3x² + a) / (2y)
For secp256k1 (a = 0):
dy/dx = 3x² / (2y)
In mod-p arithmetic, this becomes s = (3x² + a) × (2y)⁻¹ mod p, where a is the coefficient from the curve equation y² = x³ + ax + b. For secp256k1, a = 0, so it simplifies to s = 3x² × (2y)⁻¹ mod p. The division by 2y is a modular inverse, just like in regular point addition. The rest of the calculation (x_R, y_R) uses the same formulas as before.
One more case: if P = (x, y), then -P = (x, p - y). Adding P and -P gives the point at infinity — the identity element of the group, analogous to zero in regular addition.
Now we get to the operation that makes cryptography possible.
Scalar multiplication means performing the point addition operation from Section 2.2 repeatedly: k · P = P + P + P + … + P (k times). The number k is called the scalar, and it will become the private key. Each "+" here is not ordinary number addition — it's the full elliptic curve point addition (compute a slope via modular inverse, find the new x-coordinate, find the new y-coordinate). Each one involves several modular multiplications on 256-bit numbers. A single point addition is fast (microseconds), but doing it trillions of times is not.
The high-level intuition: imagine you're standing on the curve at point P. Each point addition jumps you to a new point on the curve. After k jumps, you land on some point Q. The curve has no visible pattern — each jump looks random, even though it's completely deterministic. If someone shows you the starting point P and the ending point Q, you can't tell how many jumps it took. That's the trapdoor: jumping forward is easy (just keep adding), but figuring out the number of jumps from the endpoints is astronomically hard.
The naive approach — performing k-1 point additions one after another — would require as many point additions as the value of k itself. How big is k? It's a 256-bit number, meaning it's stored in 256 binary digits. The smallest 256-bit number is 2²⁵⁵ (a 1 followed by 255 zeros in binary), and the largest is 2²⁵⁶ - 1 (256 ones in binary). So k can be as large as ~2²⁵⁶ — a number with 77 decimal digits. Doing 2²⁵⁶ sequential point additions (each one a multi-step modular arithmetic computation) is completely infeasible; even at a billion point additions per second, it would take longer than the age of the universe.
But there's a much faster method called double-and-add. The key insight relies on the group properties we established in Section 2.2: because elliptic curve point addition is associative and commutative, we can regroup additions however we want. Adding P to itself 13 times can be regrouped as:
WHY REGROUPING WORKS:
═════════════════════
13 · P means: P + P + P + P + P + P + P + P + P + P + P + P + P
(13 copies of P, added using the geometric rule)
Because point addition is associative, we can regroup:
= (P+P+P+P+P+P+P+P) + (P+P+P+P) + P
= 8P + 4P + P
This is just like 13 = 8 + 4 + 1 in regular arithmetic.
It works because the GROUP PROPERTIES (associativity,
commutativity) hold for point addition — even though
the "addition" is a geometric operation, not coordinate addition.
And 8P is easy to compute: just double three times.
P → 2P → 4P → 8P (3 point doublings)
So: 13P = 8P + 4P + P, computed with 3 doublings + 2 additions
instead of 12 sequential additions.
The double-and-add algorithm automates this. It processes k's binary representation bit by bit. "Doubling" means adding a point to itself (P + P = 2P) using the tangent-line formula from above — this is just point addition where both inputs are the same point. The notation "2 × P" means "P + P" (two point additions), "6 × P" means "P + P + P + P + P + P" (but computed efficiently), and so on.
DOUBLE-AND-ADD: Computing 13 · P
═════════════════════════════════
13 in binary: 1 1 0 1
We process from the most significant bit to the least.
Start with result = ∞ (point at infinity = identity).
Bit 1 (leftmost):
Double: ∞ + ∞ = ∞ (adding identity to itself gives identity,
just like 0 + 0 = 0)
Bit is 1, so add P: ∞ + P = P (identity doesn't change P)
result = P
Bit 1:
Double: P + P = 2P (tangent line at P → third intersection → reflect)
Bit is 1, so add P: 2P + P = 3P
result = 3P
Bit 0:
Double: 3P + 3P = 6P (tangent line at 3P → new point)
Bit is 0, so skip the add.
result = 6P
Bit 1 (rightmost):
Double: 6P + 6P = 12P
Bit is 1, so add P: 12P + P = 13P
result = 13P ✓
Operations: 4 doublings + 3 additions = 7 total.
Naive method would need 12 additions.
Why does this work? Let's trace what the "result" variable holds after each step, and see how it builds up 13:
TRACING THE ACCUMULATOR:
════════════════════════
Step Binary What happened
──── ────── ─────────────
Start: 0 (empty — point at infinity)
Bit 1: double then add → 1 doubled 0→0, added 1→1
Bit 1: double then add → 11 doubled 1→10, added 1→11 (= 3)
Bit 0: double only → 110 doubled 11→110 (= 6), bit is 0 so no add
Bit 1: double then add → 1101 doubled 110→1100, added 1→1101 (= 13)
The pattern:
"double" shifts the binary left (appends a 0).
"add P" changes that trailing 0 to a 1.
double: 11 → 110 (appended 0)
add: 110 → 111 ← would happen if bit were 1,
but bit was 0 so we skip
double: 110 → 1100 (appended 0)
add: 1100 → 1101 (changed trailing 0 to 1)
The algorithm reads k's binary digits left to right,
reconstructing k one bit at a time:
1 → 11 → 110 → 1101 = 13 ✓
For a 256-bit scalar k, the binary representation has 256 bits. Each bit requires one doubling, and roughly half the bits are 1 (requiring an additional addition). So:
COST OF SCALAR MULTIPLICATION (256-bit k):
══════════════════════════════════════════
Doublings: 256 (one per bit)
Additions: ~128 (one per '1' bit, roughly half)
Total: ~384 point operations
Each point operation involves a modular inverse, a couple of
multiplications, a squaring, and several subtractions.
On modern hardware: ~1 millisecond for the entire thing.
The reduction: from 2²⁵⁶ operations (impossible)
to 384 operations (instant).
That's the power of binary decomposition.
384 operations. That's it. From a 256-bit number and a starting point, we can compute k · P in under a millisecond.
Now here's the trapdoor. Given P and the result Q = k · P, can you find k?
THE ONE-WAY PROPERTY:
═════════════════════
FORWARD (computing Q = k · P):
Input: k (256-bit number), P (point on curve)
Method: double-and-add
Cost: ~384 point operations
Time: ~1 millisecond
REVERSE (finding k given P and Q):
Input: P (point on curve), Q = k · P (point on curve)
Best classical algorithm: Pollard's rho
Cost: ~2¹²⁸ point operations (square root of the group order)
How long is that?
2¹²⁸ = 3.4 × 10³⁸ operations
At 10⁹ operations/second (aggressive estimate):
3.4 × 10³⁸ / 10⁹ = 3.4 × 10²⁹ seconds
= 1.08 × 10²² years
Age of the universe: 1.38 × 10¹⁰ years
Ratio: ~800 billion times the age of the universe
Even with a million parallel machines:
10²² / 10⁶ = 10¹⁶ years
Still 780,000× the age of the universe.
This asymmetry — forward is trivial, reverse is impossible — is the Elliptic Curve Discrete Logarithm Problem (ECDLP). The best classical attack is Pollard's rho algorithm, which runs in O(√n) time where n is the group order (roughly 2²⁵⁶ for secp256k1). That gives ~2¹²⁸ operations — far beyond what any classical computer can do.
This is the foundation of all elliptic curve cryptography. The private key is k. The public key is Q = k · P. Everyone can see Q and P (they're public). Nobody can find k. The security of Bitcoin, Ethereum, TLS, SSH, and Signal all rest on this single assumption.
Or rather, they rest on this assumption holding for classical computers.
Let's make this concrete. Here's how Bitcoin turns the ECDLP into a payment system.
Every Bitcoin user generates a key pair:
KEY GENERATION:
═══════════════
1. Pick a random 256-bit number k.
This is your PRIVATE KEY. Keep it secret.
2. Compute Q = k · G, where G is a fixed "generator point"
that's part of the secp256k1 specification.
This is your PUBLIC KEY.
G is publicly known — it's the same for everyone.
Q is unique to you (because k is unique to you).
3. Your Bitcoin ADDRESS is a hash of Q:
address = RIPEMD160(SHA256(Q))
RIPEMD160 and SHA256 are hash functions — they take any
input and produce a fixed-size fingerprint (160 bits and
256 bits respectively). The key property: given the hash,
you can't recover the original input. So the address hides
your public key Q behind two layers of hashing.
This means: even if someone knows your address, they can't
get your public key Q — until you spend from the address.
When you want to send Bitcoin, you create a digital signature using your private key k. The signature proves you own the funds without revealing k. Anyone can verify the signature using your public key Q — but they can't extract k from Q (because ECDLP is hard).
Let's trace a simplified version of how ECDSA (the Elliptic Curve Digital Signature Algorithm) works. But before diving into the formulas, let's understand what a digital signature is actually trying to do.
The problem: you want to prove to the world that you — the person who knows private key k — approved a specific message ("send 1 BTC to Alice"). The proof must satisfy three constraints: (1) anyone can verify it using only your public key Q, (2) it doesn't reveal k, and (3) it can't be reused for a different message.
The naive approach would be to just reveal k. The verifier checks Q = k · G and is convinced. But now everyone knows your private key — useless.
A better idea: instead of proving you know the discrete log of Q directly, prove you know it indirectly. Pick a random secret number r, compute a new point R = r · G (you know the discrete log of R because you chose r), and then produce a value s that could only be computed by someone who knows both r and k. The verifier can check s using only public information (Q, R, and the message hash), but can't extract k from it because r is unknown to them and discarded after signing.
That's the core idea. The specific formula for s is designed so that the verification equation works out to something computable from public data alone. Let's see how (RareSkills has an excellent full derivation if you want to see how the formula is discovered step by step):
ECDSA SIGNING — STEP 1: HASH THE MESSAGE
═════════════════════════════════════════
You want to sign a message m (e.g., "send 1 BTC to Alice").
h = SHA256(m) → a 256-bit number
This binds the signature to this specific message.
ECDSA SIGNING — STEP 2: PICK A RANDOM NONCE
════════════════════════════════════════════
Pick a random nonce r (a fresh random 256-bit number).
(Standard ECDSA notation uses k for the nonce, but we
use r to avoid confusion with the private key k.)
Compute R = r · G (scalar · point = a point on the curve).
Take the x-coordinate of R. Call it R_x.
R acts as a public "commitment" to the nonce r.
Everyone can see R, but they can't find r from R
(that would require solving the ECDLP on r).
Why do we need r when we already have the private key k? Imagine signing without a nonce — just publishing s = h × k mod n. This is regular modular multiplication, NOT elliptic curve point multiplication. And unlike EC point multiplication (where finding k from Q = k·G is the hard ECDLP), modular number multiplication is easily reversible — we showed how in Section 2.2. The attacker knows s (it's public) and h (they can compute SHA256 of the message). So k = s × h⁻¹ mod n. One modular inverse (trivial to compute), and your private key is exposed.
The nonce r fixes this by masking k inside the formula. With r in the mix, the attacker sees s, h, and R_x, but can't solve for k without also knowing r. And r is random, used once, and discarded. "Nonce" literally means "number used once" — and it MUST be different every time. If you reuse a nonce for two different messages, the attacker can eliminate r from the two equations and solve for k. (In 2010, Sony reused a nonce in the PS3's ECDSA implementation. Hackers extracted the private key and broke PS3 code signing permanently.)
ECDSA SIGNING — STEP 3: COMPUTE THE SIGNATURE VALUE
════════════════════════════════════════════════════
s = r⁻¹ × (h + R_x × k) mod n
where n ≈ 2²⁵⁶ is the "order" of the curve (total number
of points in the group). We work mod n because scalar
multiplication wraps around after n additions.
Why this specific formula? The goal is to create a value that depends on k and h, can be verified using only public information, and doesn't reveal k. Working backwards from what the verifier needs:
DERIVING THE FORMULA FROM THE VERIFICATION REQUIREMENT:
═══════════════════════════════════════════════════════
The verifier knows h, R_x, s, G, and Q. They need to
reconstruct R = r · G from these public values.
number×number scalar·point
───────────── ────────────
r · G = s⁻¹×(h + R_x×k) · G (rearranging s formula)
= (s⁻¹×h) · G + (s⁻¹×R_x) · (k·G) (distributing)
= (s⁻¹×h) · G + (s⁻¹×R_x) · Q (Q = k·G)
That last line uses only public values (h, s, R_x, G, Q).
The formula for s is designed so this works out.
ECDSA SIGNING — STEP 4: PUBLISH
═══════════════════════════════
The signature is the pair (R_x, s).
Publish (R_x, s) along with the message m.
Your private key k is buried inside s, but extracting
it requires knowing r — which you discard after signing.
ECDSA VERIFICATION (what the network checks):
══════════════════════════════════════════════
Given: message m, signature (R_x, s), public key Q.
1. Compute the hash: h = SHA256(m)
2. Compute:
u₁ = h × s⁻¹ mod n (message part)
u₂ = R_x × s⁻¹ mod n (commitment part)
3. Compute the verification point:
V = u₁ · G + u₂ · Q
4. Check: does V's x-coordinate equal R_x?
If yes → signature is valid.
If no → signature is forged or corrupted.
Why does this work? Let's trace the algebra:
WHY VERIFICATION WORKS:
═══════════════════════
V = u₁ · G + u₂ · Q (scalar · point)
= (h/s) · G + (R_x/s) · Q (substituting u₁, u₂)
= (h/s) · G + (R_x/s) · (k·G) (substituting Q = k·G)
= ((h + R_x×k) / s) · G (factoring out G)
= ((h + R_x×k) / (r⁻¹×(h + R_x×k))) · G (substituting s)
= r · G (cancels!)
= R ✓
The verification uses only the public key Q — it never needs the private key k. But critically, the verification is done against a specific Q. On the Bitcoin network, each unspent output (each "coin") is locked to a specific address, which is derived from a specific public key. When someone broadcasts a transaction spending those coins, the network checks the signature against the public key that owns those coins, not just any public key.
So if John tries to sign a message "send Bob's coins to Alice" using John's private key, the math would check out against John's public key Q_john — but the network would verify against Bob's public key Q_bob (the one tied to the coins). The pairing equation would fail because the signature was made with the wrong key. It's like forging someone's signature on a check — the bank compares it to the account holder's signature on file, not the forger's.
This is why the quantum attack matters: the attacker needs Bob's specific private key k_bob to sign a valid transaction spending Bob's coins. And to get k_bob, they need to solve Q_bob = k_bob × G — the ECDLP for Bob's public key.
Here's where the vulnerability appears. To verify the signature, the network needs Bob's public key Q. So when Bob broadcasts a transaction, Q is included in the transaction data and becomes visible to everyone on the network.
BITCOIN TRANSACTION LIFECYCLE:
══════════════════════════════
t = 0 seconds:
You sign a transaction with your private key k.
The transaction includes your public key Q.
You broadcast it to the Bitcoin network.
t = 0 to ~600 seconds (0-10 minutes):
The transaction sits in the MEMPOOL — a public waiting area.
Every node on the network can see your public key Q.
*** VULNERABLE WINDOW ***
Anyone who can solve Q = k · G for k can sign a
competing transaction stealing your funds.
t ≈ 600 seconds:
A miner includes your transaction in a block.
The block is added to the blockchain.
Your transaction is now CONFIRMED.
Too late for an attacker — the funds have moved.
The average Bitcoin block time is about 10 minutes. That's the window. If an attacker can solve the ECDLP — compute k from Q — in less than 10 minutes, they can steal the funds before the transaction confirms.
The Google paper claims this can be done in 9 minutes.
We'll derive the exact attack probability in Part 2, Section 5.3 (coming soon). For now, the key point is: the security of every Bitcoin transaction rests on the ECDLP being unsolvable within a 10-minute window. The paper says a quantum computer could solve it in 9.
We now know what we're breaking: the ECDLP, the problem of finding k given P and Q = k · P. Classically, this takes ~2¹²⁸ operations. Quantum mechanically, Shor's algorithm reduces it to something tractable. But how many quantum operations does "tractable" actually mean?
We won't derive Shor's algorithm step by step here — that's coming in Zero to Quantum Part 4, where we'll build it from the Quantum Fourier Transform up. But you deserve more than "it's a black box." Let's build a real intuition for why a quantum computer can solve the ECDLP, and what it actually does.
The core idea behind Shor's algorithm is period finding. It turns out that many hard problems in cryptography — integer factoring, discrete logarithms, and the ECDLP — can be reduced to finding a hidden period in a mathematical function. And finding periods is something quantum computers are extraordinarily good at.
Let's build this up with an analogy. Imagine you have a function that takes a number as input and produces a color as output. The function is periodic — the colors repeat in a cycle — but you don't know the period. Classically, you'd have to evaluate the function at many different inputs and look for the repeat. If the period is large, this takes a long time.
A quantum computer does something fundamentally different. It prepares a superposition of all possible inputs and evaluates the function on this superposition in a single pass. But — and this is crucial — the speedup does NOT come from "trying all inputs at once and reading the results." You can't read them all out; measurement would collapse the superposition and give you just one random input-color pair. Instead, the quantum computer applies the Quantum Fourier Transform (QFT), which converts the superposition from the "input domain" to the "frequency domain." The hidden period shows up as a sharp peak in the frequency domain. Measure, and you get the period. The power is in the interference pattern created by the QFT, not in parallel evaluation.
SHOR'S ALGORITHM — THE CORE IDEA:
══════════════════════════════════
1. SUPERPOSITION: Prepare a quantum register in a superposition
of all possible inputs (0, 1, 2, ..., N-1) simultaneously.
2. EVALUATE: Apply the function f(x) to this superposition.
The quantum computer computes f(x) for ALL x at once.
Result: a quantum state encoding all (x, f(x)) pairs.
3. QUANTUM FOURIER TRANSFORM: Apply the QFT to the input register.
This converts the state from the "position" basis to the
"frequency" basis. If f has period r, the QFT produces
sharp peaks at multiples of N/r.
4. MEASURE: Read out the frequency register.
You get a value close to some multiple of N/r.
Classical post-processing (continued fractions) extracts r.
The magic is in steps 2-3 together: the quantum computer
evaluates f on a superposition of all inputs, then the QFT
extracts the hidden period through interference. The speedup
doesn't come from "trying all inputs at once" (you can't read
them all out — measurement gives just one random result).
It comes from the QFT converting the periodic structure of
the superposition into a measurable frequency peak.
Now, how does this apply to the ECDLP? We want to find k such that Q = k · P. Here's the connection:
SHOR'S FOR ECDLP — THE REDUCTION:
══════════════════════════════════
Define a function on two variables (a, b):
f(a, b) = a · P + b · Q
where P is the generator and Q = k · P is the public key.
This function has a hidden structure:
f(a, b) = f(a + k, b - 1)
because:
(a + k) × P + (b - 1) × Q
= a·P + k·P + b·Q - Q
= a·P + Q + b·Q - Q (since k·P = Q)
= a·P + b·Q
= f(a, b)
So the function is periodic with period (k, -1).
Finding this period reveals k — the private key.
Shor's algorithm:
1. Superpose all (a, b) pairs
Each of a and b ranges from 0 to ~2²⁵⁶, so each needs
a 256-bit quantum register. These two registers are put
into superposition of all possible values simultaneously.
2. Evaluate f(a, b) = a·P + b·Q in superposition
3. Apply QFT to both registers
4. Measure → extract the period (k, -1) → recover k
Step 2 is where all the cost lives. The quantum computer must compute a·P + b·Q — which is two scalar multiplications and one point addition — for a superposition of all possible (a, b) values. This means performing elliptic curve arithmetic on 256-bit numbers, entirely within a quantum circuit, without any intermediate measurements.
The key word is coherently. Every intermediate value must remain in quantum superposition — no peeking at partial results. The entire scalar multiplication must be compiled into a sequence of quantum gates that runs start to finish without measurement. The question becomes: how many quantum gates does that require?
In Section 2.3, we saw that classical scalar multiplication uses the double-and-add method: 256 doublings + ~128 additions = ~384 point operations. The quantum circuit could do the same thing, but there's a much better approach.
The idea is windowed multiplication. Instead of processing k one bit at a time, we process it in chunks. Let's see how this works with a small example first, then scale up to 256 bits.
EXAMPLE: COMPUTING 203 × P WITH WINDOW SIZE w = 4
══════════════════════════════════════════════════
203 in binary: 1100 1011 (8 bits)
DOUBLE-AND-ADD (bit by bit):
Process one bit at a time.
8 doublings + 5 additions = 13 operations.
WINDOWED (4-bit chunks):
Split into two 4-bit chunks:
High chunk: 1100 = 12 (decimal)
Low chunk: 1011 = 11 (decimal)
203 = 12 × 2⁴ + 11 = 12 × 16 + 11
So: 203 × P = 12 × 16 × P + 11 × P
= 12 × (16P) + 11 × P
If we precompute tables of multiples, we just need
two lookups and one addition!
The key insight: instead of processing 203 bit by bit (13 operations), we split it into two 4-bit chunks and look up the answer for each chunk in a precomputed table. But we need the tables to account for the position of each chunk — the high chunk represents 12 × 16, not just 12. So we build position-shifted tables:
And here's why this is practical: P is the generator point G from the secp256k1 specification — a fixed, publicly known constant. It's the same for every Bitcoin user, published as part of the standard. So all these table entries (multiples of G, multiples of 16G, etc.) can be computed classically in advance — they're just repeated point additions on known values. The resulting tables are then loaded into the quantum circuit as QROM (Quantum Read-Only Memory), so the quantum computer can look up entries in superposition during Shor's algorithm.
POSITION-SHIFTED LOOKUP TABLES:
═══════════════════════════════
TABLE 0 (for the LOW chunk, position 0):
Stores multiples of P (the base point).
Table0[0] = ∞ (identity)
Table0[1] = 1 × P
Table0[2] = 2 × P
...
Table0[11] = 11 × P ← we need this one
...
Table0[15] = 15 × P
TABLE 1 (for the HIGH chunk, position 1):
Stores multiples of 2⁴ × P = 16P.
Table1[0] = ∞
Table1[1] = 1 × 16P = 16P
Table1[2] = 2 × 16P = 32P
...
Table1[12] = 12 × 16P = 192P ← we need this one
...
Table1[15] = 15 × 16P = 240P
Each table has 2⁴ = 16 entries (one for each possible
4-bit value). The tables are precomputed once.
THE COMPUTATION:
Look up low chunk (11) in Table 0: → 11P
Look up high chunk (12) in Table 1: → 192P
Add them: 11P + 192P = 203P ✓
Operations: 2 table lookups + 1 point addition.
Compare to double-and-add: 13 operations.
That's a 6.5× reduction!
Now scale this up to 256 bits with window size w = 16:
SCALING TO 256 BITS (window size w = 16):
═════════════════════════════════════════
Split k (256 bits) into 256/16 = 16 chunks of 16 bits each.
Build 16 position-shifted tables:
Table 0: multiples of P (2¹⁶ = 65,536 entries)
Table 1: multiples of 2¹⁶ × P (65,536 entries)
Table 2: multiples of 2³² × P (65,536 entries)
...
Table 15: multiples of 2²⁴⁰ × P (65,536 entries)
The computation:
Look up each chunk in its corresponding table → 16 lookups
Add all 16 results together → 15 additions
Total: 16 lookups + 15 point additions.
Compare to double-and-add: ~384 operations.
That's a ~25× reduction in point additions!
The catch: in a quantum circuit, a "table lookup" isn't free. It's a quantum operation called QROM (Quantum Read-Only Memory) that must be implemented in gates. But QROM is much cheaper than a full point addition, so the tradeoff is worth it. The total number of expensive point additions drops from ~384 to ~15.
THE PAPER'S OPTIMIZED COUNT:
════════════════════════════
Window size: w = 16 (each chunk is 16 bits wide)
Number of windows: 256 / 16 = 16 windows
(coincidentally the same number!)
Per scalar multiplication:
Point additions: ~15 (one per window, minus one)
QROM table lookups: 16 (one per window — cheaper than
point additions, but not free)
Shor's algorithm for ECDLP requires TWO interleaved scalar
multiplications (one for each register — see Section 3.1).
Total quantum point additions: 28 (paper's stated number)
Our estimate: 15 per multiplication × 2 = 30 (close but
not exact — the precise count depends on circuit details).
Total QROM lookups: ~32
(16 per multiplication × 2)
The 28 point additions are the expensive operations.
The 32 QROM lookups are cheaper per-operation but still
contribute to the total gate count.
Compare to naive: ~384 point additions → 28 additions + 32 lookups.
The most expensive operations are reduced by ~14×.
The paper states 28 total quantum point additions for the low-qubit circuit variant. Our rough estimate (15 per multiplication × 2 = 30) is close but not exact — the precise count depends on circuit-level details the paper does not fully publish.
Each quantum point addition on 256-bit numbers is a substantial circuit. To see why, let's trace what a single modular multiplication looks like in quantum gates, starting small.
In a quantum circuit, the basic "expensive" gate is the Toffoli gate — a 3-qubit gate that flips the third qubit if and only if the first two are both 1. When the third qubit starts at 0, this computes a AND b — but unlike a classical AND gate, the Toffoli is reversible (no information is destroyed). This reversibility is required for quantum computation, and it's precisely what makes the gate expensive to implement with error correction. (CNOT gates, which implement XOR, are essentially free in the surface code — more on that in Part 2 (coming soon).)
A 2-BIT MODULAR MULTIPLICATION IN QUANTUM GATES:
═════════════════════════════════════════════════
Multiply a = a₁a₀ by b = b₁b₀ (both 2-bit numbers).
Long multiplication, same as you learned in school:
a₁ a₀
× b₁ b₀
───────────
a₁∧b₀ a₀∧b₀ ← a × b₀
a₁∧b₁ a₀∧b₁ 0 ← a × b₁, shifted left
+ ────────────────────
c₃ c₂ c₁ c₀ ← 4-bit result
∧ means AND — single-bit multiplication.
In a quantum circuit, each AND is one Toffoli gate.
Step 1 — Partial products:
4 AND operations: a₁∧b₀, a₀∧b₀, a₁∧b₁, a₀∧b₁
= 4 Toffoli gates
Step 2 — Adding the two rows:
c₀ = a₀∧b₀ (no addition needed)
c₁ = (a₁∧b₀) XOR (a₀∧b₁) (XOR = CNOT = free)
carry = (a₁∧b₀) AND (a₀∧b₁) (1 Toffoli for the carry)
c₂ = (a₁∧b₁) XOR carry (free)
c₃ = (a₁∧b₁) AND carry (1 Toffoli)
= 2 Toffoli gates for carries
Step 3 — Modular reduction (if result ≥ p, subtract p):
Compare 4-bit result to p: ~1 Toffoli
Conditionally subtract p: ~1 Toffoli
= ~2 Toffoli gates
TOTAL: 4 + 2 + 2 = 8 Toffoli gates for 2-bit × 2-bit mod p.
For 2 bits, we can count every gate. For larger bit widths, the exact count depends on the specific adder architecture (ripple-carry, carry-lookahead, etc.), but the scaling is clear: the partial products alone cost n² Toffoli gates (each bit of a ANDed with each bit of b), and the carry propagation adds a similar amount. The total grows as O(n²) — quadratic in the bit width:
SCALING TO 256 BITS:
════════════════════
The dominant cost is O(n²) from partial products + carries.
2-bit: ~8 Toffoli (counted exactly above)
256-bit: ~130,000 Toffoli (from the literature — the exact
count depends on the multiplication algorithm used;
schoolbook gives ~2n² ≈ 131,000, but optimized
methods like Karatsuba can reduce this somewhat)
The key point: one 256-bit modular multiplication costs
roughly 100,000+ Toffoli gates. And a single point addition
needs MANY of these multiplications.
And a single point addition from Section 2.2 requires multiple modular multiplications:
OPERATIONS IN ONE POINT ADDITION:
═════════════════════════════════
From the formulas in Section 2.2:
Slope: s = (y₂-y₁) × (x₂-x₁)⁻¹ → 1 inversion + 1 multiplication
x_R: s² - x₁ - x₂ → 1 squaring (= multiplication)
y_R: s×(x₁ - x_R) - y₁ → 1 multiplication
Subtotal: 3 multiplications + 1 inversion.
The inversion (computing (x₂-x₁)⁻¹ mod p) is the expensive part.
In Section 2.2, we showed that a⁻¹ mod p = a^(p-2) mod p
(Fermat's little theorem). So computing an inverse means
computing a^(p-2) — raising a number to a huge power.
How do you compute a^(p-2) efficiently? The same trick as
double-and-add from Section 2.3, but for exponentiation:
instead of "double the point, add P," it's "square the
number, multiply by a."
Small example: compute a^13 (13 in binary = 1101)
Start: result = 1
Bit 1: square → 1, multiply by a → a (= a¹)
Bit 1: square → a², multiply by a → a³ (= a³)
Bit 0: square → a⁶, skip (= a⁶)
Bit 1: square → a¹², multiply by a → a¹³ (= a¹³) ✓
4 squarings + 3 multiplications = 7 operations.
Naive (multiply a × a × a... 12 times) = 12 operations.
p-2 is a 256-bit number, so:
Squarings: 256 (one per bit of the exponent)
Multiplications: ~128 (one per '1' bit, roughly half)
Total for one inversion: ~384 modular multiplications
Now add up everything for one point addition:
Slope (s): 1 inversion (384 mults) + 1 multiplication = 385
x_R: 1 squaring (= 1 multiplication) = 1
y_R: 1 multiplication = 1
───
Total: ~387 modular multiplications per point addition
(We round to ~390 to account for a few extra operations
in coordinate handling.)
At ~130,000 Toffoli gates per multiplication:
Naive cost: 390 × 130,000 ≈ 51 million Toffoli gates per addition!
The exact combination of these (and likely other) techniques is the paper's unpublished contribution. The paper states the total circuit cost: ~90 million Toffoli gates on ~1,200 logical qubits. It does not publish a per-addition breakdown, so we cannot verify exactly how the 90M is distributed across the 28 additions and overhead operations. What we can say is that 90M gates across 28 additions averages to ~3.2M per addition — dramatically less than our naive ~50M estimate, confirming that substantial optimizations are at work.
COST PER QUANTUM POINT ADDITION (paper's numbers):
═══════════════════════════════════════════════════
Naive estimate: ~50 million Toffoli gates
Paper's total for all 28: ~90 million Toffoli gates
Average per addition: ~90M / 28 ≈ 3.2 million
(this average includes all overhead — QROM lookups,
coordinate conversions, etc. — not just the point
addition arithmetic itself)
Improvement over naive: ~15× (from ~50M to ~3.2M per addition)
Why only count Toffoli gates?
Clifford gates (H, S, CNOT) are essentially free compared to Toffoli in the surface code.
They can be implemented transversally — directly on the
encoded qubits with negligible overhead.
Non-Clifford gates (T gates, Toffoli gates) are expensive.
They require "magic states" produced by dedicated hardware.
The Toffoli count is what determines the physical cost.
LOGICAL QUBIT COUNT: WHERE ~1,200 COMES FROM
═════════════════════════════════════════════
Logical qubits = how many quantum bits are in use simultaneously.
Think of it as the "width" of the circuit — how many wires
are active at the same time.
The main components we can identify:
1. INPUT REGISTERS (for Shor's algorithm):
Two 256-bit quantum registers for the (a, b) values
in f(a,b) = a·P + b·Q (from Section 3.1).
These are in SUPERPOSITION — all 2²⁵⁶ values at once.
= 2 × 256 = 512 qubits
2. WORKING REGISTERS (for the arithmetic):
The circuit computes a·P + b·Q, which involves point
additions and modular multiplications. The intermediate
point coordinates (x, y — each 256 bits) must be stored
in quantum registers because they're entangled with the
superposed (a, b) inputs.
= 2 × 256 = 512 qubits
Note: P and Q themselves are NOT quantum. They're fixed
classical values (P = generator, Q = public key) loaded
into the circuit as constants. Only the values that DEPEND
on the superposed inputs need quantum registers.
3. SCRATCH SPACE + ANCILLA QUBITS:
Modular multiplication needs temporary storage for
intermediate values. Also, quantum circuits must be
reversible — intermediate values can't be overwritten,
they must be "uncomputed" after use.
= the remaining ~176 qubits (1,200 - 512 - 512)
TOTAL: ~1,200 logical qubits (paper's stated number)
The paper states the total but does not publish the internal
breakdown. The 512 + 512 for input and working registers is
our estimate based on the data widths; the remainder (~176)
covers scratch space, ancilla, and any other overhead.
(The low-gate variant uses ~1,450 logical qubits and ~70 million
Toffoli gates — more qubits but fewer gates. The tradeoff:
extra scratch space allows the circuit to avoid recomputing
intermediate values, reducing the gate count at the cost of
more qubits.)
Let's put together what we know about the total gate count:
TOTAL TOFFOLI GATE COUNT:
═════════════════════════
POINT ADDITIONS (the dominant cost):
From Section 3.2: the windowed approach reduces the number
of point additions from ~384 to 28 (the paper's stated count
for the low-qubit circuit variant).
From Section 3.3: a naive point addition costs ~50 million
Toffoli gates. The paper's optimized circuit reduces this
to an average of ~3.2 million per addition (90M total / 28
additions). The specific optimizations are not published.
THE PAPER'S STATED TOTAL:
~90,000,000 Toffoli gates on ~1,200 logical qubits.
This 90M includes everything: the 28 point additions,
the QROM table lookups, coordinate conversions, and all
modular arithmetic. The paper does not publish a breakdown
of how the 90M is distributed across these components.
90 million Toffoli gates on ~1,200 logical qubits. These two numbers together define the quantum circuit. Let's connect them.
The ~1,200 qubits are the width of the circuit — how many quantum wires exist simultaneously. We estimated the breakdown in Section 3.3: ~512 qubits for the two input registers (a and b in superposition), ~512 for the working point coordinates, and the remainder for scratch space and reversibility overhead. These 1,200 qubits persist throughout the entire computation.
The 90 million Toffoli gates are the depth — how many operations are applied to those wires. Each Toffoli gate picks 3 of the 1,200 qubits to operate on, then the next gate picks a (possibly different) 3, and so on, 90 million times. The qubits are reused across gates, not consumed. Think of it like a factory with 1,200 workstations: each step involves 3 workstations, but the factory itself doesn't grow — the same workstations are used over and over.
The challenge isn't the number of operations — a classical computer could do 90 million multiplications in a fraction of a second. The challenge is that all 1,200 qubits must maintain their quantum state coherently across all 90 million steps, without any qubit losing its information to noise. A single error at step 50 million could corrupt the final answer. That's why we need error correction — the subject of Part 2.
Let's take stock. Starting from the raw curve equation, we traced an unbroken chain to the headline number:
| Step | What we showed | Key number |
|---|---|---|
| The curve | y² = x³ + 7 over a finite field (mod p) | p is 256 bits |
| Point addition | Geometric rule → algebraic formulas, verified by hand | (1,10) + (10,8) = (20,7) |
| Scalar multiplication | Double-and-add reduces k additions to log₂(k) | ~384 naive operations |
| The trapdoor | Forward is fast, reverse is impossible classically | ~1ms vs ~10²² years |
| Bitcoin & ECDSA | Private key k, public key Q = k · G, 10-min window | ~6.9M BTC exposed |
| Shor's algorithm | Period-finding on f(a,b) = a·P + b·Q | Two 256-bit registers |
| Windowed arithmetic | Process k in 16-bit chunks with lookup tables | 384 → 28 additions |
| Gate cost | ~130K Toffoli per multiply, ~390 multiplies per addition | ~50M naive per addition |
| Paper's result | Optimized circuit (details unpublished) | ~90M Toffoli total, ~1,200 logical qubits |
We now know what the quantum computer needs to do (90 million Toffoli gates on 1,200 logical qubits) but not yet how much hardware that requires in the real world, or how long it takes.
In Part 2 (coming soon), we derive the physical qubit count (why 1,200 logical qubits become 500,000 physical), the 9-minute runtime, why the attack succeeds only ~41% of the time (it must finish before the next Bitcoin block is mined — a race against a random Poisson clock), the zero-knowledge proof mechanism, the hardware reality check, and the post-quantum migration path.
Part 2: From 90 Million Gates to 9 Minutes — coming soon
@misc{fofadiya2026ecdlp,
author = {Darshan Fofadiya},
title = {How 500,000 Qubits Could Break Bitcoin (Part 1): Elliptic Curves and the Quantum Attack},
year = {2026},
url = {https://darshanfofadiya.com/research-papers/google-ecdlp/}
}
Get notified when I publish new deep dives
Illustrated guides on LLM inference, quantum computing, and AI research papers.
⚠ After subscribing, check your inbox for a confirmation email. You won't receive posts until you confirm.