How 500,000 Qubits Could Break Bitcoin

Part 1: Elliptic curves, Shor's algorithm, and 90 million Toffoli gates

By Darshan Fofadiya · April 2026

1. The Headline: 9 Minutes to Break Bitcoin
1.1 What Google Published
1.2 What's at Stake
1.3 What This Article Derives
2. Elliptic Curve Cryptography
2.1 What Is an Elliptic Curve?
2.2 Point Addition
2.3 Scalar Multiplication — The Trapdoor
2.4 How Bitcoin Uses This
3. Shor's Algorithm for ECDLP
3.1 How Shor's Algorithm Works
3.2 Windowed Arithmetic
3.3 Cost Per Point Addition
3.4 The Full Circuit

1. The Headline: 9 Minutes to Break Bitcoin

On March 30, 2026, five researchers from Google Quantum AI, UC Berkeley, the Ethereum Foundation, and Stanford — Ryan Babbush and Craig Gidney at Google Quantum AI, alongside Thiago Bergamaschi (UC Berkeley), Justin Drake (Ethereum Foundation), and Dan Boneh (Stanford) — published a paper with a simple title and an extraordinary claim: a quantum computer with roughly 500,000 physical qubits could break 256-bit elliptic curve cryptography — specifically, solve the Elliptic Curve Discrete Logarithm Problem (ECDLP) — in about 9 minutes.

That number matters because of what 256-bit elliptic curve cryptography protects. Every Bitcoin transaction is signed with it. Every Ethereum wallet depends on it. When your browser shows a padlock icon, the key exchange that secures that connection almost certainly uses it. SSH, Signal, WhatsApp, FIDO2 security keys — all of them rely on the same mathematical problem being hard to solve.

The paper doesn't announce a working attack. No quantum computer today has anywhere near 500,000 qubits — the largest machines have around 1,200 physical qubits (coincidentally the same number as the paper's logical qubit count, but these are noisy physical qubits — not the error-corrected logical qubits the paper requires). But previous estimates for breaking this cryptography required millions of qubits and hours of runtime. This paper brings the requirement down to half a million qubits and single-digit minutes. That's a qualitative shift — from "far future" to "possibly this decade."

In this article, we'll trace the paper's claims as far as we can from first principles — building the cryptography from scratch, deriving the naive gate costs, and identifying exactly where the paper's unpublished optimizations close the gap. By the end, you'll understand the chain from elliptic curves to 90 million Toffoli gates, and what remains for Part 2: physical qubits, runtime, and the 41% attack probability.

1.1 What Google Published

The paper's full title is "Securing Elliptic Curve Cryptocurrencies against Quantum Vulnerabilities: Resource Estimates and Mitigations." The authors span Google Quantum AI, UC Berkeley, the Ethereum Foundation, and Stanford — a collaboration between quantum hardware researchers and leading cryptographers. Here's what they claim:

THE GOOGLE ECDLP PAPER — KEY NUMBERS:
══════════════════════════════════════
Target:              256-bit elliptic curve (secp256k1, P-256)
Algorithm:           Shor's algorithm, adapted for elliptic curves
Logical qubits:      ~1,200
Physical qubits:     ~500,000 (using error correction)
Toffoli gates:       ~90 million
Runtime:             ~9 minutes (superconducting qubits)

We'll trace each of these numbers as far as we can. But first, let's understand what they mean at a high level.

The target is the specific mathematical problem that protects Bitcoin and most of the internet's cryptography. The algorithm is a quantum algorithm discovered by Peter Shor in 1994, adapted here for elliptic curves. Logical qubits are perfect, error-free qubits — the idealized units of quantum computation. Physical qubits are the real, noisy hardware qubits needed to implement those logical qubits using error correction. The gap between 1,200 and 500,000 is the cost of making imperfect hardware behave perfectly — and understanding that gap is a major part of this article.

Toffoli gates are the expensive quantum operations in the circuit. T-state factories are dedicated hardware modules that produce the resources those gates consume. And the runtime is how long the whole computation takes on a fast superconducting quantum computer.

This is a resource estimate, not a demonstration. No quantum computer was used. The authors verified their circuit using a classical simulation combined with a zero-knowledge proof — a cryptographic technique that proves the circuit works without revealing how it works. We'll explain that mechanism in Part 2, Section 6 (coming soon).

1.2 What's at Stake

The cryptography this paper targets isn't obscure. It's the most widely deployed public-key cryptography in the world.

System	Curve	What it protects
Bitcoin	secp256k1	Transaction signatures ($600B+ market cap)
Ethereum	secp256k1	Wallet signatures ($200B+ market cap)
TLS 1.3 (HTTPS)	P-256 / X25519	Every secure web connection
SSH	Ed25519	Remote server access
Signal / WhatsApp	X25519	End-to-end encrypted messages
FIDO2 / WebAuthn	P-256	Hardware security keys

All of these curves are 256-bit. All of them are vulnerable to Shor's algorithm for ECDLP. The paper's specific 500,000-qubit estimate targets secp256k1, but the same approach applies to other 256-bit curves with similar resource requirements.

The most concrete threat is to Bitcoin. When you send a Bitcoin transaction, your public key is broadcast to the entire network. It sits in the mempool — the waiting area for unconfirmed transactions — for roughly 10 minutes until a miner includes it in a block. During those 10 minutes, anyone who can see the public key and solve the underlying math problem can steal the funds.

But there's a subtler, larger threat. Any Bitcoin address that has ever sent a transaction already has its public key permanently recorded on the blockchain. An attacker doesn't need to race the 10-minute clock for those addresses — they can work on them at leisure, with no time pressure at all.

BITCOIN EXPOSURE:
═════════════════
Addresses that have NEVER sent a transaction:
  Public key = Hash(actual public key)
  Attacker sees only the hash → cannot extract the public key
  SAFE (until you spend from the address)

Addresses that HAVE sent at least one transaction:
  Public key is on the blockchain, permanently, in the clear
  Attacker has unlimited time to compute the private key
  Estimated: ~6.9 million BTC in such addresses
  (per on-chain analyses as of early 2026 — this number
   shifts as coins move between address types)
  At current prices: hundreds of billions of dollars at risk, with NO time pressure

Ethereum faces a similar exposure. Unlike Bitcoin's UTXO model where each address can theoretically be used once, Ethereum uses an account model — the same address (and therefore the same public key) is reused for every transaction. Once you've sent a single transaction from an Ethereum account, your public key is permanently exposed.

And then there's the "harvest now, decrypt later" problem. Even for systems like TLS where keys are ephemeral (used once and discarded), a well-resourced adversary can record encrypted traffic today and decrypt it years later when a quantum computer becomes available. If your data needs to stay secret for more than a decade — diplomatic communications, medical records, trade secrets — the quantum threat is already relevant.

This paper doesn't announce a working attack. No quantum computer today has 500,000 qubits. But it moves the goalpost from "millions of qubits, decades away" to "half a million qubits, possibly this decade." That's a qualitative shift in urgency.

1.3 What This Article Derives

Here's the plan. We'll work through the paper's claims in the order you'd naturally ask about them:

What exactly is being broken? We'll build elliptic curve cryptography from scratch — the curve equation, point addition, scalar multiplication, and why it's a one-way function. You'll compute a point addition by hand. (Section 2)
How does the quantum attack work? We'll trace how 256-bit scalar multiplication becomes 28 windowed point additions, totaling ~90 million Toffoli gates. (Section 3)
Where do 500,000 physical qubits come from? We'll derive the surface code error correction overhead — why 1,200 logical qubits become 500,000 physical qubits. (Part 2, Section 4 (coming soon))
Why 9 minutes? We'll trace the runtime from sequential circuit depth through code cycle times to wall-clock minutes. The attack must complete before the next Bitcoin block is mined (~10 minutes), giving a ~41% success probability per attempt — we'll derive this from the Poisson statistics of block mining. (Part 2, Section 5 (coming soon))
How do we know the circuit works? The paper uses a zero-knowledge proof. We'll explain the mechanism from scratch. (Part 2, Section 6 (coming soon))
When could this actually happen? Current hardware, Google's Willow chip, and the scaling trajectory. (Part 2, Section 7 (coming soon))
What do we do about it? Post-quantum standards, Bitcoin's migration problem, and what's already happening. (Part 2, Section 8 (coming soon))

One thing we will not do: derive Shor's algorithm. Shor's is the quantum algorithm that makes this attack possible, and it deserves its own deep treatment. We're covering it in our Zero to Quantum series. In this article, we treat Shor's as a black box: it needs a certain number of elliptic curve operations performed in quantum superposition, and we trace everything from that point forward.

With that contract established, let's start with the cryptography itself.

2. Elliptic Curve Cryptography — What We're Breaking

Everything in this section is classical mathematics — no quantum computing, no physics. If you can follow arithmetic with a calculator, you can follow this. We're building the lock before we show how to pick it.

Notation: This article uses two kinds of multiplication that look similar but are completely different. To keep them straight:
• × (times sign) — ordinary multiplication of numbers. Example: 7 × 8 = 56. This includes modular arithmetic like 7 × 8 mod 23 = 10. Easily reversible.
• · (dot) — elliptic curve scalar multiplication, where a number "multiplies" a point on the curve. Example: k · P means "add P to itself k times using the geometric rule." This is the hard one-way operation that cryptography relies on.
When you see ×, think "calculator math." When you see ·, think "jumping along the curve."

2.1 What Is an Elliptic Curve?

An elliptic curve is defined by an equation of the form:

y² = x³ + ax + b

If you've seen the equation for an ellipse before — x²/a² + y²/b² = 1 — this looks different, and it is. Despite the name, elliptic curves are not ellipses. The name is historical, coming from the study of elliptic integrals (which arise when computing the arc length of an ellipse). The curve itself is a cubic — the x³ term gives it a distinctive shape with a smooth loop or a single swooping branch, quite unlike the closed oval of an ellipse.

For different values of a and b, you get different curves. Bitcoin uses a curve called secp256k1, which has a = 0 and b = 7:

BITCOIN'S CURVE (secp256k1):
════════════════════════════
y² = x³ + 7

If you plot this over the real numbers, you get a smooth curve — symmetric about the x-axis (because if y satisfies the equation, so does -y). It has one connected branch that extends infinitely to the right and curves through the y-axis.

But cryptography doesn't use real numbers. Real numbers have infinite precision, which is impractical for computers. Instead, we work in a finite field — we do all arithmetic modulo a large prime number p. This means every calculation wraps around: if a result exceeds p, we take the remainder after dividing by p.

MODULAR ARITHMETIC — THE CLOCK ANALOGY:
═══════════════════════════════════════
Regular arithmetic:  15 + 20 = 35
Modular (mod 23):    15 + 20 = 35 mod 23 = 12

It's like a clock with 23 hours instead of 12.
When you go past 23, you wrap around to the beginning.

Regular arithmetic:  7 × 8 = 56
Modular (mod 23):    7 × 8 = 56 mod 23 = 10

Every operation — addition, subtraction, multiplication —
wraps around. The result is always between 0 and 22.

A natural question: does this wrapping break anything? In regular arithmetic, addition has certain properties we rely on — you can add in any order (commutativity), group additions however you like (associativity), adding zero changes nothing (identity), and every number has a negative (inverse). Modular arithmetic preserves all of these:

WHY MODULAR ARITHMETIC WORKS AS A "NUMBER SYSTEM":
═══════════════════════════════════════════════════
All properties of regular addition still hold (mod 23):

Commutativity:  15 + 20 = 20 + 15 = 12 (mod 23)           ✓
Associativity:  (5 + 8) + 10 = 5 + (8 + 10) = 0 (mod 23)  ✓
Identity:       x + 0 = x for any x                         ✓
Inverse:        15 + 8 = 23 mod 23 = 0  (8 is the "negative" of 15)  ✓
                In mod 23, there are only 23 numbers: 0 through 22.
                -15 doesn't exist separately — it IS 8 (because -15 + 23 = 8).
                31 is also 8 (because 31 - 23 = 8). They're all the same number.
                Every number has exactly one additive inverse, just like in
                regular arithmetic — it's just that the inverse of 15 is
                written as 8 instead of -15.

Same for multiplication:
Commutativity:  3 × 7 = 7 × 3 = 21 mod 23 = 21            ✓
Associativity:  (2 × 3) × 4 = 2 × (3 × 4) = 24 mod 23 = 1  ✓
Identity:       x × 1 = x for any x                         ✓
Inverse:        Every non-zero number has a multiplicative inverse
                (we'll use this in point addition)            ✓

Mathematicians call this a "field" — a number system where
addition, subtraction, multiplication, and division all work.
The integers mod a prime p form a finite field with p elements.

This is important but subtle: the finite field is the number system we use to do arithmetic. It's the toolkit — it tells us that addition, multiplication, and division of numbers mod p all behave correctly. We need this toolkit because the elliptic curve formulas involve all three operations.

But the finite field is not the cryptographic structure itself. The cryptography is built on a different group — the set of points on the curve, with a completely different "addition" operation (a geometric rule we'll define in Section 2.2). The finite field is the language the formulas are written in; the curve points are the objects we actually work with. Think of it this way: the finite field is like the rules of arithmetic you learned in school, and the elliptic curve group is a game built using those rules.

So Bitcoin's curve is actually:

y² ≡ x³ + 7  (mod p)

where p = 2²⁵⁶ - 2³² - 977

That's a 256-bit prime number:
p = 115792089237316195423570985008687907853
    269984665640564039457584007908834671663

Over the real numbers, the curve is a smooth line. Over a finite field, it becomes a scatter of discrete points — every (x, y) pair where x and y are integers between 0 and p-1 and the equation holds. The animation above shows this transition: the smooth curve dissolves into a cloud of seemingly random dots. There are roughly p such points (about 2²⁵⁶).

The set of all these points, together with a rule for "adding" two points, forms a group — a second group, distinct from the finite field of numbers. The finite field is the toolkit for arithmetic. This new group of curve points is the structure that cryptography is built on.

The word "addition" is overloaded. Elliptic curve point addition doesn't mean adding coordinates, and it doesn't mean modular addition of numbers. It's a custom operation defined by a geometric rule. We call it "addition" because it satisfies all the same structural properties (commutativity, associativity, identity, inverses). Let's see how it works.

2.2 Point Addition — The Core Operation

The rule for "adding" two points on an elliptic curve is geometric. Given two points P and Q on the curve, draw a straight line through them. That line will intersect the curve at exactly one other point. Reflect that point across the x-axis. The result is P + Q.

In the animation, notice how the three intersection points (P, Q, and R') are spread across the curve — P on the upper-left branch, Q on the lower-middle, and R' on the lower-right. The line cuts through the curve at exactly three places.

Why exactly three? Because of the algebra. The curve equation is y² = x³ + 7, and a line is y = mx + c. Substituting the line into the curve gives (mx + c)² = x³ + 7, which rearranges to x³ - m²x² - 2mcx - c² + 7 = 0. That's a cubic equation in x — degree 3. Since we started with two real points P and Q on the curve, two of the three roots are real, and for a cubic with real coefficients that forces the third root to be real too (complex roots always come in pairs). So the third intersection point always exists. We don't even need to solve the cubic: Vieta's formulas tell us the sum of all three roots equals m² (the coefficient of x²), so the third root is simply x_R' = m² - x_P - x_Q. That's exactly the formula we used in the worked example.

Now, why does this strange geometric rule deserve to be called "addition"? We claimed it satisfies commutativity (P + Q = Q + P), associativity ((P + Q) + R = P + (Q + R)), has an identity (∞), and every point has an inverse. Let's check each one:

Commutativity (P + Q = Q + P): The line through P and Q is the same line as through Q and P. Same line, same third intersection, same reflection. So the result is identical regardless of order. ✓
Identity (P + ∞ = P): The point at infinity ∞ is defined so that adding it to any point P gives back P, just like adding 0 to any number gives back that number. Geometrically, you can think of ∞ as a point "infinitely far up" on the y-axis — the line from P to ∞ is vertical, and a vertical line through P hits the curve again at -P (the mirror image). Reflecting -P gives P. So P + ∞ = P. ✓
Inverse (P + (-P) = ∞): Every point P = (x, y) has a mirror image -P = (x, -y) directly below it (or above, if P is on the lower branch). The line through P and -P is vertical — it goes straight up and down and never hits the curve at a third point. By convention, this "missing" third intersection is ∞. So P + (-P) = ∞, just like 5 + (-5) = 0. ✓
Associativity ((P + Q) + R = P + (Q + R)): This is the hard one. It's not at all obvious from the geometry — you'd have to draw two different sequences of lines and show they end at the same point. Proving it rigorously requires algebraic geometry beyond our scope. It was established in the 19th century, and we'll take it on faith. ✓

Geometry vs algebra — an important subtlety. The animation above shows a smooth curve with a visible line. But in Section 2.1, we saw that the mod-p curve is a scatter of dots — there's no smooth curve to draw lines on. So how does point addition work on the scatter?

The answer: we use exactly the same formulas. The slope formula, the x-coordinate formula, the y-coordinate formula — all identical whether you're working with real numbers or mod-p numbers. The only difference is that every +, -, ×, and ÷ wraps around mod p.

Why does this work? Because the formulas were derived from the geometry (line through two points → third intersection → reflect), but the derivation only uses addition, subtraction, multiplication, and division. It never uses anything specific to real numbers like "distance," "angle," or "continuity." Since mod-p arithmetic supports all four operations (it's a field, as we showed in Section 2.1), the formulas produce valid results — and the result is guaranteed to be another point on the curve. We'll verify this in the worked example below: (20, 7) satisfies y² = x³ + 7 (mod 23).

The geometric picture gives us intuition for why the formulas look the way they do. The algebra is what actually runs — both in your Bitcoin wallet and in the quantum circuit that tries to break it.

Let's make this concrete with actual numbers. We'll use a tiny curve — y² = x³ + 7 (mod 23) — so we can verify every step by hand. The real Bitcoin curve uses a 256-bit prime, but the arithmetic is identical.

First, we need valid points. A point (x, y) is on the curve if y² mod 23 equals x³ + 7 mod 23. Let's find some:

FINDING POINTS ON y² = x³ + 7 (mod 23):
════════════════════════════════════════
Squares mod 23: {0, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18}
(These are the values y² can take. If x³+7 mod 23 isn't
 in this set, there's no point at that x.)

x = 1:  1³ + 7 = 8 mod 23 = 8.   8 is a square (10² = 100 mod 23 = 8).
        y = 10 or y = 23-10 = 13.
        Points: (1, 10) and (1, 13) ✓

x = 10: 10³ + 7 = 1007 mod 23 = 18.  18 is a square (8² = 64 mod 23 = 18).
        y = 8 or y = 23-8 = 15.
        Points: (10, 8) and (10, 15) ✓

Now let's add P = (1, 10) and Q = (10, 8). But first, where do the formulas come from? They follow directly from the geometry we just described:

DERIVING THE POINT ADDITION FORMULAS:
═════════════════════════════════════
The line through P = (x_P, y_P) and Q = (x_Q, y_Q) has slope:
  s = (y_Q - y_P) / (x_Q - x_P)

The line equation is:  y = s×(x - x_P) + y_P

Substituting into the curve y² = x³ + 7:
  [s×(x - x_P) + y_P]² = x³ + 7

Expanding and rearranging gives a cubic in x:
  x³ - s²×x² + ... = 0

By Vieta's formulas, the sum of the three roots equals s²:
  x_P + x_Q + x_R' = s²

So the third intersection point has x-coordinate:
  x_R' = s² - x_P - x_Q

After reflecting across the x-axis (negating y):
  x_R = x_R'                        (x doesn't change)
  y_R = s×(x_P - x_R) - y_P        (from the line equation, negated)

Every formula in the worked example below comes from this derivation. Let's trace it with actual numbers:

POINT ADDITION: P = (1, 10) + Q = (10, 8) on y² = x³ + 7 (mod 23)
═══════════════════════════════════════════════════════════════════

STEP 1: Compute the slope of the line through P and Q.

  s = (y_Q - y_P) × (x_Q - x_P)⁻¹  mod 23

  Numerator:   y_Q - y_P = 8 - 10 = -2 mod 23 = 21
  Denominator: x_Q - x_P = 10 - 1 = 9

  We need 9⁻¹ mod 23 — the modular inverse of 9.
  This is the number that, multiplied by 9, gives 1 mod 23.

    9 × 18 = 162 = 7 × 23 + 1 = 161 + 1  →  162 mod 23 = 1  ✓

  So 9⁻¹ mod 23 = 18.

  Slope: s = 21 × 18 mod 23 = 378 mod 23 = 10
         (378 = 16 × 23 + 10)

STEP 2: Compute x_R (the x-coordinate of the result).

  x_R = s² - x_P - x_Q  mod 23
  x_R = 100 - 1 - 10  mod 23
  x_R = 89 mod 23 = 20
       (89 = 3 × 23 + 20)

STEP 3: Compute y_R (the y-coordinate of the result).

  y_R = s × (x_P - x_R) - y_P  mod 23
  y_R = 10 × (1 - 20) - 10  mod 23
  y_R = 10 × (-19) - 10  mod 23
  y_R = -200 mod 23 = 7
       (-200 + 9 × 23 = -200 + 207 = 7)

RESULT: P + Q = (20, 7)

VERIFY: Is (20, 7) on the curve y² = x³ + 7 (mod 23)?
  Left side:  7² = 49 mod 23 = 3
  Right side: 20³ + 7 = 8007 mod 23 = 3
              (8007 = 348 × 23 + 3)
  3 = 3  ✓  The point is on the curve.

Every step is arithmetic you can check on a calculator. But the modular inverse deserves a closer look, because it's the key operation that makes "division" work in modular arithmetic.

In regular arithmetic, dividing by 9 means multiplying by 1/9. The number 1/9 is the multiplicative inverse of 9 — the number that, when multiplied by 9, gives 1. In modular arithmetic, we need the same thing: a number that, when multiplied by 9, gives 1 mod 23.

MODULAR INVERSE — "DIVISION" IN MOD ARITHMETIC:
════════════════════════════════════════════════
In regular arithmetic:
  9 × (1/9) = 1
  So 1/9 is the inverse of 9.

In mod 23 arithmetic:
  We need x such that 9 × x ≡ 1 (mod 23).
  Try x = 18:  9 × 18 = 162 = 7 × 23 + 1  →  162 mod 23 = 1  ✓
  So 18 is the inverse of 9 (mod 23).

  "Dividing by 9" in mod 23 means "multiplying by 18."

Why does every non-zero number have an inverse mod a prime?
  Because p = 23 is prime. For any a not divisible by p,
  Fermat's little theorem guarantees:
    a^(p-1) ≡ 1 (mod p)
  So:
    a × a^(p-2) ≡ 1 (mod p)
  The inverse of a is a^(p-2) mod p. It always exists.

  For 9 mod 23:  9^21 mod 23 = 18  ✓  (same answer)

This is why we need a prime modulus. Let's see what goes wrong if p isn't prime. Say we try mod 12 instead of mod 23:

WHY THE MODULUS MUST BE PRIME:
══════════════════════════════
Mod 23 (prime): every non-zero number has an inverse.
  9⁻¹ mod 23 = 18   (because 9 × 18 = 162 = 7×23 + 1)  ✓

Mod 12 (not prime, 12 = 2 × 2 × 3): some numbers have NO inverse.
  Try to find 6⁻¹ mod 12 — a number x where 6 × x ≡ 1 (mod 12):
    6 × 1 = 6     6 × 2 = 12 ≡ 0    6 × 3 = 18 ≡ 6
    6 × 4 = 24 ≡ 0    6 × 5 = 30 ≡ 6    6 × 6 = 36 ≡ 0
    ...every result is 0 or 6. Never 1.

  6 has no inverse mod 12. "Dividing by 6" is impossible.

Why? Because 6 and 12 share a common factor (6 = 2×3, 12 = 2²×3).
When a and p share a factor, a × x cycles through a subset of
values mod p and never hits 1. With a prime p, no number 1..p-1
shares a factor with p, so every number has an inverse.

The point addition formula requires dividing by (x_Q - x_P). If the modulus isn't prime, that difference could be a number with no inverse, and the formula would break. A prime modulus guarantees this never happens.

There's one special case: point doubling, where P = Q. You can't draw a line through a single point, so instead you use the tangent line to the curve at P. The slope of the tangent is dy/dx, which we get by implicitly differentiating the curve equation y² = x³ + ax + b:

DERIVING THE DOUBLING SLOPE:
════════════════════════════
Start with:       y² = x³ + ax + b
Differentiate:    2y × (dy/dx) = 3x² + a
Solve for slope:  dy/dx = (3x² + a) / (2y)

For secp256k1 (a = 0):
  dy/dx = 3x² / (2y)

In mod-p arithmetic, this becomes s = (3x² + a) × (2y)⁻¹ mod p, where a is the coefficient from the curve equation y² = x³ + ax + b. For secp256k1, a = 0, so it simplifies to s = 3x² × (2y)⁻¹ mod p. The division by 2y is a modular inverse, just like in regular point addition. The rest of the calculation (x_R, y_R) uses the same formulas as before.

One more case: if P = (x, y), then -P = (x, p - y). Adding P and -P gives the point at infinity — the identity element of the group, analogous to zero in regular addition.

Why this matters: Point addition is the atomic operation of elliptic curve cryptography. Everything — key generation, signing, verification, and the quantum attack — is built from repeated point additions. The formulas above are the same ones running inside every Bitcoin wallet, every TLS handshake, and every quantum circuit that tries to break them.

2.3 Scalar Multiplication — The Trapdoor

Now we get to the operation that makes cryptography possible.

Scalar multiplication means performing the point addition operation from Section 2.2 repeatedly: k · P = P + P + P + … + P (k times). The number k is called the scalar, and it will become the private key. Each "+" here is not ordinary number addition — it's the full elliptic curve point addition (compute a slope via modular inverse, find the new x-coordinate, find the new y-coordinate). Each one involves several modular multiplications on 256-bit numbers. A single point addition is fast (microseconds), but doing it trillions of times is not.

The high-level intuition: imagine you're standing on the curve at point P. Each point addition jumps you to a new point on the curve. After k jumps, you land on some point Q. The curve has no visible pattern — each jump looks random, even though it's completely deterministic. If someone shows you the starting point P and the ending point Q, you can't tell how many jumps it took. That's the trapdoor: jumping forward is easy (just keep adding), but figuring out the number of jumps from the endpoints is astronomically hard.

The naive approach — performing k-1 point additions one after another — would require as many point additions as the value of k itself. How big is k? It's a 256-bit number, meaning it's stored in 256 binary digits. The smallest 256-bit number is 2²⁵⁵ (a 1 followed by 255 zeros in binary), and the largest is 2²⁵⁶ - 1 (256 ones in binary). So k can be as large as ~2²⁵⁶ — a number with 77 decimal digits. Doing 2²⁵⁶ sequential point additions (each one a multi-step modular arithmetic computation) is completely infeasible; even at a billion point additions per second, it would take longer than the age of the universe.

But there's a much faster method called double-and-add. The key insight relies on the group properties we established in Section 2.2: because elliptic curve point addition is associative and commutative, we can regroup additions however we want. Adding P to itself 13 times can be regrouped as:

WHY REGROUPING WORKS:
═════════════════════
13 · P means: P + P + P + P + P + P + P + P + P + P + P + P + P
              (13 copies of P, added using the geometric rule)

Because point addition is associative, we can regroup:
  = (P+P+P+P+P+P+P+P) + (P+P+P+P) + P
  = 8P + 4P + P

This is just like 13 = 8 + 4 + 1 in regular arithmetic.
It works because the GROUP PROPERTIES (associativity,
commutativity) hold for point addition — even though
the "addition" is a geometric operation, not coordinate addition.

And 8P is easy to compute: just double three times.
  P → 2P → 4P → 8P  (3 point doublings)

So: 13P = 8P + 4P + P, computed with 3 doublings + 2 additions
instead of 12 sequential additions.

The double-and-add algorithm automates this. It processes k's binary representation bit by bit. "Doubling" means adding a point to itself (P + P = 2P) using the tangent-line formula from above — this is just point addition where both inputs are the same point. The notation "2 × P" means "P + P" (two point additions), "6 × P" means "P + P + P + P + P + P" (but computed efficiently), and so on.

DOUBLE-AND-ADD: Computing 13 · P
═════════════════════════════════
13 in binary: 1 1 0 1

We process from the most significant bit to the least.
Start with result = ∞ (point at infinity = identity).

Bit 1 (leftmost):
  Double: ∞ + ∞ = ∞       (adding identity to itself gives identity,
                            just like 0 + 0 = 0)
  Bit is 1, so add P: ∞ + P = P   (identity doesn't change P)
  result = P

Bit 1:
  Double: P + P = 2P       (tangent line at P → third intersection → reflect)
  Bit is 1, so add P: 2P + P = 3P
  result = 3P

Bit 0:
  Double: 3P + 3P = 6P     (tangent line at 3P → new point)
  Bit is 0, so skip the add.
  result = 6P

Bit 1 (rightmost):
  Double: 6P + 6P = 12P
  Bit is 1, so add P: 12P + P = 13P
  result = 13P  ✓

Operations: 4 doublings + 3 additions = 7 total.
Naive method would need 12 additions.

Why does this work? Let's trace what the "result" variable holds after each step, and see how it builds up 13:

TRACING THE ACCUMULATOR:
════════════════════════

Step                        Binary    What happened
────                        ──────    ─────────────
Start:                      0         (empty — point at infinity)
Bit 1: double then add  →   1         doubled 0→0, added 1→1
Bit 1: double then add  →   11        doubled 1→10, added 1→11 (= 3)
Bit 0: double only      →   110       doubled 11→110 (= 6), bit is 0 so no add
Bit 1: double then add  →   1101      doubled 110→1100, added 1→1101 (= 13)

The pattern:
  "double" shifts the binary left (appends a 0).
  "add P" changes that trailing 0 to a 1.

  double:  11   → 110    (appended 0)
  add:     110  → 111    ← would happen if bit were 1,
                            but bit was 0 so we skip

  double:  110  → 1100   (appended 0)
  add:     1100 → 1101   (changed trailing 0 to 1)

The algorithm reads k's binary digits left to right,
reconstructing k one bit at a time:
  1 → 11 → 110 → 1101 = 13  ✓

For a 256-bit scalar k, the binary representation has 256 bits. Each bit requires one doubling, and roughly half the bits are 1 (requiring an additional addition). So:

COST OF SCALAR MULTIPLICATION (256-bit k):
══════════════════════════════════════════
Doublings:  256 (one per bit)
Additions:  ~128 (one per '1' bit, roughly half)
Total:      ~384 point operations

Each point operation involves a modular inverse, a couple of
multiplications, a squaring, and several subtractions.
On modern hardware: ~1 millisecond for the entire thing.

The reduction: from 2²⁵⁶ operations (impossible)
               to 384 operations (instant).
That's the power of binary decomposition.

384 operations. That's it. From a 256-bit number and a starting point, we can compute k · P in under a millisecond.

Now here's the trapdoor. Given P and the result Q = k · P, can you find k?

THE ONE-WAY PROPERTY:
═════════════════════
FORWARD (computing Q = k · P):
  Input:  k (256-bit number), P (point on curve)
  Method: double-and-add
  Cost:   ~384 point operations
  Time:   ~1 millisecond

REVERSE (finding k given P and Q):
  Input:  P (point on curve), Q = k · P (point on curve)
  Best classical algorithm: Pollard's rho
  Cost:   ~2¹²⁸ point operations (square root of the group order)

  How long is that?
    2¹²⁸ = 3.4 × 10³⁸ operations
    At 10⁹ operations/second (aggressive estimate):
      3.4 × 10³⁸ / 10⁹ = 3.4 × 10²⁹ seconds
      = 1.08 × 10²² years

    Age of the universe: 1.38 × 10¹⁰ years
    Ratio: ~800 billion times the age of the universe

    Even with a million parallel machines:
      10²² / 10⁶ = 10¹⁶ years
      Still 780,000× the age of the universe.

This asymmetry — forward is trivial, reverse is impossible — is the Elliptic Curve Discrete Logarithm Problem (ECDLP). The best classical attack is Pollard's rho algorithm, which runs in O(√n) time where n is the group order (roughly 2²⁵⁶ for secp256k1). That gives ~2¹²⁸ operations — far beyond what any classical computer can do.

This is the foundation of all elliptic curve cryptography. The private key is k. The public key is Q = k · P. Everyone can see Q and P (they're public). Nobody can find k. The security of Bitcoin, Ethereum, TLS, SSH, and Signal all rest on this single assumption.

Or rather, they rest on this assumption holding for classical computers.

2.4 How Bitcoin Uses This

Let's make this concrete. Here's how Bitcoin turns the ECDLP into a payment system.

Every Bitcoin user generates a key pair:

KEY GENERATION:
═══════════════
1. Pick a random 256-bit number k.
   This is your PRIVATE KEY. Keep it secret.

2. Compute Q = k · G, where G is a fixed "generator point"
   that's part of the secp256k1 specification.
   This is your PUBLIC KEY.

   G is publicly known — it's the same for everyone.
   Q is unique to you (because k is unique to you).

3. Your Bitcoin ADDRESS is a hash of Q:
   address = RIPEMD160(SHA256(Q))

   RIPEMD160 and SHA256 are hash functions — they take any
   input and produce a fixed-size fingerprint (160 bits and
   256 bits respectively). The key property: given the hash,
   you can't recover the original input. So the address hides
   your public key Q behind two layers of hashing.

   This means: even if someone knows your address, they can't
   get your public key Q — until you spend from the address.

When you want to send Bitcoin, you create a digital signature using your private key k. The signature proves you own the funds without revealing k. Anyone can verify the signature using your public key Q — but they can't extract k from Q (because ECDLP is hard).

Let's trace a simplified version of how ECDSA (the Elliptic Curve Digital Signature Algorithm) works. But before diving into the formulas, let's understand what a digital signature is actually trying to do.

The problem: you want to prove to the world that you — the person who knows private key k — approved a specific message ("send 1 BTC to Alice"). The proof must satisfy three constraints: (1) anyone can verify it using only your public key Q, (2) it doesn't reveal k, and (3) it can't be reused for a different message.

The naive approach would be to just reveal k. The verifier checks Q = k · G and is convinced. But now everyone knows your private key — useless.

A better idea: instead of proving you know the discrete log of Q directly, prove you know it indirectly. Pick a random secret number r, compute a new point R = r · G (you know the discrete log of R because you chose r), and then produce a value s that could only be computed by someone who knows both r and k. The verifier can check s using only public information (Q, R, and the message hash), but can't extract k from it because r is unknown to them and discarded after signing.

That's the core idea. The specific formula for s is designed so that the verification equation works out to something computable from public data alone. Let's see how (RareSkills has an excellent full derivation if you want to see how the formula is discovered step by step):

ECDSA SIGNING — STEP 1: HASH THE MESSAGE
═════════════════════════════════════════
You want to sign a message m (e.g., "send 1 BTC to Alice").

  h = SHA256(m)    → a 256-bit number

This binds the signature to this specific message.

ECDSA SIGNING — STEP 2: PICK A RANDOM NONCE
════════════════════════════════════════════
Pick a random nonce r (a fresh random 256-bit number).
   (Standard ECDSA notation uses k for the nonce, but we
    use r to avoid confusion with the private key k.)
Compute R = r · G  (scalar · point = a point on the curve).
Take the x-coordinate of R. Call it R_x.

R acts as a public "commitment" to the nonce r.
Everyone can see R, but they can't find r from R
(that would require solving the ECDLP on r).

Why do we need r when we already have the private key k? Imagine signing without a nonce — just publishing s = h × k mod n. This is regular modular multiplication, NOT elliptic curve point multiplication. And unlike EC point multiplication (where finding k from Q = k·G is the hard ECDLP), modular number multiplication is easily reversible — we showed how in Section 2.2. The attacker knows s (it's public) and h (they can compute SHA256 of the message). So k = s × h⁻¹ mod n. One modular inverse (trivial to compute), and your private key is exposed.

The nonce r fixes this by masking k inside the formula. With r in the mix, the attacker sees s, h, and R_x, but can't solve for k without also knowing r. And r is random, used once, and discarded. "Nonce" literally means "number used once" — and it MUST be different every time. If you reuse a nonce for two different messages, the attacker can eliminate r from the two equations and solve for k. (In 2010, Sony reused a nonce in the PS3's ECDSA implementation. Hackers extracted the private key and broke PS3 code signing permanently.)

ECDSA SIGNING — STEP 3: COMPUTE THE SIGNATURE VALUE
════════════════════════════════════════════════════
  s = r⁻¹ × (h + R_x × k)  mod n

  where n ≈ 2²⁵⁶ is the "order" of the curve (total number
  of points in the group). We work mod n because scalar
  multiplication wraps around after n additions.

Why this specific formula? The goal is to create a value that depends on k and h, can be verified using only public information, and doesn't reveal k. Working backwards from what the verifier needs:

DERIVING THE FORMULA FROM THE VERIFICATION REQUIREMENT:
═══════════════════════════════════════════════════════
The verifier knows h, R_x, s, G, and Q. They need to
reconstruct R = r · G from these public values.

                                  number×number  scalar·point
                                  ─────────────  ────────────
  r · G = s⁻¹×(h + R_x×k) · G          (rearranging s formula)
        = (s⁻¹×h) · G + (s⁻¹×R_x) · (k·G)  (distributing)
        = (s⁻¹×h) · G + (s⁻¹×R_x) · Q       (Q = k·G)

That last line uses only public values (h, s, R_x, G, Q).
The formula for s is designed so this works out.

ECDSA SIGNING — STEP 4: PUBLISH
═══════════════════════════════
The signature is the pair (R_x, s).
Publish (R_x, s) along with the message m.
Your private key k is buried inside s, but extracting
it requires knowing r — which you discard after signing.

ECDSA VERIFICATION (what the network checks):
══════════════════════════════════════════════
Given: message m, signature (R_x, s), public key Q.

1. Compute the hash: h = SHA256(m)

2. Compute:
   u₁ = h × s⁻¹ mod n       (message part)
   u₂ = R_x × s⁻¹ mod n     (commitment part)

3. Compute the verification point:
   V = u₁ · G + u₂ · Q

4. Check: does V's x-coordinate equal R_x?
   If yes → signature is valid.
   If no  → signature is forged or corrupted.

Why does this work? Let's trace the algebra:

WHY VERIFICATION WORKS:
═══════════════════════
  V = u₁ · G + u₂ · Q                              (scalar · point)
    = (h/s) · G + (R_x/s) · Q                       (substituting u₁, u₂)
    = (h/s) · G + (R_x/s) · (k·G)                   (substituting Q = k·G)
    = ((h + R_x×k) / s) · G                          (factoring out G)
    = ((h + R_x×k) / (r⁻¹×(h + R_x×k))) · G        (substituting s)
    = r · G                                           (cancels!)
    = R  ✓

The verification uses only the public key Q — it never needs the private key k. But critically, the verification is done against a specific Q. On the Bitcoin network, each unspent output (each "coin") is locked to a specific address, which is derived from a specific public key. When someone broadcasts a transaction spending those coins, the network checks the signature against the public key that owns those coins, not just any public key.

So if John tries to sign a message "send Bob's coins to Alice" using John's private key, the math would check out against John's public key Q_john — but the network would verify against Bob's public key Q_bob (the one tied to the coins). The pairing equation would fail because the signature was made with the wrong key. It's like forging someone's signature on a check — the bank compares it to the account holder's signature on file, not the forger's.

This is why the quantum attack matters: the attacker needs Bob's specific private key k_bob to sign a valid transaction spending Bob's coins. And to get k_bob, they need to solve Q_bob = k_bob × G — the ECDLP for Bob's public key.

Here's where the vulnerability appears. To verify the signature, the network needs Bob's public key Q. So when Bob broadcasts a transaction, Q is included in the transaction data and becomes visible to everyone on the network.

BITCOIN TRANSACTION LIFECYCLE:
══════════════════════════════
t = 0 seconds:
  You sign a transaction with your private key k.
  The transaction includes your public key Q.
  You broadcast it to the Bitcoin network.

t = 0 to ~600 seconds (0-10 minutes):
  The transaction sits in the MEMPOOL — a public waiting area.
  Every node on the network can see your public key Q.

  *** VULNERABLE WINDOW ***
  Anyone who can solve Q = k · G for k can sign a
  competing transaction stealing your funds.

t ≈ 600 seconds:
  A miner includes your transaction in a block.
  The block is added to the blockchain.
  Your transaction is now CONFIRMED.
  Too late for an attacker — the funds have moved.

The average Bitcoin block time is about 10 minutes. That's the window. If an attacker can solve the ECDLP — compute k from Q — in less than 10 minutes, they can steal the funds before the transaction confirms.

The Google paper claims this can be done in 9 minutes.

We'll derive the exact attack probability in Part 2, Section 5.3 (coming soon). For now, the key point is: the security of every Bitcoin transaction rests on the ECDLP being unsolvable within a 10-minute window. The paper says a quantum computer could solve it in 9.

3. Shor's Algorithm for ECDLP — The Quantum Attack

We now know what we're breaking: the ECDLP, the problem of finding k given P and Q = k · P. Classically, this takes ~2¹²⁸ operations. Quantum mechanically, Shor's algorithm reduces it to something tractable. But how many quantum operations does "tractable" actually mean?

3.1 How Shor's Algorithm Works — The Intuition

We won't derive Shor's algorithm step by step here — that's coming in Zero to Quantum Part 4, where we'll build it from the Quantum Fourier Transform up. But you deserve more than "it's a black box." Let's build a real intuition for why a quantum computer can solve the ECDLP, and what it actually does.

The core idea behind Shor's algorithm is period finding. It turns out that many hard problems in cryptography — integer factoring, discrete logarithms, and the ECDLP — can be reduced to finding a hidden period in a mathematical function. And finding periods is something quantum computers are extraordinarily good at.

Let's build this up with an analogy. Imagine you have a function that takes a number as input and produces a color as output. The function is periodic — the colors repeat in a cycle — but you don't know the period. Classically, you'd have to evaluate the function at many different inputs and look for the repeat. If the period is large, this takes a long time.

A quantum computer does something fundamentally different. It prepares a superposition of all possible inputs and evaluates the function on this superposition in a single pass. But — and this is crucial — the speedup does NOT come from "trying all inputs at once and reading the results." You can't read them all out; measurement would collapse the superposition and give you just one random input-color pair. Instead, the quantum computer applies the Quantum Fourier Transform (QFT), which converts the superposition from the "input domain" to the "frequency domain." The hidden period shows up as a sharp peak in the frequency domain. Measure, and you get the period. The power is in the interference pattern created by the QFT, not in parallel evaluation.

SHOR'S ALGORITHM — THE CORE IDEA:
══════════════════════════════════
1. SUPERPOSITION: Prepare a quantum register in a superposition
   of all possible inputs (0, 1, 2, ..., N-1) simultaneously.

2. EVALUATE: Apply the function f(x) to this superposition.
   The quantum computer computes f(x) for ALL x at once.
   Result: a quantum state encoding all (x, f(x)) pairs.

3. QUANTUM FOURIER TRANSFORM: Apply the QFT to the input register.
   This converts the state from the "position" basis to the
   "frequency" basis. If f has period r, the QFT produces
   sharp peaks at multiples of N/r.

4. MEASURE: Read out the frequency register.
   You get a value close to some multiple of N/r.
   Classical post-processing (continued fractions) extracts r.

The magic is in steps 2-3 together: the quantum computer
evaluates f on a superposition of all inputs, then the QFT
extracts the hidden period through interference. The speedup
doesn't come from "trying all inputs at once" (you can't read
them all out — measurement gives just one random result).
It comes from the QFT converting the periodic structure of
the superposition into a measurable frequency peak.

Now, how does this apply to the ECDLP? We want to find k such that Q = k · P. Here's the connection:

SHOR'S FOR ECDLP — THE REDUCTION:
══════════════════════════════════
Define a function on two variables (a, b):

  f(a, b) = a · P + b · Q

where P is the generator and Q = k · P is the public key.

This function has a hidden structure:
  f(a, b) = f(a + k, b - 1)

because:
  (a + k) × P + (b - 1) × Q
  = a·P + k·P + b·Q - Q
  = a·P + Q + b·Q - Q        (since k·P = Q)
  = a·P + b·Q
  = f(a, b)

So the function is periodic with period (k, -1).
Finding this period reveals k — the private key.

Shor's algorithm:
  1. Superpose all (a, b) pairs
     Each of a and b ranges from 0 to ~2²⁵⁶, so each needs
     a 256-bit quantum register. These two registers are put
     into superposition of all possible values simultaneously.
  2. Evaluate f(a, b) = a·P + b·Q in superposition
  3. Apply QFT to both registers
  4. Measure → extract the period (k, -1) → recover k

Step 2 is where all the cost lives. The quantum computer must compute a·P + b·Q — which is two scalar multiplications and one point addition — for a superposition of all possible (a, b) values. This means performing elliptic curve arithmetic on 256-bit numbers, entirely within a quantum circuit, without any intermediate measurements.

The key word is coherently. Every intermediate value must remain in quantum superposition — no peeking at partial results. The entire scalar multiplication must be compiled into a sequence of quantum gates that runs start to finish without measurement. The question becomes: how many quantum gates does that require?

What Shor's gives us, and what the paper contributes: Shor's algorithm tells us what computation to perform — scalar multiplication in superposition, followed by a QFT. The QFT itself is cheap (polynomial in the number of qubits). The expensive part is the scalar multiplication. The paper's contribution is figuring out how to perform that multiplication efficiently — the specific circuit, the gate count, the qubit layout. Everything from here forward is circuit engineering, not quantum algorithm theory. We'll derive the full QFT and period-finding machinery in Zero to Quantum.

3.2 Windowed Arithmetic — From 384 Operations to 28

In Section 2.3, we saw that classical scalar multiplication uses the double-and-add method: 256 doublings + ~128 additions = ~384 point operations. The quantum circuit could do the same thing, but there's a much better approach.

The idea is windowed multiplication. Instead of processing k one bit at a time, we process it in chunks. Let's see how this works with a small example first, then scale up to 256 bits.

EXAMPLE: COMPUTING 203 × P WITH WINDOW SIZE w = 4
══════════════════════════════════════════════════
203 in binary: 1100 1011  (8 bits)

DOUBLE-AND-ADD (bit by bit):
  Process one bit at a time.
  8 doublings + 5 additions = 13 operations.

WINDOWED (4-bit chunks):
  Split into two 4-bit chunks:
    High chunk: 1100 = 12 (decimal)
    Low chunk:  1011 = 11 (decimal)

  203 = 12 × 2⁴ + 11 = 12 × 16 + 11

  So: 203 × P = 12 × 16 × P + 11 × P
             = 12 × (16P) + 11 × P

  If we precompute tables of multiples, we just need
  two lookups and one addition!

The key insight: instead of processing 203 bit by bit (13 operations), we split it into two 4-bit chunks and look up the answer for each chunk in a precomputed table. But we need the tables to account for the position of each chunk — the high chunk represents 12 × 16, not just 12. So we build position-shifted tables:

And here's why this is practical: P is the generator point G from the secp256k1 specification — a fixed, publicly known constant. It's the same for every Bitcoin user, published as part of the standard. So all these table entries (multiples of G, multiples of 16G, etc.) can be computed classically in advance — they're just repeated point additions on known values. The resulting tables are then loaded into the quantum circuit as QROM (Quantum Read-Only Memory), so the quantum computer can look up entries in superposition during Shor's algorithm.

POSITION-SHIFTED LOOKUP TABLES:
═══════════════════════════════
TABLE 0 (for the LOW chunk, position 0):
  Stores multiples of P (the base point).
  Table0[0]  = ∞ (identity)
  Table0[1]  = 1 × P
  Table0[2]  = 2 × P
  ...
  Table0[11] = 11 × P  ← we need this one
  ...
  Table0[15] = 15 × P

TABLE 1 (for the HIGH chunk, position 1):
  Stores multiples of 2⁴ × P = 16P.
  Table1[0]  = ∞
  Table1[1]  = 1 × 16P = 16P
  Table1[2]  = 2 × 16P = 32P
  ...
  Table1[12] = 12 × 16P = 192P  ← we need this one
  ...
  Table1[15] = 15 × 16P = 240P

Each table has 2⁴ = 16 entries (one for each possible
4-bit value). The tables are precomputed once.

THE COMPUTATION:
  Look up low chunk (11) in Table 0:  → 11P
  Look up high chunk (12) in Table 1: → 192P
  Add them: 11P + 192P = 203P  ✓

  Operations: 2 table lookups + 1 point addition.
  Compare to double-and-add: 13 operations.
  That's a 6.5× reduction!

Now scale this up to 256 bits with window size w = 16:

SCALING TO 256 BITS (window size w = 16):
═════════════════════════════════════════
Split k (256 bits) into 256/16 = 16 chunks of 16 bits each.

Build 16 position-shifted tables:
  Table 0:  multiples of P           (2¹⁶ = 65,536 entries)
  Table 1:  multiples of 2¹⁶ × P    (65,536 entries)
  Table 2:  multiples of 2³² × P    (65,536 entries)
  ...
  Table 15: multiples of 2²⁴⁰ × P   (65,536 entries)

The computation:
  Look up each chunk in its corresponding table → 16 lookups
  Add all 16 results together                   → 15 additions

  Total: 16 lookups + 15 point additions.
  Compare to double-and-add: ~384 operations.
  That's a ~25× reduction in point additions!

The catch: in a quantum circuit, a "table lookup" isn't free. It's a quantum operation called QROM (Quantum Read-Only Memory) that must be implemented in gates. But QROM is much cheaper than a full point addition, so the tradeoff is worth it. The total number of expensive point additions drops from ~384 to ~15.

THE PAPER'S OPTIMIZED COUNT:
════════════════════════════
Window size:              w = 16 (each chunk is 16 bits wide)
Number of windows:        256 / 16 = 16 windows
                          (coincidentally the same number!)

Per scalar multiplication:
  Point additions:        ~15 (one per window, minus one)
  QROM table lookups:     16 (one per window — cheaper than
                          point additions, but not free)

Shor's algorithm for ECDLP requires TWO interleaved scalar
multiplications (one for each register — see Section 3.1).

Total quantum point additions: 28 (paper's stated number)
  Our estimate: 15 per multiplication × 2 = 30 (close but
  not exact — the precise count depends on circuit details).
Total QROM lookups: ~32
  (16 per multiplication × 2)

The 28 point additions are the expensive operations.
The 32 QROM lookups are cheaper per-operation but still
contribute to the total gate count.

Compare to naive: ~384 point additions → 28 additions + 32 lookups.
The most expensive operations are reduced by ~14×.

The paper states 28 total quantum point additions for the low-qubit circuit variant. Our rough estimate (15 per multiplication × 2 = 30) is close but not exact — the precise count depends on circuit-level details the paper does not fully publish.

3.3 Cost Per Point Addition

Each quantum point addition on 256-bit numbers is a substantial circuit. To see why, let's trace what a single modular multiplication looks like in quantum gates, starting small.

In a quantum circuit, the basic "expensive" gate is the Toffoli gate — a 3-qubit gate that flips the third qubit if and only if the first two are both 1. When the third qubit starts at 0, this computes a AND b — but unlike a classical AND gate, the Toffoli is reversible (no information is destroyed). This reversibility is required for quantum computation, and it's precisely what makes the gate expensive to implement with error correction. (CNOT gates, which implement XOR, are essentially free in the surface code — more on that in Part 2 (coming soon).)

A 2-BIT MODULAR MULTIPLICATION IN QUANTUM GATES:
═════════════════════════════════════════════════
Multiply a = a₁a₀ by b = b₁b₀ (both 2-bit numbers).

Long multiplication, same as you learned in school:

              a₁    a₀
           ×  b₁    b₀
           ───────────
             a₁∧b₀  a₀∧b₀         ← a × b₀
      a₁∧b₁  a₀∧b₁  0             ← a × b₁, shifted left
  +   ────────────────────
      c₃      c₂    c₁    c₀      ← 4-bit result

  ∧ means AND — single-bit multiplication.
  In a quantum circuit, each AND is one Toffoli gate.

  Step 1 — Partial products:
    4 AND operations: a₁∧b₀, a₀∧b₀, a₁∧b₁, a₀∧b₁
    = 4 Toffoli gates

  Step 2 — Adding the two rows:
    c₀ = a₀∧b₀                          (no addition needed)
    c₁ = (a₁∧b₀) XOR (a₀∧b₁)           (XOR = CNOT = free)
    carry = (a₁∧b₀) AND (a₀∧b₁)        (1 Toffoli for the carry)
    c₂ = (a₁∧b₁) XOR carry              (free)
    c₃ = (a₁∧b₁) AND carry              (1 Toffoli)
    = 2 Toffoli gates for carries

  Step 3 — Modular reduction (if result ≥ p, subtract p):
    Compare 4-bit result to p: ~1 Toffoli
    Conditionally subtract p:  ~1 Toffoli
    = ~2 Toffoli gates

  TOTAL: 4 + 2 + 2 = 8 Toffoli gates for 2-bit × 2-bit mod p.

For 2 bits, we can count every gate. For larger bit widths, the exact count depends on the specific adder architecture (ripple-carry, carry-lookahead, etc.), but the scaling is clear: the partial products alone cost n² Toffoli gates (each bit of a ANDed with each bit of b), and the carry propagation adds a similar amount. The total grows as O(n²) — quadratic in the bit width:

SCALING TO 256 BITS:
════════════════════
The dominant cost is O(n²) from partial products + carries.

  2-bit:    ~8 Toffoli        (counted exactly above)
  256-bit:  ~130,000 Toffoli  (from the literature — the exact
            count depends on the multiplication algorithm used;
            schoolbook gives ~2n² ≈ 131,000, but optimized
            methods like Karatsuba can reduce this somewhat)

The key point: one 256-bit modular multiplication costs
roughly 100,000+ Toffoli gates. And a single point addition
needs MANY of these multiplications.

And a single point addition from Section 2.2 requires multiple modular multiplications:

OPERATIONS IN ONE POINT ADDITION:
═════════════════════════════════
From the formulas in Section 2.2:
  Slope:  s = (y₂-y₁) × (x₂-x₁)⁻¹   → 1 inversion + 1 multiplication
  x_R:    s² - x₁ - x₂                → 1 squaring (= multiplication)
  y_R:    s×(x₁ - x_R) - y₁           → 1 multiplication

  Subtotal: 3 multiplications + 1 inversion.

  The inversion (computing (x₂-x₁)⁻¹ mod p) is the expensive part.

  In Section 2.2, we showed that a⁻¹ mod p = a^(p-2) mod p
  (Fermat's little theorem). So computing an inverse means
  computing a^(p-2) — raising a number to a huge power.

  How do you compute a^(p-2) efficiently? The same trick as
  double-and-add from Section 2.3, but for exponentiation:
  instead of "double the point, add P," it's "square the
  number, multiply by a."

  Small example: compute a^13 (13 in binary = 1101)

    Start: result = 1
    Bit 1: square → 1,   multiply by a → a         (= a¹)
    Bit 1: square → a²,  multiply by a → a³        (= a³)
    Bit 0: square → a⁶,  skip                      (= a⁶)
    Bit 1: square → a¹², multiply by a → a¹³       (= a¹³) ✓

    4 squarings + 3 multiplications = 7 operations.
    Naive (multiply a × a × a... 12 times) = 12 operations.

  p-2 is a 256-bit number, so:
    Squarings:        256 (one per bit of the exponent)
    Multiplications:  ~128 (one per '1' bit, roughly half)
    Total for one inversion: ~384 modular multiplications

  Now add up everything for one point addition:
    Slope (s):     1 inversion (384 mults) + 1 multiplication = 385
    x_R:           1 squaring (= 1 multiplication)            =   1
    y_R:           1 multiplication                            =   1
                                                                ───
    Total: ~387 modular multiplications per point addition

  (We round to ~390 to account for a few extra operations
   in coordinate handling.)

  At ~130,000 Toffoli gates per multiplication:
  Naive cost: 390 × 130,000 ≈ 51 million Toffoli gates per addition!

The gap between naive and optimized: Our naive estimate gives ~50 million Toffoli gates per point addition. The paper's circuit achieves ~90 million total for all 28 additions — averaging ~3.2 million per addition, roughly 15× better. The paper doesn't publish the specific optimizations (they're protected by the ZK proof discussed in Part 2, Section 6 (coming soon)). What we can say is where the naive estimate is wasteful and where improvements are possible in principle:

Our naive estimate computes each modular inverse separately using Fermat's little theorem (~384 multiplications per inversion). In practice, multiple inversions can be batched — if you need a⁻¹ and b⁻¹, you can compute (a×b)⁻¹ with one inversion, then recover a⁻¹ = b×(a×b)⁻¹ and b⁻¹ = a×(a×b)⁻¹. This turns N inversions into 1 inversion + 3(N-1) multiplications — a huge saving when N is large.
Our naive estimate uses schoolbook multiplication (~130,000 Toffoli per multiply). More efficient algorithms exist (Karatsuba, Montgomery) that reduce this, especially for the specific structure of the secp256k1 prime.
Our naive estimate treats each point addition independently. In practice, consecutive additions share intermediate values that can be reused rather than recomputed.

The exact combination of these (and likely other) techniques is the paper's unpublished contribution. The paper states the total circuit cost: ~90 million Toffoli gates on ~1,200 logical qubits. It does not publish a per-addition breakdown, so we cannot verify exactly how the 90M is distributed across the 28 additions and overhead operations. What we can say is that 90M gates across 28 additions averages to ~3.2M per addition — dramatically less than our naive ~50M estimate, confirming that substantial optimizations are at work.

COST PER QUANTUM POINT ADDITION (paper's numbers):
═══════════════════════════════════════════════════
Naive estimate:            ~50 million Toffoli gates
Paper's total for all 28:  ~90 million Toffoli gates
Average per addition:      ~90M / 28 ≈ 3.2 million
  (this average includes all overhead — QROM lookups,
   coordinate conversions, etc. — not just the point
   addition arithmetic itself)
Improvement over naive:    ~15× (from ~50M to ~3.2M per addition)

Why only count Toffoli gates?
  Clifford gates (H, S, CNOT) are essentially free compared to Toffoli in the surface code.
  They can be implemented transversally — directly on the
  encoded qubits with negligible overhead.

  Non-Clifford gates (T gates, Toffoli gates) are expensive.
  They require "magic states" produced by dedicated hardware.
  The Toffoli count is what determines the physical cost.

LOGICAL QUBIT COUNT: WHERE ~1,200 COMES FROM
═════════════════════════════════════════════
Logical qubits = how many quantum bits are in use simultaneously.
Think of it as the "width" of the circuit — how many wires
are active at the same time.

The main components we can identify:

1. INPUT REGISTERS (for Shor's algorithm):
   Two 256-bit quantum registers for the (a, b) values
   in f(a,b) = a·P + b·Q (from Section 3.1).
   These are in SUPERPOSITION — all 2²⁵⁶ values at once.
   = 2 × 256 = 512 qubits

2. WORKING REGISTERS (for the arithmetic):
   The circuit computes a·P + b·Q, which involves point
   additions and modular multiplications. The intermediate
   point coordinates (x, y — each 256 bits) must be stored
   in quantum registers because they're entangled with the
   superposed (a, b) inputs.
   = 2 × 256 = 512 qubits

   Note: P and Q themselves are NOT quantum. They're fixed
   classical values (P = generator, Q = public key) loaded
   into the circuit as constants. Only the values that DEPEND
   on the superposed inputs need quantum registers.

3. SCRATCH SPACE + ANCILLA QUBITS:
   Modular multiplication needs temporary storage for
   intermediate values. Also, quantum circuits must be
   reversible — intermediate values can't be overwritten,
   they must be "uncomputed" after use.
   = the remaining ~176 qubits (1,200 - 512 - 512)

TOTAL: ~1,200 logical qubits (paper's stated number)

The paper states the total but does not publish the internal
breakdown. The 512 + 512 for input and working registers is
our estimate based on the data widths; the remainder (~176)
covers scratch space, ancilla, and any other overhead.

(The low-gate variant uses ~1,450 logical qubits and ~70 million
Toffoli gates — more qubits but fewer gates. The tradeoff:
extra scratch space allows the circuit to avoid recomputing
intermediate values, reducing the gate count at the cost of
more qubits.)

Why Clifford gates are free: The surface code (which we'll cover in Part 2, Section 4 (coming soon)) can implement Clifford gates — Hadamard, S, and CNOT — by rearranging the logical qubit boundaries on the physical hardware. They still take time, but they don't require the expensive magic state distillation that Toffoli gates need. Non-Clifford gates like Toffoli require injecting a special quantum state (a "magic state") that must be prepared separately. This is why the paper counts Toffoli gates: they're the bottleneck.

3.4 The Full Circuit: 90 Million Gates on 1,200 Qubits

Let's put together what we know about the total gate count:

TOTAL TOFFOLI GATE COUNT:
═════════════════════════

POINT ADDITIONS (the dominant cost):
  From Section 3.2: the windowed approach reduces the number
  of point additions from ~384 to 28 (the paper's stated count
  for the low-qubit circuit variant).

  From Section 3.3: a naive point addition costs ~50 million
  Toffoli gates. The paper's optimized circuit reduces this
  to an average of ~3.2 million per addition (90M total / 28
  additions). The specific optimizations are not published.

THE PAPER'S STATED TOTAL:
  ~90,000,000 Toffoli gates on ~1,200 logical qubits.

  This 90M includes everything: the 28 point additions,
  the QROM table lookups, coordinate conversions, and all
  modular arithmetic. The paper does not publish a breakdown
  of how the 90M is distributed across these components.

90 million Toffoli gates on ~1,200 logical qubits. These two numbers together define the quantum circuit. Let's connect them.

The ~1,200 qubits are the width of the circuit — how many quantum wires exist simultaneously. We estimated the breakdown in Section 3.3: ~512 qubits for the two input registers (a and b in superposition), ~512 for the working point coordinates, and the remainder for scratch space and reversibility overhead. These 1,200 qubits persist throughout the entire computation.

The 90 million Toffoli gates are the depth — how many operations are applied to those wires. Each Toffoli gate picks 3 of the 1,200 qubits to operate on, then the next gate picks a (possibly different) 3, and so on, 90 million times. The qubits are reused across gates, not consumed. Think of it like a factory with 1,200 workstations: each step involves 3 workstations, but the factory itself doesn't grow — the same workstations are used over and over.

The challenge isn't the number of operations — a classical computer could do 90 million multiplications in a fraction of a second. The challenge is that all 1,200 qubits must maintain their quantum state coherently across all 90 million steps, without any qubit losing its information to noise. A single error at step 50 million could corrupt the final answer. That's why we need error correction — the subject of Part 2.

The circuit details are partially redacted. The paper does not publish the full gate decomposition for elliptic curve point addition. The specific optimizations that reduce the total to 90 million Toffoli gates are part of the paper's contribution, verified via a zero-knowledge proof (Part 2, Section 6 (coming soon)) rather than published in full. We cannot independently reproduce the per-addition circuit or verify the internal breakdown.

Part 1 Summary

Let's take stock. Starting from the raw curve equation, we traced an unbroken chain to the headline number:

Step	What we showed	Key number
The curve	y² = x³ + 7 over a finite field (mod p)	p is 256 bits
Point addition	Geometric rule → algebraic formulas, verified by hand	(1,10) + (10,8) = (20,7)
Scalar multiplication	Double-and-add reduces k additions to log₂(k)	~384 naive operations
The trapdoor	Forward is fast, reverse is impossible classically	~1ms vs ~10²² years
Bitcoin & ECDSA	Private key k, public key Q = k · G, 10-min window	~6.9M BTC exposed
Shor's algorithm	Period-finding on f(a,b) = a·P + b·Q	Two 256-bit registers
Windowed arithmetic	Process k in 16-bit chunks with lookup tables	384 → 28 additions
Gate cost	~130K Toffoli per multiply, ~390 multiplies per addition	~50M naive per addition
Paper's result	Optimized circuit (details unpublished)	~90M Toffoli total, ~1,200 logical qubits

We now know what the quantum computer needs to do (90 million Toffoli gates on 1,200 logical qubits) but not yet how much hardware that requires in the real world, or how long it takes.

Continue Reading

In Part 2 (coming soon), we derive the physical qubit count (why 1,200 logical qubits become 500,000 physical), the 9-minute runtime, why the attack succeeds only ~41% of the time (it must finish before the next Bitcoin block is mined — a race against a random Poisson clock), the zero-knowledge proof mechanism, the hardware reality check, and the post-quantum migration path.

Part 2: From 90 Million Gates to 9 Minutes — coming soon

Cite this article

@misc{fofadiya2026ecdlp,
  author       = {Darshan Fofadiya},
  title        = {How 500,000 Qubits Could Break Bitcoin (Part 1): Elliptic Curves and the Quantum Attack},
  year         = {2026},
  url          = {https://darshanfofadiya.com/research-papers/google-ecdlp/}
}

Get notified when I publish new deep dives

Illustrated guides on LLM inference, quantum computing, and AI research papers.

⚠ After subscribing, check your inbox for a confirmation email. You won't receive posts until you confirm.

How 500,000 Qubits Could Break Bitcoin

1. The Headline: 9 Minutes to Break Bitcoin

1.1 What Google Published

1.2 What's at Stake

1.3 What This Article Derives

2. Elliptic Curve Cryptography — What We're Breaking

2.1 What Is an Elliptic Curve?

2.2 Point Addition — The Core Operation

2.3 Scalar Multiplication — The Trapdoor

2.4 How Bitcoin Uses This

3. Shor's Algorithm for ECDLP — The Quantum Attack

3.1 How Shor's Algorithm Works — The Intuition

3.2 Windowed Arithmetic — From 384 Operations to 28

3.3 Cost Per Point Addition

3.4 The Full Circuit: 90 Million Gates on 1,200 Qubits

Part 1 Summary

Continue Reading

Cite this article

Comments