From the simplest proof of quantum advantage to the algorithms that will break cryptography and simulate nature. Step-by-step explanations with the real math.
The Quantum Algorithm Landscape
Quantum algorithms exploit superposition, entanglement, and interference to solve problems faster than any known classical algorithm. They fall into several families based on the technique used.
Family
Technique
Speedup
Key algorithms
Hidden subgroup
Quantum Fourier Transform
Exponential
Shor, Simon, Deutsch-Jozsa
Amplitude amplification
Grover diffusion
Quadratic
Grover, quantum counting
Variational / hybrid
Classical optimization loop
Heuristic (unproven)
VQE, QAOA, QML
Hamiltonian simulation
Trotter, qubitization
Exponential (for quantum systems)
Product formulas, QSP
Linear algebra
Phase estimation + amplitude encoding
Exponential (with caveats)
HHL, QLSP
Exponential speedup means the quantum algorithm runs in time polynomial in N where the best known classical algorithm requires exponential time. This does not mean every problem gets a speedup. For NP-complete problems, quantum computers offer at most a quadratic speedup (Grover), not exponential. The quantum advantage is problem-specific.
Deutsch-Jozsa Algorithm
The simplest demonstration of quantum advantage. Given a black-box function f: {0,1}ⁿ → {0,1} that is promised to be either constant (all outputs same) or balanced (exactly half 0s and half 1s), determine which one with certainty.
Classical complexity
A deterministic classical algorithm requires 2ⁿ⁻¹ + 1 queries in the worst case (check just over half the inputs). A probabilistic algorithm needs O(1) queries for high confidence but cannot achieve certainty.
Quantum complexity: 1 query
Initialize: Prepare n+1 qubits: |0⟩⊗ⁿ |1⟩
Hadamard all qubits: Apply H⊗ⁿ⁺¹. The first n qubits become uniform superposition over all 2ⁿ inputs. The last qubit becomes |-⟩.
Query the oracle: Apply U_f, which maps |x⟩|y⟩ → |x⟩|y ⊕ f(x)⟩. Due to the |-⟩ ancilla, this introduces a phase: |x⟩ → (-1)^f(x) |x⟩
Hadamard the first n qubits again.
Measure the first n qubits. If all zeros → f is constant. Otherwise → f is balanced.
Why it works
After step 3, the state of the first n qubits is:
|ψ⟩ = (1/2^(n/2)) ∑_x (-1)^f(x) |x⟩
Apply H⊗n:
|ψ'⟩ = (1/2^n) ∑_y [ ∑_x (-1)^(f(x) + x·y) ] |y⟩
Amplitude of |0...0⟩ = (1/2^n) ∑_x (-1)^f(x)
If f is constant: all (-1)^f(x) are the same → amplitude = ±1
If f is balanced: half are +1, half are -1 → amplitude = 0
Given a black-box function f(x) = s·x (mod 2) for some hidden binary string s, find s using a single query. Classically, n queries are needed (one per bit of s).
The algorithm
1. Prepare |0⟩⊗n |1⟩
2. Apply H to all qubits
3. Query oracle: U_f|x⟩|y⟩ = |x⟩|y ⊕ s·x⟩
Phase kickback: |x⟩ → (-1)^(s·x) |x⟩
4. Apply H⊗n to first n qubits
5. Measure → get s directly
Why: H⊗n transforms (-1)^(s·x)|x⟩ back to |s⟩.
This is exactly the Fourier transform of the function (-1)^(s·x).
Speedup
Approach
Queries
Type
Classical deterministic
n
Query each standard basis vector e_i
Classical randomized
n
Cannot do better (information-theoretic)
Quantum
1
Exponential speedup in query complexity
Bernstein-Vazirani is structurally identical to Deutsch-Jozsa. The difference is the promise: DJ asks "constant or balanced?", BV asks "what is the hidden string?" Both exploit the same Hadamard-oracle-Hadamard structure.
Given a black-box function f: {0,1}ⁿ → {0,1}ⁿ with the promise that f(x) = f(y) iff x ⊕ y ∈ {0, s} for some hidden string s, find s. This is the hidden period problem for the group (Z/2Z)ⁿ.
Significance
Simon's algorithm (1994) provided the first provable exponential quantum speedup over any classical algorithm (not just deterministic, but also randomized). It directly inspired Peter Shor to develop his factoring algorithm — Shor's algorithm is essentially Simon's algorithm generalized from (Z/2Z)ⁿ to Z/NZ.
The algorithm
Repeat O(n) times:
1. Prepare |0⟩⊗n |0⟩⊗n
2. Apply H⊗n to first register
3. Query oracle: |x⟩|0⟩ → |x⟩|f(x)⟩
4. Measure second register (collapse first register to |x⟩ + |x⊕s⟩)
5. Apply H⊗n to first register
6. Measure first register → get some y where y·s = 0
After n-1 linearly independent y values:
Solve the system y₁·s = 0, y₂·s = 0, ..., y_(n-1)·s = 0
→ unique solution s (by Gaussian elimination over GF(2))
Complexity comparison
Approach
Queries
Total time
Classical (best possible)
Ω(2^(n/2))
Exponential
Quantum (Simon's)
O(n)
O(n³) including classical post-processing
Reference:Simon (1997), originally presented at FOCS 1994
Shor's Algorithm
Shor's algorithm (1994) factors integers in polynomial time, breaking RSA and other public-key cryptosystems that rely on the hardness of factoring. It is the most consequential quantum algorithm and the primary motivation for building large-scale quantum computers.
The problem
Given an integer N, find its prime factors. The best known classical algorithm (general number field sieve) runs in sub-exponential time exp(O(n¹∕³(log n)²∕³)) where n = log N. Shor's algorithm runs in O(n² log n log log n) — polynomial.
Key insight: factoring reduces to period finding
To factor N:
1. Pick random a where 1 < a < N, gcd(a, N) = 1
2. Find the period r of f(x) = a^x mod N
(i.e., smallest r > 0 such that a^r ≡ 1 mod N)
3. If r is even: compute gcd(a^(r/2) ± 1, N) → factors of N
The quantum part solves step 2 (period finding) exponentially
faster than any known classical method.
Step-by-step walkthrough
Choose parameters: Pick random a coprime to N. If gcd(a,N) > 1, you already found a factor (lucky). Set q = 2ⁿ where 2ⁿ is between N² and 2N².
Initialize registers: Two quantum registers: |0⟩⊗ⁿ (input, 2n qubits) and |0⟩⊗ⁿ (output, n qubits).
Create superposition: Apply H⊗²ⁿ to input register: (1/√q) ∑_{x=0}^{q-1} |x⟩|0⟩
Compute modular exponentiation: Apply U_f: |x⟩|0⟩ → |x⟩|aⁿ mod N⟩. Uses repeated squaring — O(n³) elementary gates. This is the most expensive part of the circuit.
Apply Quantum Fourier Transform to the input register. This maps the periodic signal in the amplitudes to peaks at multiples of q/r.
Measure the input register. Get some value c ≈ j·q/r for some integer j. Use the continued fractions algorithm to extract r from c/q.
Classical post-processing: If r is even and a^(r/2) ≢ -1 (mod N), compute gcd(a^(r/2) ± 1, N). With probability ≥ 1/2, this gives a non-trivial factor. Repeat if needed.
Example: factoring 15
N = 15, pick a = 7
f(x) = 7^x mod 15:
7^0=1, 7^1=7, 7^2=4, 7^3=13, 7^4=1, 7^5=7, ...
Period r = 4
r is even, a^(r/2) = 7² = 49, 49 mod 15 = 4 ≠ -1 ✓
gcd(49 - 1, 15) = gcd(48, 15) = 3
gcd(49 + 1, 15) = gcd(50, 15) = 5
Factors: 3 × 5 = 15 ✓
Grover's algorithm searches an unstructured database of N items for a marked item in O(√N) queries, compared to O(N) classically. This quadratic speedup is provably optimal for unstructured search.
The problem
Given an oracle function f: {0,...,N-1} → {0,1} where f(x) = 1 for exactly one target x = w (the "marked item"), find w.
The algorithm
Initialize: Start with |0⟩⊗ⁿ. Apply H⊗ⁿ to create uniform superposition: |s⟩ = (1/√N) ∑_x |x⟩
Repeat ⌊π√N/4⌋ times (the "Grover iteration"):
Oracle: Apply O_f that flips the phase of the marked item: |x⟩ → (-1)^f(x) |x⟩
Diffusion: Apply the Grover diffusion operator D = 2|s⟩⟨s| - I (reflect about the mean amplitude)
Measure. Get w with probability close to 1.
Amplitude amplification: the math
Define the 2D subspace spanned by |w⟩ (target) and |s'⟩ (uniform
superposition of all non-target states).
Initial state: |s⟩ = sin(θ)|w⟩ + cos(θ)|s'⟩
where sin(θ) = 1/√N, so θ ≈ 1/√N for large N.
Each Grover iteration is a rotation by 2θ in this 2D plane:
After k iterations: sin((2k+1)θ)|w⟩ + cos((2k+1)θ)|s'⟩
Maximum probability at |w⟩ when (2k+1)θ ≈ π/2:
k ≈ π/(4θ) ≈ (π/4)√N
If you iterate too many times, the amplitude rotates PAST
the target and success probability drops. This is critical:
you must know (approximately) how many items are marked.
Multiple marked items
If M items are marked out of N:
Optimal iterations: k = ⌊(π/4)√(N/M)⌋
Success probability ≥ 1 - M/N
For unknown M: use quantum counting (QPE on Grover operator)
to estimate M first, then run Grover with correct iteration count.
Applications beyond search
Amplitude amplification: Generalization that boosts success probability of any quantum algorithm from p to ~1 using O(1/√p) repetitions instead of O(1/p).
SAT solving: Search for satisfying assignments — quadratic speedup over brute force.
Minimum finding: Find the minimum of an unordered list in O(√N). (Durr-Hoyer algorithm)
Given a unitary U and an eigenstate |u⟩ with U|u⟩ = e^(2πiφ)|u⟩, estimate the phase φ to n bits of precision using O(n) controlled-U operations. QPE is the core subroutine of Shor's algorithm, HHL, and quantum chemistry simulations.
The circuit
Input: |0⟩⊗n |u⟩ (n counting qubits + eigenstate register)
1. Apply H⊗n to counting register
2. Apply controlled-U^(2^k) from counting qubit k to eigenstate register
for k = 0, 1, ..., n-1
3. Apply inverse QFT (QFT†) to counting register
4. Measure counting register → n-bit approximation of φ
After step 2, counting register contains:
(1/√2^n) ∑_k e^(2πi·φ·k) |k⟩
This is exactly the Fourier transform of |φ⟩ (in the frequency domain).
Inverse QFT maps it back to |φ⟩ ≈ |φ₁φ₂...φn⟩ in binary.
Precision and success probability
With n counting qubits:
φ is estimated to n bits of precision
Success probability ≥ 1 - 1/(2(2^n - 2)) ≈ 1 - ε
To get ε failure probability with t-bit precision:
Use n = t + ⌈log(2 + 1/(2ε))⌉ counting qubits
The expensive part: controlled-U^(2^k) requires exponentially
many applications of U unless U has efficient power structure.
Applications
Shor's algorithm: QPE with U = modular multiplication finds the period of a^x mod N.
Quantum chemistry: QPE with U = e^(-iHt) gives eigenvalues (energy levels) of molecular Hamiltonians.
HHL algorithm: QPE extracts eigenvalues of a matrix for solving linear systems.
Quantum counting: QPE on the Grover operator estimates the number of marked items.
VQE is a hybrid quantum-classical algorithm for finding the ground state energy of a Hamiltonian. It is the most promising near-term algorithm for quantum chemistry and materials science, designed to work on noisy (NISQ) hardware.
The variational principle
For any trial state |ψ(θ)⟩:
E(θ) = ⟨ψ(θ)|H|ψ(θ)⟩ ≥ E_0
where E_0 is the true ground state energy.
VQE minimizes E(θ) over parameters θ to find the best
approximation to E_0 using a parameterized quantum circuit.
The algorithm
Choose an ansatz: A parameterized quantum circuit U(θ) that prepares trial states |ψ(θ)⟩ = U(θ)|0⟩. Common choices: UCCSD (chemistry-inspired), hardware-efficient ansatz, ADAPT-VQE.
Prepare the state: Run U(θ) on the quantum computer for current parameters θ.
Measure the energy: Decompose H into a sum of Pauli strings: H = ∑_i c_i P_i. Measure each P_i independently (multiple circuit runs) and compute E(θ) = ∑_i c_i ⟨P_i⟩.
Classical optimization: Feed E(θ) to a classical optimizer (COBYLA, SPSA, L-BFGS-B). Get new parameters θ'.
Repeat steps 2-4 until convergence: |E(θ_k) - E(θ_{k-1})| < threshold.
Ansatz circuits
Hardware-efficient ansatz (example for 4 qubits):
Layer: Ry(θ_i) on each qubit → CNOT ladder → Ry(θ_j) → CNOT ladder
Repeat for d layers. Total parameters: O(n·d)
UCCSD (Unitary Coupled Cluster):
U(θ) = exp(T(θ) - T(θ)†)
T = T₁ + T₂ (single + double excitations)
Chemically motivated, but deep circuits (many CNOTs)
Challenges
Barren plateaus: For random ansatze, gradients vanish exponentially with qubit count, making optimization intractable.
Measurement overhead: Each Pauli string requires O(1/ε²) shots for ε precision. With O(N⁴) Pauli strings for molecular Hamiltonians, total measurement count is enormous.
Local minima: The energy landscape is non-convex. Classical optimizers can get stuck.
Noise: NISQ hardware noise biases energy estimates upward. Error mitigation helps but adds overhead.
QAOA is a hybrid quantum-classical algorithm for combinatorial optimization problems. It encodes the objective function as a quantum Hamiltonian and uses alternating layers of problem and mixer unitaries to find approximate solutions.
MaxCut example
The canonical QAOA problem: given a graph G = (V, E), partition vertices into two sets to maximize the number of edges between them.
Classical objective: C = ∑_{(i,j) ∈ E} (1 - z_i z_j) / 2
where z_i ∈ {+1, -1}
Quantum Hamiltonian: H_C = ∑_{(i,j) ∈ E} (I - Z_i Z_j) / 2
Goal: find bitstring z that maximizes ⟨z|H_C|z⟩
The QAOA circuit
For depth p, with parameters γ = (γ₁,...,γ_p) and β = (β₁,...,β_p):
|ψ(γ,β)⟩ = U_B(β_p) U_C(γ_p) ... U_B(β₁) U_C(γ₁) |+⟩⊗n
where:
U_C(γ) = e^(-iγ H_C) = ∏_{(i,j)∈E} e^(-iγ(I-Z_iZ_j)/2)
(problem unitary: encodes the objective)
U_B(β) = e^(-iβ H_B) = ∏_i e^(-iβ X_i)
(mixer unitary: explores solution space)
Each U_C layer: one Rzz(-γ) gate per edge
Each U_B layer: one Rx(-2β) gate per vertex
Algorithm flow
Start in |+⟩⊗n (uniform superposition over all bitstrings).
Apply p layers of alternating U_C(γ_k) and U_B(β_k).
Measure all qubits to get a candidate solution bitstring.
Evaluate the cut value classically.
Repeat with different (γ, β) using a classical optimizer to maximize ⟨H_C⟩.
Performance
Depth p
Approximation ratio (3-regular MaxCut)
Circuit depth
p = 1
0.6924 (vs 0.5 random)
O(|E|)
p = 2
~0.7559
O(2|E|)
p → ∞
→ 1 (exact)
O(p|E|)
Classical best (GW)
0.878 (polynomial time)
N/A
Whether QAOA at any fixed depth p can outperform the best classical algorithms for MaxCut remains an open question. At p=1, it is provably worse than the Goemans-Williamson SDP relaxation (0.878 ratio). The hope is that moderate-depth QAOA on real hardware, perhaps with problem-specific modifications, finds practical speedups for specific instances — but this is not yet demonstrated.
The HHL algorithm (Harrow, Hassidim, Lloyd, 2009) solves the linear system Ax = b exponentially faster than classical methods — under specific conditions. It maps the solution |x⟩ ∝ A⁻¹|b⟩ onto a quantum state.
The problem
Given: N×N matrix A (Hermitian, s-sparse), vector b
Find: x = A⁻¹b
Classical: O(N·s·κ log(1/ε)) using conjugate gradient
where κ = condition number, ε = precision
HHL: O(s² κ² log(N) / ε) → exponential speedup in N!
Caveats (big ones):
- b must be efficiently loadable as a quantum state |b⟩
- You only get |x⟩ as a quantum state (not the classical vector)
- Useful only if you want a property of x (e.g., ⟨x|M|x⟩)
not the full vector (reading out N amplitudes costs O(N))
- A must be sparse and well-conditioned (small κ)
How it works
State preparation: Encode b as a quantum state |b⟩ = ∑_i b_i |i⟩. This can be the hardest step — general state preparation requires O(N) gates, destroying the exponential speedup.
QPE: Decompose |b⟩ in the eigenbasis of A: |b⟩ = ∑_j β_j |u_j⟩. Use Quantum Phase Estimation to extract eigenvalues λ_j into an ancilla register: ∑_j β_j |λ_j⟩|u_j⟩.
Controlled rotation: Apply a rotation conditioned on |λ_j⟩ to an ancilla qubit: rotate by angle arcsin(C/λ_j). The ancilla amplitude proportional to 1/λ_j encodes the inverse.
Inverse QPE: Uncompute the eigenvalue register.
Post-select: Measure the ancilla qubit. If it reads |1⟩, the remaining state is proportional to ∑_j (β_j/λ_j)|u_j⟩ = A⁻¹|b⟩. Success probability is O(1/κ²), amplifiable with amplitude amplification to O(1/κ).
Practical relevance
Quantum machine learning: Solving linear regressions, least-squares problems, and support vector machines in log(N) time.
Finite element methods: Solving PDEs (computational fluid dynamics, structural analysis) if the matrix is sparse and well-conditioned.
Portfolio optimization: Solving constrained optimization problems that reduce to linear systems.
Reality check: The state preparation and readout bottlenecks severely limit practical advantage. Recent dequantization results (Tang, 2018) show that for low-rank problems, classical algorithms can achieve similar log(N) scaling.