Fast Scrambling and Hayden–Preskill Recovery

The Page curve is a statement about entropy. It says that if black hole evaporation is unitary, then the fine-grained entropy of the radiation cannot keep increasing forever. But the Page curve by itself does not tell us how information comes out, when a newly injected message becomes recoverable, or what kind of observer could recover it.

The Hayden–Preskill thought experiment answers a sharper question. Suppose a black hole is already old, so that it is highly entangled with the radiation it emitted earlier. Alice throws a small quantum system into the black hole. Bob has collected the early radiation and continues to collect the newly emitted Hawking quanta. How long must Bob wait before Alice’s quantum state is recoverable from the radiation?

The striking answer is that an old black hole behaves like a quantum information mirror. After the message has been scrambled, Bob needs to collect only slightly more radiation entropy than the entropy of the message. The delay is not the Page time. It is approximately the scrambling time,

t_*\sim {\beta\over 2\pi}\log S_{\rm BH},

plus the time required to radiate a number of quanta comparable to the size of the message.

This page explains what that statement means, why it depends on the black hole being old, and why it does not by itself imply that decoding Hawking radiation is easy.

Guiding question

Why should an old black hole return newly thrown-in quantum information quickly, while information thrown into a young black hole remains hidden until near the Page time?

The answer has two ingredients.

First, an old black hole is already entangled with a large external system: the early radiation. Second, black hole dynamics is expected to scramble information rapidly. Scrambling makes the message inaccessible to any small part of the remaining black hole, while the pre-existing entanglement with the early radiation allows the message to be reconstructed from early radiation plus a small amount of late radiation.

The Hayden–Preskill protocol. A message system $A$, purified by a reference $Q$, is thrown into an old black hole $B$ that is already entangled with early radiation $E$. The black hole dynamics is modeled by a scrambling unitary $U_{AB}$. After the output is divided into late radiation $D$ and the remaining black hole $C$, Bob tries to recover the message from $E\cup D$. Recovery is possible when $Q$ is decoupled from $C$.

Thermalization is not scrambling

A black hole is an efficient thermalizer: simple observables, such as low-point correlation functions of matter falling toward the horizon, relax quickly. But thermalization and scrambling are not the same.

Thermalization means that simple observables forget detailed initial data. For example, a perturbation of a thermal state may decay on a timescale

t_{\rm diss}\sim \beta,

where

\beta={1\over T_H}

is the inverse Hawking temperature.

Scrambling means that initially localized quantum information has spread over so many degrees of freedom that no small subsystem contains a useful record of it. Equivalently, the information is hidden in high-order correlations among many degrees of freedom. Scrambling is therefore a statement about information and entanglement, not merely about expectation values of a few simple operators.

A useful diagnostic is the growth of an out-of-time-order commutator. For two simple operators $V$ and $W$ , define schematically

C(t)=\left\langle [W(t),V(0)]^\dagger [W(t),V(0)]\right\rangle_\beta.

In a large-$N$ chaotic system with a holographic black hole dual, one often finds an early exponential regime

C(t)\sim {1\over S_{\rm BH}}e^{\lambda_L t},

until $C(t)$ becomes order one. The coefficient $1/S_{\rm BH}$ reflects the fact that one simple perturbation initially affects only a small fraction of the black hole degrees of freedom. The Lyapunov exponent is bounded by

\lambda_L\leq {2\pi\over \beta},

and Einstein gravity black holes saturate this bound in the classical low-energy regime. Setting $C(t_*)\sim 1$ gives

t_*\sim {1\over \lambda_L}\log S_{\rm BH} \sim {\beta\over 2\pi}\log S_{\rm BH}

for a maximally chaotic black hole.
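To get a feel for the numbers, the estimate above can be evaluated for a Schwarzschild black hole. The sketch below is illustrative only. It assumes the standard four-dimensional formulas $\beta=8\pi G_N M/c^3$ (the inverse temperature expressed as a time, $\hbar/k_B T_H$) and $S_{\rm BH}=4\pi G_N M^2/\hbar c$, takes the chaos bound to be saturated, and drops all order-one factors in $t_*$. The function names are hypothetical.

```python
import math

# Physical constants in SI units (rounded).
G_N = 6.67430e-11        # m^3 kg^-1 s^-2
C_LIGHT = 2.99792458e8   # m / s
HBAR = 1.054571817e-34   # J s
M_SUN = 1.98847e30       # kg

def thermal_time(mass_kg):
    """Inverse Hawking temperature as a time: hbar / (k_B T_H) = 8 pi G M / c^3."""
    return 8.0 * math.pi * G_N * mass_kg / C_LIGHT**3

def bh_entropy(mass_kg):
    """Bekenstein-Hawking entropy in units of k_B: S = 4 pi G M^2 / (hbar c)."""
    return 4.0 * math.pi * G_N * mass_kg**2 / (HBAR * C_LIGHT)

def scrambling_time(beta, entropy):
    """t_* ~ (beta / 2 pi) log S, assuming the chaos bound is saturated."""
    return beta / (2.0 * math.pi) * math.log(entropy)

beta = thermal_time(M_SUN)
entropy = bh_entropy(M_SUN)
t_star = scrambling_time(beta, entropy)
print(f"beta ~ {beta:.2e} s, S_BH ~ {entropy:.2e}, t_* ~ {t_star:.2e} s")
```

Under these assumptions a solar-mass black hole has $S_{\rm BH}\sim 10^{77}$ and a scrambling time of a few milliseconds, because the logarithm of an enormous entropy is only a modest number.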

The important hierarchy for a macroscopic evaporating black hole is therefore

t_{\rm diss}\ll t_*\ll t_{\rm Page}.

The first inequality says that local equilibration happens before full scrambling. The second says that scrambling is fast compared with evaporation.

The hierarchy of timescales for a large evaporating black hole. Local relaxation occurs on a thermal timescale $t_{\rm diss}\sim \beta$. Scrambling occurs on $t_*\sim (\beta/2\pi)\log S_{\rm BH}$ for a maximally chaotic black hole. The Page time is parametrically longer: for an evaporating Schwarzschild black hole, $t_{\rm Page}$ is of order the evaporation time, while $t_*/t_{\rm Page}\sim (\log S_{\rm BH})/S_{\rm BH}$ up to constants.

The Hayden–Preskill setup

The thought experiment uses a clean quantum-information model of an old black hole.

Let $A$ be the message system Alice throws in. To track whether the message is preserved, purify it by an external reference system $Q$:

|\psi\rangle_{QA}\in \mathcal H_Q\otimes \mathcal H_A, \qquad \dim \mathcal H_Q=\dim \mathcal H_A=d_A.

The reference $Q$ never falls into the black hole. It is a bookkeeper. If Bob can reconstruct $A$ , then Bob can restore the entanglement between $A$ and $Q$ .

Let $B$ denote the old black hole before the message enters, and let $E$ denote the early Hawking radiation. The old-black-hole assumption is that $B$ is highly entangled with $E$ . In the idealized model one takes

|\Phi\rangle_{BE}

to be nearly maximally entangled on the relevant code subspace.

The message and black hole then undergo a unitary scrambling dynamics

U_{AB}:\mathcal H_A\otimes\mathcal H_B \longrightarrow \mathcal H_C\otimes\mathcal H_D,

where $D$ is the newly emitted Hawking radiation and $C$ is the remaining black hole. The total state is pure on

Q\otimes C\otimes D\otimes E.

Bob has access to $E$ and $D$ . He wants a recovery map

\mathcal R_{ED\to A'}

such that $A'$ is entangled with $Q$ in the same way that $A$ originally was.

This is a precise way to ask whether Alice’s quantum state has come out in the Hawking radiation.

The decoupling criterion

The central principle is simple:

Bob can recover the message from $E\cup D$ when the reference $Q$ is decoupled from the remaining black hole $C$ .

Mathematically, the decoupling condition is

\rho_{QC}\simeq \rho_Q\otimes \rho_C.

Equivalently,

I(Q:C)\simeq 0,

where

I(X:Y)=S(X)+S(Y)-S(XY)

is mutual information.

Why is this the right condition? Since the full system $QCDE$ is pure,

I(Q:DE)=2S(Q)-I(Q:C).

Thus if $I(Q:C)\simeq 0$ , then

I(Q:DE)\simeq 2S(Q).

All the entanglement of the reference $Q$ is carried by $E\cup D$ . By Uhlmann’s theorem, or equivalently by the decoupling theorem, there exists a recovery operation on $E\cup D$ that reconstructs a system $A'$ purified by $Q$ .

Decoupling is the information-theoretic core of Hayden–Preskill recovery. If the reference $Q$ has negligible mutual information with the remaining black hole $C$, then in the pure state on $QCDE$ the purifier of $Q$ must lie in $D\cup E$. The message is not locally visible in a single Hawking quantum; it is encoded in correlations of the radiation.

Why the old black hole acts like a mirror

The dimension counting is the heart of the result.

For a Haar-random scrambling unitary $U_{AB}$ , the expected decoupling error in the old-black-hole limit scales schematically as

\mathbb E_U\left\|\rho_{QC}-\rho_Q\otimes\rho_C\right\|_1 \lesssim {d_A\over d_D},

up to constants and small corrections associated with imperfect maximal entanglement between $B$ and $E$ . Therefore Bob can recover the message with error at most $\epsilon$ once

d_D\gtrsim {d_A\over \epsilon}.

For a message of $k$ qubits,

d_A=2^k.

If the late radiation contains $m$ qubits,

d_D=2^m.

The condition becomes

m\gtrsim k+\log_2{1\over \epsilon}.

So an old black hole returns a $k$-qubit message after it has radiated only slightly more than $k$ qubits, provided the message has first been scrambled.

This does not mean that the message bounces off the horizon like a classical wave. The message first becomes delocalized in the black hole’s microscopic degrees of freedom. Because the black hole is already entangled with the early radiation, the new outgoing radiation can serve as the final missing share needed for reconstruction.
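The dimension counting can be checked in a toy model small enough to simulate. The sketch below is a minimal illustration under stated assumptions, not the construction of the original argument: $k$ message qubits are maximally entangled with $Q$, $n$ black hole qubits are maximally entangled with $E$, and a Haar-random $U_{AB}$ is sampled by Gram-Schmidt orthogonalization of complex Gaussian vectors. In this idealized setting $\rho_Q\otimes\rho_C$ is exactly maximally mixed, so the Hilbert-Schmidt distance follows from the purity of $\rho_{QC}$, and the Cauchy-Schwarz inequality $\|X\|_1\le\sqrt{\dim}\,\|X\|_2$ gives an upper bound on the trace-norm decoupling error. All function names and parameter choices are hypothetical.

```python
import math
import random

def haar_unitary(dim, rng):
    """Haar-random unitary via Gram-Schmidt on complex Gaussian vectors.
    Returns a list of orthonormal columns: cols[j][i] = U[i][j]."""
    cols = []
    for _ in range(dim):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
        for u in cols:
            overlap = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            v = [vi - overlap * ui for ui, vi in zip(u, v)]
        norm = math.sqrt(sum(abs(vi) ** 2 for vi in v))
        cols.append([vi / norm for vi in v])
    return cols

def decoupling_bound(k, n, m, rng):
    """Cauchy-Schwarz upper bound on ||rho_QC - rho_Q (x) rho_C||_1 for one sample."""
    d_A, d_B, d_D = 2 ** k, 2 ** n, 2 ** m
    d_C = d_A * d_B // d_D
    cols = haar_unitary(d_A * d_B, rng)
    # psi[q, c, d, e] = U[(c, d), (q, e)] / sqrt(d_A d_B); collect, for each
    # (q, c), the unnormalized vector over the traced-out labels (d, e).
    w = [[cols[q * d_B + e][c * d_D + d] for d in range(d_D) for e in range(d_B)]
         for q in range(d_A) for c in range(d_C)]
    # rho_QC matrix elements are overlaps of these vectors divided by d_A d_B.
    purity = sum(abs(sum(x.conjugate() * y for x, y in zip(u, v))) ** 2
                 for u in w for v in w) / (d_A * d_B) ** 2
    # rho_Q (x) rho_C is maximally mixed here, so ||.||_2^2 = purity - 1/(d_A d_C).
    l2_sq = max(0.0, purity - 1.0 / (d_A * d_C))
    return math.sqrt(d_A * d_C * l2_sq)

def mean_bound(k, n, m, samples=5, seed=0):
    """Average the bound over a few Haar samples with a fixed seed."""
    rng = random.Random(seed)
    return sum(decoupling_bound(k, n, m, rng) for _ in range(samples)) / samples

k, n = 1, 4
for m in range(1, k + n + 1):
    print(f"m = {m}: bound ~ {mean_bound(k, n, m):.4f}, d_A/d_D = {2 ** (k - m)}")
```

With these toy parameters the printed bound should roughly halve for each additional late-radiation qubit, tracking $d_A/d_D=2^{k-m}$, and it vanishes once the whole black hole has been radiated.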

A compact slogan is:

\boxed{ \text{old black hole} + \text{scrambling} + \text{early radiation} \quad\Longrightarrow\quad \text{rapid recoverability} }

The retention time is therefore approximately

t_{\rm ret}\sim t_*+t_{\rm emit}(k),

where $t_{\rm emit}(k)$ is the time required to emit of order $k$ qubits of radiation entropy. For a small message thrown into a large black hole, $t_{\rm emit}(k)$ is usually much shorter than the Page time.
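As a rough illustration of the two contributions, the sketch below adds a scrambling delay to an emission time. It assumes, purely for illustration, that the black hole radiates about one qubit of entropy per thermal time $\beta$. That emission rate, the parameter `qubits_per_beta`, and the function name `retention_time` are assumptions rather than part of the argument above, and all order-one factors are dropped.

```python
import math

def retention_time(beta, entropy, k, qubits_per_beta=1.0):
    """t_ret ~ t_* + t_emit(k), in the same units as beta."""
    t_star = beta / (2.0 * math.pi) * math.log(entropy)
    t_emit = k * beta / qubits_per_beta  # assumed emission rate, not derived here
    return t_star + t_emit

# Units where beta = 1: a 10-qubit message thrown into a hole with S_BH = 1e77.
t_ret = retention_time(beta=1.0, entropy=1e77, k=10)
print(f"t_ret ~ {t_ret:.1f} thermal times")
```

In this example the delay is a few tens of thermal times, whereas the evaporation time in the same units is of order $S_{\rm BH}$ itself.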

Why young black holes are different

Now remove the early radiation. If $E$ is absent or too small, Bob cannot use a large entangled partner of the black hole as part of his decoder. In that case, a small amount of late radiation $D$ is not enough. The purifier of the reference $Q$ remains mostly in the remaining black hole $C$ .

This is the same physics as the Page curve. For a young black hole, the radiation Hilbert space is still the smaller subsystem. A typical unitary evaporation process makes the radiation almost maximally mixed, and information about the initial state remains inaccessible from the radiation alone. Only after the Page time does the accumulated radiation have enough Hilbert-space capacity and enough entanglement structure to purify new outgoing quanta.
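The claim that a small radiation subsystem is almost maximally mixed can be made quantitative with Page's formula for the average entropy of a dimension-$m$ factor of a Haar-random pure state on dimension $mn$ with $m\le n$, namely $\langle S\rangle=\sum_{j=n+1}^{mn}1/j-(m-1)/2n$. The sketch below evaluates the shortfall from maximal entropy for a toy system of 16 qubits. The qubit count and the function names are illustrative choices.

```python
import math

def page_entropy(m, n):
    """Average entropy (in nats) of the dimension-m factor of a Haar-random
    pure state on dimension m*n, using Page's formula (valid for m <= n)."""
    if m > n:
        m, n = n, m  # complementary subsystems of a pure state share one entropy
    return math.fsum(1.0 / j for j in range(n + 1, m * n + 1)) - (m - 1) / (2.0 * n)

def information_deficit(radiation_qubits, total_qubits):
    """Shortfall of the radiation entropy from its maximal value, in nats."""
    m = 2 ** radiation_qubits
    n = 2 ** (total_qubits - radiation_qubits)
    return radiation_qubits * math.log(2) - page_entropy(m, n)

for r in (1, 4, 8, 12, 15):
    print(f"{r:2d} radiated qubits: deficit ~ {information_deficit(r, 16):.5f} nats")
```

Before the halfway point the deficit is a tiny fraction of a nat, so the radiation alone reveals essentially nothing. After the halfway point it grows by about $2\log 2$ nats per additional radiated qubit.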

Thus the Hayden–Preskill result is not in tension with the Page curve. It refines it:

information thrown in before the black hole is old remains hidden until roughly the Page time;
information thrown into an old black hole can be recovered after a scrambling-time delay, assuming access to the early radiation.

The word “old” is doing real work.

Channel capacity viewpoint

The Hayden–Preskill model can be viewed as a noisy quantum channel. The input is the message $A$ , the channel is the black hole unitary followed by the split into $C$ and $D$ , and the decoder is allowed to use the side information $E$ .

Without side information, the radiation channel may have almost no capacity at early times. With side information, the channel can have nearly optimal capacity once the black hole is old. The early radiation $E$ is not merely a pile of previous quanta; it is an entangled resource.

This is why the result is often described using quantum error correction. The black hole dynamics encodes Alice’s message into a larger system. The remaining black hole $C$ is an erased subsystem. Bob succeeds if the encoded message is correctable after losing $C$ , using $D\cup E$ as the surviving physical system.

The decoupling condition

I(Q:C)\simeq 0

is exactly the condition that the erased subsystem $C$ contains no information about the logical input.

What Bob must actually do

Hayden–Preskill recovery is an information-theoretic statement. It says that a recovery map exists under idealized assumptions. Bob’s job is not easy.

Bob must:

collect the early Hawking radiation coherently;
continue collecting the appropriate late Hawking radiation;
know the relevant black hole dynamics well enough to construct a decoder;
perform a highly nonlocal quantum operation on the radiation.

The last two requirements are severe. A theorem can say that a message is encoded in the radiation without giving an efficient algorithm for extracting it.

This distinction becomes crucial in the firewall discussion. Harlow and Hayden argued that the decoding operations needed for certain AMPS-style experiments are generically so computationally hard that they cannot be completed before the black hole evaporates. That argument does not deny that the information is present in the radiation. It separates information-theoretic recoverability from efficient recoverability.

There are also idealized models in which explicit decoders can be built. For example, decoders inspired by Yoshida and Kitaev use time reversal, auxiliary entanglement, teleportation-like circuits, and scrambling diagnostics. These constructions are extremely useful as models of the protocol, but they should not be confused with a practical recipe for decoding the Hawking radiation of an astrophysical black hole.

Relation to complementarity and no-cloning

At first sight the mirror behavior sounds dangerous. If Alice jumps in carrying a qubit and Bob later reconstructs that qubit from the radiation, has quantum mechanics produced two copies?

The careful answer is no, at least not in the operational sense used by black hole complementarity. Bob’s reconstruction requires waiting outside, collecting radiation, and applying a nonlocal decoder. Alice’s interior measurement is made along an infalling worldline. No single observer can both verify Alice’s interior copy and complete Bob’s exterior decoding in the simple semiclassical regime where the thought experiment is formulated.

The Hayden–Preskill result nevertheless sharpens the tension. It shows that after the Page time, unitarity wants late radiation to be strongly correlated with early radiation. Semiclassical effective field theory near the horizon wants each outgoing Hawking mode to be entangled with an interior partner. The next page, on complementarity and firewalls, turns this tension into the AMPS monogamy argument.

Relation to islands and entanglement wedges

In modern language, Hayden–Preskill recovery is an ancestor of the island story.

After the Page time, the radiation is not merely an external record of emitted quanta. In holographic and semiclassical island calculations, the entanglement wedge of the radiation can include an island inside the gravitating region. Operators in that island are reconstructible from the radiation in the same broad sense that Hayden–Preskill says a thrown-in message is reconstructible from $E\cup D$ .

The two languages emphasize different aspects:

Hayden–Preskill language	Island / entanglement-wedge language
Random unitary scrambling	Semiclassical geometry plus QES extremization
Early radiation as side information	Radiation entanglement wedge includes an island
Decoupling from remaining hole $C$	Reconstruction from the radiation wedge
Recovery after $t_*$ for old holes	Island appears after the Page transition

Neither language says that the message is locally visible in an individual Hawking quantum. The information is encoded nonlocally, and recovering it requires the correct global operation.

A useful four-line summary

For a young black hole:

\text{small }D\quad\Rightarrow\quad I(Q:D)\simeq 0.

For an old black hole with early radiation $E$:

\text{small }D\text{ after scrambling} \quad\Rightarrow\quad I(Q:DE)\simeq 2S(Q).

The difference is the entangled side information $E$ .

The delay is the scrambling time:

t_*\sim {\beta\over 2\pi}\log S_{\rm BH}.

The cost is decoding complexity, which can be enormous.

Common pitfalls

“The message comes out before it enters.”
No. The recovery is possible only after the message has interacted with the black hole dynamics and after enough late radiation has been emitted. The minimal delay is controlled by scrambling.

“A single late Hawking quantum contains the message.”
No. The message is encoded in correlations among early radiation, late radiation, and the black hole state. A small subsystem by itself generally contains no useful information.

“Fast scrambling means fast decoding.”
No. Scrambling is a property of the black hole dynamics. Decoding is an operation performed by Bob on the radiation. The first can be fast while the second is computationally prohibitive.

“The old black hole literally reflects information from the horizon.”
No. The mirror is a quantum-information metaphor. The message is mixed into the black hole degrees of freedom and becomes recoverable because the black hole is already entangled with early radiation.

“Hayden–Preskill solves the information paradox by itself.”
Not quite. It assumes unitary black hole dynamics and rapid scrambling, then derives consequences for recovery. It does not derive unitarity from semiclassical gravity. The island and replica-wormhole developments address a different question: how the gravitational entropy calculation itself changes.

Exercises

Exercise 1. Scrambling time from an OTOC

Suppose an out-of-time-order commutator grows as

C(t)\sim {1\over S}e^{\lambda_L t}

until it becomes order one. Derive the scrambling time. What is $t_*$ if the chaos bound is saturated?

Solution

Scrambling occurs when

C(t_*)\sim 1.

Using the assumed growth,

{1\over S}e^{\lambda_L t_*}\sim 1.

Therefore

e^{\lambda_L t_*}\sim S,

and hence

t_*\sim {1\over \lambda_L}\log S.

If the chaos bound is saturated,

\lambda_L={2\pi\over \beta},

t_*\sim {\beta\over 2\pi}\log S.

For a black hole one identifies $S$ with $S_{\rm BH}$ up to order-one conventions.

Exercise 2. Schwarzschild hierarchy

In four-dimensional asymptotically flat Schwarzschild evaporation, use units with $G_N=\hbar=c=k_B=1$ . The inverse temperature and entropy scale as

\beta\sim M, \qquad S_{\rm BH}\sim M^2.

The evaporation time scales as

t_{\rm evap}\sim M^3.

Estimate $t_*/t_{\rm evap}$ for a large black hole.

Solution

The scrambling time is

t_*\sim \beta\log S_{\rm BH}.

Using

\beta\sim M, \qquad S_{\rm BH}\sim M^2,

we get

t_*\sim M\log M^2\sim 2M\log M.

Since

t_{\rm evap}\sim M^3,

we find

{t_*\over t_{\rm evap}} \sim {\log M\over M^2}.

Equivalently, since $S_{\rm BH}\sim M^2$ ,

{t_*\over t_{\rm evap}}\sim {\log S_{\rm BH}\over S_{\rm BH}}

up to constants. For a macroscopic black hole this is parametrically tiny.
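The smallness of this ratio is easy to tabulate. The sketch below uses only the scalings of this exercise in Planck units, with every order-one coefficient set to one, so the outputs are order-of-magnitude indicators and nothing more. The function name is hypothetical.

```python
import math

def schwarzschild_timescales(mass):
    """Scaling estimates in Planck units (G = hbar = c = k_B = 1), coefficients dropped."""
    beta = mass              # beta ~ M
    entropy = mass**2        # S_BH ~ M^2
    t_diss = beta
    t_star = beta * math.log(entropy)
    t_evap = mass**3
    return t_diss, t_star, t_evap

for mass in (1e5, 1e20, 1e38):  # 1e38 Planck masses is roughly one solar mass
    t_diss, t_star, t_evap = schwarzschild_timescales(mass)
    print(f"M = {mass:.0e}: t_*/t_diss ~ {t_star / t_diss:.0f}, "
          f"t_*/t_evap ~ {t_star / t_evap:.1e}")
```

The first ratio grows only logarithmically with the mass, while the second falls off as $\log S_{\rm BH}/S_{\rm BH}$.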

Exercise 3. Decoupling implies recovery

Let $QCDE$ be a pure state. Show that

I(Q:DE)=2S(Q)-I(Q:C).

Explain why $I(Q:C)\simeq 0$ implies that the purifier of $Q$ lies in $DE$ .

Solution

The mutual information between $Q$ and $DE$ is

I(Q:DE)=S(Q)+S(DE)-S(QDE).

Since $QCDE$ is pure,

S(DE)=S(QC), \qquad S(QDE)=S(C).

Thus

I(Q:DE)=S(Q)+S(QC)-S(C).

Now use

I(Q:C)=S(Q)+S(C)-S(QC),

which implies

S(QC)=S(Q)+S(C)-I(Q:C).

Substituting gives

I(Q:DE)=S(Q)+S(Q)+S(C)-I(Q:C)-S(C),

I(Q:DE)=2S(Q)-I(Q:C).

If $I(Q:C)\simeq 0$ , then

I(Q:DE)\simeq 2S(Q),

which is the maximum possible mutual information between $Q$ and any other system, attained only when that system contains a purification of $Q$. Therefore the entanglement of $Q$ is carried by $DE$. More precisely, Uhlmann’s theorem implies the existence of an isometry, or recovery map, acting on $DE$ that reconstructs a system entangled with $Q$ as the original message was.

Exercise 4. Qubit counting in the old-black-hole limit

Assume the decoupling error scales as

\epsilon_{\rm dec}\sim {d_A\over d_D}.

Alice throws in a $k$-qubit message. Bob collects $m$ late radiation qubits after the scrambling time. How large must $m$ be to make the error at most $\epsilon$?

Solution

A $k$-qubit message has Hilbert-space dimension

d_A=2^k.

If Bob collects $m$ late radiation qubits, then

d_D=2^m.

The assumed error estimate gives

\epsilon_{\rm dec}\sim {2^k\over 2^m}=2^{k-m}.

To have

\epsilon_{\rm dec}\lesssim \epsilon,

we need

2^{k-m}\lesssim \epsilon.

Taking logarithms,

k-m\lesssim \log_2\epsilon,

m\gtrsim k+\log_2{1\over \epsilon}.

Thus Bob needs only slightly more late radiation qubits than the number of message qubits, after the message has been scrambled.
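The counting can be packaged as a one-line helper. The sketch below takes the assumed scaling $\epsilon_{\rm dec}\sim 2^{k-m}$ at face value, with the order-one prefactor set to one; the function name is hypothetical.

```python
import math

def late_qubits_needed(k, epsilon):
    """Smallest integer m with 2**(k - m) <= epsilon, i.e. m >= k + log2(1/epsilon)."""
    return math.ceil(k + math.log2(1.0 / epsilon))

for k, eps in [(1, 0.5), (10, 1e-6), (100, 1e-9)]:
    m = late_qubits_needed(k, eps)
    print(f"k = {k}, epsilon = {eps:g}: m = {m} (overhead {m - k})")
```

The overhead $m-k$ depends only on the target error and not on the size of the message.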

Exercise 5. Why early radiation matters

Give an entropy-based explanation of why Bob’s access to early radiation $E$ changes the recovery problem.

Solution

Without early radiation, Bob tries to recover the message from $D$ alone. If $D$ is small, then in a typical unitary evolution the complementary system $C$ is much larger. The reference $Q$ remains mostly correlated with $C$ , so

I(Q:C)

is not small and decoupling fails.

With early radiation, Bob’s decoding system is $DE$ , not $D$ . For an old black hole, $E$ already purifies much of the black hole $B$ . After the message is scrambled, a small amount of late radiation $D$ can complete the information needed to reconstruct the message. In decoupling language, $Q$ becomes nearly uncorrelated with the remaining black hole $C$ :

I(Q:C)\simeq 0.

Then the purifier of $Q$ must be in $DE$ .

This is why the Hayden–Preskill mirror effect is specifically an old-black-hole phenomenon.

Exercise 6. Information-theoretic versus efficient recovery

Explain the difference between saying “the message is recoverable from the radiation” and saying “Bob can efficiently recover the message.” Why is this distinction important for firewalls?

Solution

The statement that the message is recoverable from the radiation usually means that there exists some recovery map acting on the radiation that reconstructs the message with small error. This is an information-theoretic statement. It follows from decoupling and does not by itself bound the computational complexity of the recovery map.

Efficient recovery is stronger. It requires a decoding algorithm whose time and resources are not enormous compared with the physical timescale of the problem. Harlow and Hayden argued that the decoding operations relevant to certain firewall thought experiments are generically exponentially hard in the black hole entropy. If so, an exterior observer cannot decode the early radiation quickly enough to perform the AMPS experiment before the black hole evaporates.

Thus the radiation may contain the information in principle while remaining practically, or even computationally, inaccessible to realistic observers.

