Hamiltonian systems with constraints

Audience: graduate students with prior exposure to Lagrangian/Hamiltonian mechanics and basic field theory.
Goal: learn the Dirac–Bergmann treatment of constrained Hamiltonian systems and see it in action for Maxwell theory, the Proca (massive vector) field, and general relativity (ADM).


In ordinary (regular) Lagrangian mechanics, the Legendre map

$$(q^a,\dot q^a)\mapsto (q^a,p_a),\qquad p_a=\frac{\partial L}{\partial\dot q^a}$$

is invertible because the Hessian

$$W_{ab}(q,\dot q)=\frac{\partial^2L}{\partial\dot q^a\,\partial\dot q^b}$$

has $\det W\neq 0$. One can solve $\dot q=\dot q(q,p)$ and define the Hamiltonian $H=p\dot q-L$.

A constrained system arises when the Hessian is singular: $\det W=0$. Then the momenta are not independent; they satisfy relations

$$\phi_\alpha(q,p)\approx 0,$$

called primary constraints. The symbol $\approx$ (“weak equality”) means the relation holds only on the constraint surface in phase space: compute Poisson brackets first and impose the constraints afterwards, never the other way round.
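As a minimal sketch of a singular Hessian, consider the toy Lagrangian $L=\tfrac12(\dot q^1+\dot q^2)^2-\tfrac12\big((q^1)^2+(q^2)^2\big)$. This model is an illustrative assumption, not one of the systems treated in these notes. Its momenta satisfy $p_1=p_2=\dot q^1+\dot q^2$, so $\phi=p_1-p_2\approx 0$ is a primary constraint. The hypothetical helper `velocity_hessian` below estimates $W_{ab}$ by finite differences and shows that its determinant vanishes.

```python
def lagrangian(q, v):
    # Toy model (assumption): only the combination v1 + v2 appears in the kinetic term.
    return 0.5 * (v[0] + v[1]) ** 2 - 0.5 * (q[0] ** 2 + q[1] ** 2)


def velocity_hessian(L, q, v, step=1e-3):
    """Central-difference estimate of W_ab = d^2 L / (dv_a dv_b)."""
    n = len(v)
    W = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            def shifted(sa, sb):
                w = list(v)
                w[a] += sa * step
                w[b] += sb * step
                return L(q, w)
            W[a][b] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * step ** 2)
    return W


W = velocity_hessian(lagrangian, [0.3, -0.2], [0.7, 1.1])
det_W = W[0][0] * W[1][1] - W[0][1] * W[1][0]
print("W =", W)
print("det W =", det_W)  # vanishes up to rounding error
```

Every entry of $W$ equals 1 here, so the two velocities cannot both be solved for in terms of the momenta.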

In gauge theories the singularity is not an accident: it reflects redundancy in the description (gauge symmetry). In massive theories (e.g. Proca) singularity can also occur because some variables are nondynamical multipliers even without gauge symmetry.


2. Dirac–Bergmann algorithm in a nutshell

For canonical fields $(q^a(\mathbf{x}),p_a(\mathbf{x}))$,

$$\{q^a(\mathbf{x}),p_b(\mathbf{y})\}=\delta^a{}_b\,\delta^{(3)}(\mathbf{x}-\mathbf{y}), \qquad \{q^a,q^b\}=\{p_a,p_b\}=0.$$

Given a Lagrangian density $\mathcal{L}(q,\dot q,\nabla q)$:

  1. Define canonical momenta

    $$p_a=\frac{\partial\mathcal{L}}{\partial \dot q^a}.$$

    Relations among $(q,p)$ that do not determine velocities are primary constraints $\phi_\alpha\approx 0$.

  2. Canonical Hamiltonian (Legendre transform where possible)

    $$\mathcal{H}_c = p_a\dot q^a - \mathcal{L}.$$

  3. Total Hamiltonian

    $$H_T = \int d^3x\left(\mathcal{H}_c + u^\alpha(\mathbf{x})\,\phi_\alpha(\mathbf{x})\right),$$

    where $u^\alpha$ are Lagrange multipliers enforcing the primary constraints.

  4. Consistency conditions. Require that the constraints be preserved under time evolution:

    $$\dot\phi_\alpha(\mathbf{x})=\{\phi_\alpha(\mathbf{x}),H_T\}\approx 0.$$

    This may:

    • produce secondary constraints, tertiary, etc.; and/or
    • fix some multipliers $u^\alpha$.

Repeat step 4 for every new constraint until no further constraints or conditions arise (closure).

Let $\{\Phi_A\}$ be the complete set of constraints.

  • First-class constraint: $\{\Phi_A,\Phi_B\}\approx 0$ for all $B$.
    These generate gauge transformations (redundancies).

  • Second-class constraint: the matrix (kernel)

    $$C_{AB}(\mathbf{x},\mathbf{y})=\{\Phi_A(\mathbf{x}),\Phi_B(\mathbf{y})\}$$

    is (functionally) invertible on the constraint surface.
    These do not generate gauge; they simply remove phase-space directions.

2.4 Dirac bracket (for second-class constraints)

If $\chi_A\approx 0$ are second-class and $C_{AB}$ is invertible with inverse $C^{AB}$,

$$\{F,G\}_D = \{F,G\} - \int d^3x\,d^3y\; \{F,\chi_A(\mathbf{x})\}\,C^{AB}(\mathbf{x},\mathbf{y})\,\{\chi_B(\mathbf{y}),G\}.$$

Then $\{F,\chi_A\}_D=0$ for all $F$, so you may set $\chi_A=0$ strongly after switching to Dirac brackets.
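A finite-dimensional sketch may make the Dirac bracket concrete. The toy phase space $(q_1,q_2,p_1,p_2)$ and the second-class pair $\chi_1=q_2$, $\chi_2=p_2$ are assumptions chosen for illustration. Linear functions are stored as coefficient lists, and `poisson` and `dirac` are hypothetical helper names.

```python
from fractions import Fraction

# Coordinates z = (q1, q2, p1, p2). A linear function f = a . z is stored as
# its coefficient list a, so that {f, g} = a^T J b with J the symplectic form.
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]


def poisson(a, b):
    return sum(Fraction(a[i]) * J[i][j] * b[j]
               for i in range(4) for j in range(4))


def dirac(a, b, constraints):
    """Dirac bracket for exactly two second-class constraints (2x2 matrix C)."""
    c1, c2 = constraints
    C = [[poisson(c1, c1), poisson(c1, c2)],
         [poisson(c2, c1), poisson(c2, c2)]]
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    if det == 0:
        raise ValueError("constraint matrix is singular: not second-class")
    C_inv = [[C[1][1] / det, -C[0][1] / det],
             [-C[1][0] / det, C[0][0] / det]]
    correction = sum(poisson(a, constraints[A]) * C_inv[A][B]
                     * poisson(constraints[B], b)
                     for A in range(2) for B in range(2))
    return poisson(a, b) - correction


q1, q2, p1, p2 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
chi = (q2, p2)

print(dirac(q1, p1, chi))  # the unconstrained pair keeps its canonical bracket
print(dirac(q2, p2, chi))  # the constrained pair drops out
```

In this toy case the Dirac bracket removes the pair $(q_2,p_2)$ and leaves $(q_1,p_1)$ canonical, which is the finite-dimensional analogue of setting $\chi_A=0$ strongly.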

Let $N$ be the number of configuration fields per space point (so phase space has dimension $2N$). Let $N_1$ be the number of first-class constraints and $N_2$ the number of second-class constraints (per point, in an appropriate local sense). Then

$$N_{\text{phys}} = N - N_1 - \frac12 N_2.$$

Equivalently, in phase space:

$$\dim \Gamma_{\text{phys}} = 2N - 2N_1 - N_2.$$
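The counting rule is easy to encode. The sketch below uses hypothetical function names, and the sample inputs are only illustrative triples $(N,N_1,N_2)$.

```python
from fractions import Fraction


def physical_dof(n_fields, n_first_class, n_second_class):
    """Configuration-space degrees of freedom per point: N - N1 - N2/2."""
    return Fraction(n_fields) - n_first_class - Fraction(n_second_class, 2)


def reduced_phase_space_dim(n_fields, n_first_class, n_second_class):
    """Dimension of the physical phase space per point: 2N - 2*N1 - N2."""
    return 2 * n_fields - 2 * n_first_class - n_second_class


for label, counts in [("4 fields, 2 first-class", (4, 2, 0)),
                      ("4 fields, 2 second-class", (4, 0, 2))]:
    print(label, physical_dof(*counts), reduced_phase_space_dim(*counts))
```

The second function always returns twice the first, which is a quick consistency check on any constraint classification.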

3. Worked example I: Maxwell theory (massless spin-1)

Take vacuum Maxwell in flat space:

$$\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu}, \qquad F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu.$$

With signature $(-+++)$,

$$F_{\mu\nu}F^{\mu\nu}=-2F_{0i}F_{0i}+F_{ij}F_{ij}.$$

Define

$$F_{0i}=\dot A_i-\partial_iA_0, \qquad B^k=\frac12\epsilon^{kij}F_{ij}, \qquad\text{so that}\qquad F_{ij}F_{ij}=2\mathbf{B}^2.$$

Then

$$\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2-\frac12\mathbf{B}^2 = \frac12\mathbf{E}^2-\frac12\mathbf{B}^2, \quad E_i\equiv F_{0i}.$$

3.2 Canonical momenta and primary constraint

Define

$$\pi^\mu=\frac{\partial\mathcal{L}}{\partial \dot A_\mu}.$$

There is no $\dot A_0$ in $\mathcal{L}$, hence

$$\pi^0 = 0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0 \qquad\text{(primary constraint).}$$

For $i=1,2,3$,

$$\pi^i=\frac{\partial\mathcal{L}}{\partial \dot A_i} =\dot A_i-\partial_iA_0 =F_{0i} =E_i.$$

Thus $\pi^i$ is the electric field.

Compute the Hamiltonian density

$$\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}.$$

Solve $\dot A_i=\pi^i+\partial_iA_0$:

$$\pi^i\dot A_i = \pi^2+\pi^i\partial_iA_0.$$

Since $\mathcal{L}=\tfrac12\pi^2-\tfrac12\mathbf{B}^2$,

$$\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2+\pi^i\partial_iA_0.$$

Integrating by parts (dropping boundary terms),

$$\int d^3x\,\pi^i\partial_iA_0 = -\int d^3x\,A_0\,\partial_i\pi^i,$$

so the canonical Hamiltonian is

$$H_c = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) - A_0\,\partial_i\pi^i \right].$$

3.4 Total Hamiltonian and Gauss constraint

Add the primary constraint with multiplier $u(\mathbf{x})$:

$$H_T=H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).$$

Preserve $\pi^0\approx 0$:

$$\dot\pi^0(\mathbf{x}) = \{\pi^0(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) \approx 0.$$

This yields the secondary constraint

$$\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})\approx 0,$$

i.e. Gauss’s law in vacuum.

No further constraints arise; the multiplier $u(\mathbf{x})$ remains undetermined.

3.5 Constraint algebra and first-class nature

Using canonical brackets,

$$\{\pi^0(\mathbf{x}),\phi_2(\mathbf{y})\}=0, \qquad \{\phi_2(\mathbf{x}),\phi_2(\mathbf{y})\}=0.$$

Thus $\phi_1$ and $\phi_2$ are first-class.

3.6 Gauge generator (one gauge function, two constraints)

A gauge parameter is a function $\epsilon(t,\mathbf{x})$. The appropriate generator can be written (Castellani’s algorithm) as

$$G[\epsilon] = \int d^3x\Big(-\dot\epsilon(\mathbf{x})\,\pi^0(\mathbf{x})+\epsilon(\mathbf{x})\,\phi_2(\mathbf{x})\Big).$$

Then

$$\delta A_0=\{A_0,G\}=-\dot\epsilon, \qquad \delta A_i=\{A_i,G\}=-\partial_i\epsilon,$$

so $\delta A_\mu=-\partial_\mu\epsilon$, the usual $U(1)$ gauge symmetry.
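As a quick sanity check that this transformation leaves the field strength untouched, here is a lattice sketch in 1+1 dimensions. The periodic grid, the forward-difference derivatives and the integer test data are all assumptions made for illustration. Because the discrete derivatives commute, $F_{01}=\partial_tA_1-\partial_xA_0$ is exactly invariant under $\delta A_\mu=-\partial_\mu\epsilon$.

```python
T, X = 5, 6  # periodic lattice extent in t and x (arbitrary sizes)


def dt(f):
    """Forward difference in t on the periodic lattice."""
    return [[f[(t + 1) % T][x] - f[t][x] for x in range(X)] for t in range(T)]


def dx(f):
    """Forward difference in x on the periodic lattice."""
    return [[f[t][(x + 1) % X] - f[t][x] for x in range(X)] for t in range(T)]


def sub(f, g):
    return [[f[t][x] - g[t][x] for x in range(X)] for t in range(T)]


def field_strength(A0, A1):
    """Lattice version of F_01 = dt A_1 - dx A_0."""
    return sub(dt(A1), dx(A0))


# Arbitrary integer-valued test data, so that all arithmetic is exact.
A0 = [[(3 * t + x * x) % 7 for x in range(X)] for t in range(T)]
A1 = [[(t * t + 5 * x) % 11 for x in range(X)] for t in range(T)]
eps = [[(2 * t * x + t + 3) % 5 for x in range(X)] for t in range(T)]

# Gauge transformation delta A_mu = -d_mu eps.
A0_new = sub(A0, dt(eps))
A1_new = sub(A1, dx(eps))

print(field_strength(A0_new, A1_new) == field_strength(A0, A1))
```

The potentials themselves change under the transformation, while the electric field built from them does not.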

We have $N=4$ configuration fields $A_\mu$. Constraints: $N_1=2$ first-class and $N_2=0$. Hence

$$N_{\text{phys}} = 4 - 2 = 2,$$

the two transverse photon polarizations.

3.8 Optional: explicit reduction in Coulomb gauge

To see the “constraint + gauge” removal explicitly, decompose

$$A_i = A_i^T + \partial_i\alpha, \qquad \partial_iA_i^T=0,$$

$$\pi^i = \pi_T^i + \partial^i\beta, \qquad \partial_i\pi_T^i=0.$$

Then Gauss’s constraint is $\partial_i\pi^i=\nabla^2\beta\approx 0$, which removes the longitudinal momentum $\beta$ (up to boundary conditions). The longitudinal coordinate $\alpha$ is removed by gauge.

If you impose Coulomb gauge $\chi\equiv \partial_iA_i\approx 0$, then $(\chi,\phi_2)$ form a second-class pair:

$$\{\chi(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\partial_iA_i(\mathbf{x}),\partial_j\pi^j(\mathbf{y})\} = \partial^x_i\partial^y_i\,\delta^{(3)}(\mathbf{x}-\mathbf{y}) = -\nabla^2\delta^{(3)}(\mathbf{x}-\mathbf{y}),$$

where the sign arises because $\partial^y_i\delta^{(3)}(\mathbf{x}-\mathbf{y})=-\partial^x_i\delta^{(3)}(\mathbf{x}-\mathbf{y})$. This kernel is invertible (as an operator) after specifying boundary conditions. The reduced Hamiltonian becomes a Hamiltonian for the transverse fields only:

$$H_{\text{red}}=\int d^3x\,\frac12\big(\pi_T^2+\mathbf{B}^2\big).$$

4. Worked example II: Proca (massive spin-1)

With our signature $(-+++)$, a convenient Proca Lagrangian that leads to a positive-energy Hamiltonian is

$$\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu} -\frac12 m^2 A_\mu A^\mu.$$

The field equation is

$$\partial_\mu F^{\mu\nu} - m^2 A^\nu = 0,$$

and taking $\partial_\nu$ gives the on-shell constraint

$$\partial_\nu A^\nu = 0 \quad (m\neq 0).$$

Using $A_\mu A^\mu=-A_0^2+A_i^2$, we get

$$-\frac12 m^2 A_\mu A^\mu = +\frac12 m^2 A_0^2 - \frac12 m^2 A_i^2.$$

Therefore

$$\mathcal{L} = \frac12(\dot A_i-\partial_iA_0)^2 -\frac12\mathbf{B}^2 +\frac12 m^2 A_0^2 -\frac12 m^2 A_i^2.$$

4.3 Canonical momenta and primary constraint

Exactly as in Maxwell,

$$\pi^0=\frac{\partial\mathcal{L}}{\partial\dot A_0}=0 \quad\Rightarrow\quad \phi_1(\mathbf{x})\equiv \pi^0(\mathbf{x})\approx 0,$$

and

$$\pi^i=\frac{\partial\mathcal{L}}{\partial\dot A_i}=\dot A_i-\partial_iA_0.$$

Compute

$$\mathcal{H}_c=\pi^i\dot A_i-\mathcal{L}, \qquad \dot A_i=\pi^i+\partial_iA_0.$$

One finds

$$\mathcal{H}_c = \frac12\pi^2+\frac12\mathbf{B}^2 +\frac12 m^2 A_i^2 +\pi^i\partial_iA_0 -\frac12 m^2 A_0^2.$$

Integrating by parts,

$$H_c=\int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 - A_0\,\partial_i\pi^i -\frac12 m^2 A_0^2 \right].$$

4.5 Total Hamiltonian and secondary constraint

Total Hamiltonian:

$$H_T = H_c+\int d^3x\,u(\mathbf{x})\,\pi^0(\mathbf{x}).$$

Preserve $\pi^0\approx 0$:

$$\dot\pi^0(\mathbf{x}) = -\frac{\delta H_T}{\delta A_0(\mathbf{x})} = \partial_i\pi^i(\mathbf{x}) + m^2 A_0(\mathbf{x}) \approx 0.$$

So the secondary constraint is

$$\phi_2(\mathbf{x})\equiv \partial_i\pi^i(\mathbf{x})+m^2A_0(\mathbf{x})\approx 0.$$

4.6 Second-class structure (no gauge symmetry)

Compute the constraint bracket:

$$\{\phi_1(\mathbf{x}),\phi_2(\mathbf{y})\} = \{\pi^0(\mathbf{x}),\partial_i\pi^i(\mathbf{y})+m^2A_0(\mathbf{y})\} = -m^2\delta^{(3)}(\mathbf{x}-\mathbf{y})\neq 0.$$

Thus $\phi_1,\phi_2$ are second-class: there is no gauge symmetry. Consistency now fixes the multiplier $u$ rather than leaving it arbitrary.
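A short sketch of how the multiplier gets fixed, using the canonical Hamiltonian $H_c$ of this section. The intermediate steps are our own working rather than part of the original derivation. From the total Hamiltonian,

$$\dot\pi^i=\{\pi^i,H_T\}=-\frac{\delta H_T}{\delta A_i}=\partial_jF_{ji}-m^2A_i, \qquad \dot A_0=\{A_0,H_T\}=u.$$

Since $\partial_i\partial_jF_{ji}=0$ by antisymmetry, preserving $\phi_2$ requires

$$\dot\phi_2=\partial_i\dot\pi^i+m^2\dot A_0=-m^2\,\partial_iA_i+m^2u\approx 0 \quad\Rightarrow\quad u=\partial_iA_i.$$

With $\dot A_0=u$ this reads $-\dot A_0+\partial_iA_i=0$, which is the on-shell condition $\partial_\nu A^\nu=0$ in signature $(-+++)$.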

4.7 Eliminating $A_0$ and the reduced Hamiltonian

The constraint $\phi_2=0$ is algebraic in $A_0$:

$$A_0 = -\frac{1}{m^2}\partial_i\pi^i.$$

Substitute into the Hamiltonian. The $A_0$-dependent terms combine as

$$- A_0\,\partial_i\pi^i - \frac12 m^2 A_0^2 = \frac{1}{2m^2}(\partial_i\pi^i)^2.$$

Hence the reduced Hamiltonian is

$$H_{\text{red}} = \int d^3x\left[ \frac12(\pi^2+\mathbf{B}^2) +\frac12 m^2 A_i^2 +\frac{1}{2m^2}(\partial_i\pi^i)^2 \right],$$

which is manifestly bounded below.
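The elimination of $A_0$ above is pure algebra at each point, so it can be checked with exact rational arithmetic. In the sketch below, `div_pi` stands for the local value of $\partial_i\pi^i$ and `m2` for $m^2$. The sample numbers and function names are arbitrary assumptions.

```python
from fractions import Fraction


def a0_terms(a0, div_pi, m2):
    """The A_0-dependent part of the Proca Hamiltonian density."""
    return -a0 * div_pi - Fraction(1, 2) * m2 * a0 ** 2


def a0_terms_reduced(div_pi, m2):
    """The same terms after substituting A_0 = -div_pi / m^2."""
    return div_pi ** 2 / (2 * m2)


m2 = Fraction(9, 4)
div_pi = Fraction(-7, 3)
a0_on_shell = -div_pi / m2

print(a0_terms(a0_on_shell, div_pi, m2), a0_terms_reduced(div_pi, m2))
```

Because the coefficient of $A_0^2$ is negative, the constraint value of $A_0$ is a stationary point (a maximum) of these terms, which is why $A_0$ has to be eliminated rather than treated as an independent variable when judging positivity.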

Here $N=4$, $N_1=0$, $N_2=2$. Thus

$$N_{\text{phys}}=4-\frac12\cdot 2 = 3,$$

corresponding to helicities $-1,0,+1$ of a massive spin-1 particle.

4.9 Optional: Stückelberg trick and “gauge vs second-class”

Introduce a scalar $\varphi$ and replace

$$A_\mu \to A_\mu + \frac{1}{m}\partial_\mu\varphi.$$

Then the Proca mass term becomes gauge invariant under

$$\delta A_\mu=\partial_\mu\lambda, \qquad \delta\varphi = -m\lambda.$$

The theory is now gauge invariant (first-class constraints reappear), but it contains an extra field $\varphi$. After gauge fixing (e.g. $\varphi=0$) you recover Proca with 3 physical DOF. This is a useful conceptual bridge: second-class constraints can be viewed as gauge-fixed first-class systems (under appropriate extensions).


5. General relativity as a constrained Hamiltonian system (ADM)

The Hamiltonian formulation of GR is the prototype of a field theory with:

  • singular Lagrangian (lapse and shift are nondynamical),
  • first-class constraints (encoding diffeomorphism invariance),
  • a nontrivial constraint algebra with structure functions (hypersurface deformation algebra).

We sketch the derivation carefully enough to see where each ingredient comes from.

5.1 Einstein–Hilbert action and 3+1 split

Start from the Einstein–Hilbert action (with cosmological constant $\Lambda$)

$$S = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,(R-2\Lambda) + S_{\text{boundary}}.$$

The boundary term (e.g. Gibbons–Hawking–York) ensures a well-posed variational principle when fixing the induced metric on the boundary.

Assume spacetime is foliated by spacelike hypersurfaces $\Sigma_t$, with coordinates $x^i$ on each $\Sigma_t$. The spacetime metric can be written in ADM form:

$$ds^2 = -N^2dt^2 + h_{ij}(dx^i+N^i dt)(dx^j+N^j dt),$$

where:

  • $h_{ij}(t,\mathbf{x})$ is the induced spatial metric on $\Sigma_t$,
  • $N(t,\mathbf{x})$ is the lapse,
  • $N^i(t,\mathbf{x})$ is the shift (with $N_i=h_{ij}N^j$).

A useful identity:

$$\sqrt{-g}=N\sqrt{h},$$

where $h=\det(h_{ij})$.

Define the covariant derivative $D_i$ compatible with $h_{ij}$: $D_k h_{ij}=0$.

The extrinsic curvature of $\Sigma_t$ embedded in spacetime is

$$K_{ij} = \frac{1}{2N}\left(\dot h_{ij}-D_iN_j-D_jN_i\right), \qquad K=h^{ij}K_{ij}.$$

This shows explicitly that $\dot h_{ij}$ appears linearly in $K_{ij}$, while $\dot N$ and $\dot N^i$ do not appear at all.

A standard result of the Gauss–Codazzi decomposition (up to total derivatives absorbed by $S_{\text{boundary}}$) is:

$$\sqrt{-g}\,R = N\sqrt{h}\left({}^{(3)}R + K_{ij}K^{ij} - K^2\right) +\text{(total derivative)}.$$

Therefore, dropping total derivatives already accounted for by boundary terms, the ADM Lagrangian density is

$$\mathcal{L}_{\text{ADM}} = \frac{\sqrt{h}}{16\pi G}\,N\left({}^{(3)}R + K_{ij}K^{ij} - K^2 - 2\Lambda\right).$$

The canonical momentum conjugate to $h_{ij}$ is

$$\pi^{ij}(\mathbf{x}) = \frac{\partial \mathcal{L}_{\text{ADM}}}{\partial \dot h_{ij}(\mathbf{x})}.$$

Since $\dot h_{ij}$ enters only through $K_{ij}$, and

$$\frac{\partial K_{kl}}{\partial \dot h_{ij}}=\frac{1}{2N}\delta^i{}_{(k}\delta^j{}_{l)},$$

one finds

$$\pi^{ij} = \frac{\sqrt{h}}{16\pi G}\left(K^{ij}-h^{ij}K\right).$$

Taking the trace $\pi\equiv h_{ij}\pi^{ij}$ gives

$$\pi = -\frac{\sqrt{h}}{16\pi G}\,2K \quad\Rightarrow\quad K = -\frac{8\pi G}{\sqrt{h}}\,\pi.$$

You can invert to express $K_{ij}$ in terms of $\pi^{ij}$:

$$K_{ij} = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}-\frac12 h_{ij}\pi\right).$$
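The inversion can be checked numerically at a single point. The sketch below assumes a flat slice, $h_{ij}=\delta_{ij}$, so that index positions do not matter, and uses a single constant `c` as a stand-in for $\sqrt{h}/(16\pi G)$. Both choices, as well as the function names and the sample matrix, are illustrative assumptions.

```python
from fractions import Fraction


def trace(m):
    return sum(m[i][i] for i in range(3))


def momentum_from_K(K, c):
    """pi_ij = c * (K_ij - delta_ij * K), with c standing in for sqrt(h)/(16 pi G)."""
    k = trace(K)
    return [[c * (K[i][j] - (k if i == j else 0)) for j in range(3)]
            for i in range(3)]


def K_from_momentum(pi, c):
    """K_ij = (1/c) * (pi_ij - (1/2) * delta_ij * pi)."""
    p = trace(pi)
    return [[(pi[i][j] - (p / 2 if i == j else 0)) / c for j in range(3)]
            for i in range(3)]


c = Fraction(3, 7)
K = [[1, 2, 0],
     [2, -3, 5],
     [0, 5, 4]]  # arbitrary symmetric sample for the extrinsic curvature

pi = momentum_from_K(K, c)
print(trace(pi) == -2 * c * trace(K))  # trace relation pi = -2 c K
print(K_from_momentum(pi, c) == K)     # round trip recovers K
```

The factor $\tfrac12$ in the inverse is specific to three spatial dimensions, where the trace of $K_{ij}-h_{ij}K$ is $-2K$.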

Primary constraints: because $\dot N$ and $\dot N^i$ do not appear in $\mathcal{L}_{\text{ADM}}$,

$$\pi_N(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N}=0,\qquad \pi_i(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial \dot N^i}=0.$$

Thus,

$$\pi_N\approx 0,\qquad \pi_i\approx 0$$

are primary constraints.

5.5 Canonical Hamiltonian and the ADM constraints

The canonical Hamiltonian is

$$H_c=\int d^3x\,\left(\pi^{ij}\dot h_{ij}-\mathcal{L}_{\text{ADM}}\right),$$

where $\dot h_{ij}$ should be expressed using

$$\dot h_{ij}=2NK_{ij}+D_iN_j+D_jN_i.$$

A key computation uses integration by parts:

$$\int d^3x\,\pi^{ij}(D_iN_j+D_jN_i) = -2\int d^3x\,N_j D_i\pi^{ij}$$

(up to boundary terms). After rewriting $K_{ij}$ in terms of $\pi^{ij}$, one arrives at the standard ADM form

$$H_c = \int d^3x\left( N\,\mathcal{H}_\perp + N^i\,\mathcal{H}_i\right) + H_{\partial\Sigma}.$$

Here $H_{\partial\Sigma}$ is a boundary term (e.g. ADM energy for asymptotically flat spacetimes). The bulk constraint densities are:

Momentum (diffeomorphism) constraint

  $$\boxed{\;\mathcal{H}_i = -2 D_j \pi_i{}^{j}\;}$$

Hamiltonian (scalar) constraint

  $$\boxed{\; \mathcal{H}_\perp = \frac{16\pi G}{\sqrt{h}}\left(\pi_{ij}\pi^{ij}-\frac12\pi^2\right) -\frac{\sqrt{h}}{16\pi G}\left({}^{(3)}R-2\Lambda\right) \;}$$

(up to convention-dependent signs and factors).

5.6 Total Hamiltonian and secondary constraints

The total Hamiltonian adds the primary constraints:

$$H_T = H_c + \int d^3x\left(u\,\pi_N + u^i \pi_i\right).$$

Preserving $\pi_N\approx 0$ gives

$$\dot\pi_N(\mathbf{x}) = \{\pi_N(\mathbf{x}),H_T\} = -\frac{\delta H_T}{\delta N(\mathbf{x})} = -\mathcal{H}_\perp(\mathbf{x}) \approx 0,$$

hence the Hamiltonian constraint

$$\mathcal{H}_\perp(\mathbf{x})\approx 0.$$

Preserving $\pi_i\approx 0$ gives

$$\dot\pi_i(\mathbf{x}) = -\frac{\delta H_T}{\delta N^i(\mathbf{x})} = -\mathcal{H}_i(\mathbf{x}) \approx 0,$$

hence the momentum constraints

$$\mathcal{H}_i(\mathbf{x})\approx 0.$$

No new independent constraints appear beyond these (for pure GR), and the multipliers $u$, $u^i$ remain undetermined, as expected for a theory with gauge (diffeomorphism) invariance.

5.7 Smeared constraints and the constraint algebra

It is cleaner to use smeared functionals:

$$\mathcal{H}[N] \equiv \int d^3x\,N(\mathbf{x})\,\mathcal{H}_\perp(\mathbf{x}), \qquad \mathcal{D}[\vec{N}] \equiv \int d^3x\,N^i(\mathbf{x})\,\mathcal{H}_i(\mathbf{x}).$$

Then (schematically) the Poisson brackets close as:

$$\{\mathcal{D}[\vec{N}],\mathcal{D}[\vec{M}]\} = \mathcal{D}[\mathcal{L}_{\vec{N}}\vec{M}],$$

$$\{\mathcal{D}[\vec{N}],\mathcal{H}[M]\} = \mathcal{H}[\mathcal{L}_{\vec{N}}M],$$

$$\{\mathcal{H}[N],\mathcal{H}[M]\} = \mathcal{D}\!\left[h^{ij}(N\partial_j M - M\partial_j N)\right].$$

This is the hypersurface deformation algebra (often called “Dirac algebra”). It is not a Lie algebra with constant structure constants; it has structure functions involving $h^{ij}$.

The closure implies that $\mathcal{H}_\perp$ and $\mathcal{H}_i$ are first-class (together with the primary constraints $\pi_N,\pi_i$).

  • $\mathcal{D}[\vec{N}]$ generates spatial diffeomorphisms on $\Sigma_t$:

    $$\{h_{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}h_{ij}, \qquad \{\pi^{ij},\mathcal{D}[\vec{N}]\}=\mathcal{L}_{\vec{N}}\pi^{ij}.$$

  • $\mathcal{H}[N]$ generates normal deformations of the hypersurface (time reparametrizations / refoliations).

A precise mapping between $(\mathcal{H}_\perp,\mathcal{H}_i)$ and spacetime diffeomorphisms requires care, because the algebra closes with structure functions; nonetheless, the standard viewpoint is that these first-class constraints encode the redundancy under diffeomorphisms.

In 3+1D, the configuration variable $h_{ij}$ is a symmetric $3\times 3$ tensor: $N=6$ per point.

Constraints:

  • primary: $\pi_N\approx 0$ (1), $\pi_i\approx 0$ (3),
  • secondary: $\mathcal{H}_\perp\approx 0$ (1), $\mathcal{H}_i\approx 0$ (3).

Altogether there are 8 constraints; however, the standard DOF counting for GR focuses on the true canonical pair $(h_{ij},\pi^{ij})$ and treats $N,N^i$ as multipliers. (Counting all ten fields $h_{ij},N,N^i$ together with all 8 first-class constraints gives the same answer, $10-8=2$.)

On the $(h_{ij},\pi^{ij})$ phase space:

  • $2N = 12$ phase-space dimensions per point,
  • there are $4$ independent first-class constraints $(\mathcal{H}_\perp,\mathcal{H}_i)$.

Thus

$$\dim \Gamma_{\text{phys}} = 12 - 2\times 4 = 4, \qquad N_{\text{phys}} = \frac{4}{2}=2.$$

These are the two polarizations of the graviton (gravitational waves) in 4D.

5.10 Linearized check (TT gauge intuition)

Linearize around Minkowski: $g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$. In harmonic gauge one can reduce to transverse-traceless (TT) components $h_{ij}^{\text{TT}}$ satisfying a wave equation. The TT condition removes gauge redundancy and constraints, leaving two propagating modes, consistent with the Hamiltonian count above.

Details: harmonic gauge ⇒ TT wave equation and the “2 polarizations” count

Start from

$$g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu},\qquad |h_{\mu\nu}|\ll 1.$$

Infinitesimal diffeomorphisms $x^\mu\to x^\mu-\xi^\mu(x)$ act as a gauge symmetry:

$$\delta h_{\mu\nu}=\partial_\mu\xi_\nu+\partial_\nu\xi_\mu.$$

It is convenient to use the trace-reversed field

$$h\equiv \eta^{\mu\nu}h_{\mu\nu},\qquad \bar h_{\mu\nu}\equiv h_{\mu\nu}-\frac12\,\eta_{\mu\nu}h.$$

The harmonic (Lorenz) gauge condition is

$$\partial^\mu \bar h_{\mu\nu}=0.$$

In this gauge, the vacuum linearized Einstein equations simplify to

$$\Box\,\bar h_{\mu\nu}=0,\qquad \Box\equiv \eta^{\rho\sigma}\partial_\rho\partial_\sigma.$$

So the radiative degrees of freedom propagate as massless waves.

Harmonic gauge does not fully fix the gauge: it is preserved by residual transformations with

$$\Box\,\xi^\mu=0.$$

For a plane wave $\bar h_{\mu\nu}=\varepsilon_{\mu\nu}e^{ik\cdot x}$, the field equation gives $k^2=0$ and the gauge condition gives transversality $k^\mu\varepsilon_{\mu\nu}=0$. Using the residual gauge freedom one can impose the stronger TT conditions:

$$h_{0\mu}=0,\qquad \partial^i h_{ij}=0,\qquad h^i{}_i=0.$$

For a wave moving along $z$, the only nonzero components then live in the $2\times2$ block transverse to the propagation direction and satisfy

$$\Box\,h^{TT}_{ij}=0.$$

A standard basis is

$$h^{TT}_{xx}=-h^{TT}_{yy}\equiv h_+,\qquad h^{TT}_{xy}=h^{TT}_{yx}\equiv h_\times,$$

i.e. the two gravitational-wave polarizations.
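The TT conditions on the spatial polarization matrix are simple enough to check directly. The sketch below takes a wave vector along $z$ and unit-amplitude polarization matrices. These sample values and the helper name `is_tt` are assumptions for illustration.

```python
def contract_with_k(h, k):
    """k^i h_ij for a spatial 3x3 matrix h."""
    return [sum(k[i] * h[i][j] for i in range(3)) for j in range(3)]


def is_tt(h, k):
    """True if h is symmetric, traceless and transverse to k."""
    symmetric = all(h[i][j] == h[j][i] for i in range(3) for j in range(3))
    traceless = sum(h[i][i] for i in range(3)) == 0
    transverse = all(c == 0 for c in contract_with_k(h, k))
    return symmetric and traceless and transverse


k = [0, 0, 1]  # propagation along z
h_plus = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
h_cross = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
h_longitudinal = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]  # fails both conditions

for name, h in [("plus", h_plus), ("cross", h_cross),
                ("longitudinal", h_longitudinal)]:
    print(name, is_tt(h, k))
```

A symmetric $3\times 3$ matrix has six components; transversality removes three and tracelessness one, leaving the two modes found above.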

This is the linearized counterpart of the Hamiltonian counting: the constraints remove non-propagating components and the gauge symmetry quotients out redundancies, leaving two physical modes (in 4D), exactly as in §5.9.


6. Conceptual comparison: Maxwell vs Proca vs GR

6.1 Why first-class “removes more” than second-class

A single first-class constraint does two things:

  1. it restricts to the constraint surface, and
  2. it generates a gauge flow—points along that flow are physically equivalent.

Therefore each first-class constraint removes two phase-space dimensions (one for the surface, one for the orbit), while each second-class constraint removes only one.

  • Maxwell: constraints are first-class $\Rightarrow$ gauge symmetry $\Rightarrow$ 2 physical DOF.
  • Proca: constraints are second-class $\Rightarrow$ no gauge symmetry $\Rightarrow$ 3 physical DOF.
  • GR: constraints are first-class $\Rightarrow$ diffeomorphism redundancy $\Rightarrow$ 2 physical DOF in 4D.

Both Maxwell and Proca have $\pi^0=0$ because $A_0$ has no time derivative.
The difference is what happens next:

  • Maxwell: preservation produces Gauss law $\partial_i\pi^i\approx 0$ (first-class).
  • Proca: preservation produces $\partial_i\pi^i+m^2A_0\approx 0$ (second-class with $\pi^0$).

In GR, lapse and shift are nondynamical: $\pi_N=\pi_i=0$. Their preservation produces Hamiltonian/momentum constraints, all first-class.


  1. Maxwell with sources: Add $-A_\mu J^\mu$ to the Maxwell Lagrangian and show that Gauss’s law becomes $\partial_i\pi^i=\rho$. Discuss which parts of the constraint structure change.

  2. Coulomb gauge Dirac bracket: In Maxwell theory, impose Coulomb gauge $\partial_iA_i=0$ and compute the Dirac bracket for the transverse fields.

  3. Proca Dirac bracket: Treat $\pi^0$ and $\partial_i\pi^i+m^2A_0$ as second-class and compute the Dirac bracket for $A_0$ and $A_i$.

  4. ADM constraint algebra: Verify at least one of the ADM bracket relations using smeared constraints and integration by parts.

  5. GR DOF in $D$ dimensions: Generalize the ADM DOF count to $D$ spacetime dimensions and show that the number of propagating graviton DOF is $D(D-3)/2$.


  • P. A. M. Dirac, Lectures on Quantum Mechanics (constrained Hamiltonian systems).
  • M. Henneaux & C. Teitelboim, Quantization of Gauge Systems (modern, systematic).
  • K. Sundermeyer, Constrained Dynamics (classic).
  • R. M. Wald, General Relativity (ADM, constraints, canonical structure).
  • E. Poisson, A Relativist’s Toolkit (3+1 tools, extrinsic curvature).
  • Arnowitt–Deser–Misner (ADM) original papers/lectures (historical source).
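  • Liang Canbin (梁灿彬) & Zhou Bin (周彬), 微分几何入门与广义相对论 (Introduction to Differential Geometry and General Relativity), Vol. 2, 2nd ed., Science Press (科学出版社).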