week-7-relativistic-quantum-mechanics

Learning goals:
- Connection between spin and angular momentum
- Non-relativistic limit of the Dirac equation
- Existence of antiparticles, many-body viewpoint
- Chirality, helicity
- Dirac equation in a classical electromagnetic field, first relativistic corrections

Angular momentum and spin

The two-fold degeneracy of $E$ indicates that there must be another observable which commutes with $H_D$ and $\mathbf p$. The Hamiltonian is rotationally symmetric, so we might try the orbital angular momentum operator \[ \hat{\mathbf L}=(\hat{\mathbf x}\times\hat{\mathbf p}), \] which has the components $\hat L^i = \varepsilon^{ijk} \hat x^j \hat p^k$, but it is not a constant of motion: \[ [\hat H_D, \hat{\mathbf L}] = -i {\bm \alpha} \times \hat{\mathbf p}, \] in component notation $[\hat H_D, \hat L^i] = -i \varepsilon^{ijk} \alpha^j \hat p^k$.

We define the spin operator in the Dirac-Pauli representation as \[ \hat{\mathbf S} = \frac{1}{2}\begin{pmatrix} \bm\sigma & 0\\ 0 & \bm\sigma \end{pmatrix}. \] This choice can be justified by referring to the eigenstates of the Dirac equation in the rest frame. However, spin by itself is not a good quantum number either: \[ [\hat H_D, \hat{\mathbf S}] = i {\bm \alpha}\times \hat{\mathbf p}. \] The two commutators cancel, so the total angular momentum $\hat{\mathbf J}=\hat{\mathbf L}+\hat{\mathbf S}$ commutes with $\hat H_D$: \[ [\hat H_D, \hat J^i]=0, \] and $\hat{\mathbf J}$ is a constant of motion. This also means that $\hat{\mathbf J}^2$, $\hat J^z$ and $\hat H_D$ have common eigenstates, so $j$ and $m_j$ are good quantum numbers. The form of the total angular momentum operator also shows that the Dirac equation indeed describes spin-½ particles, as we have claimed.
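The spin commutator can be checked numerically. The following is a minimal sketch, assuming units with $c=\hbar=1$ and treating the momentum as an ordinary vector of numbers (which is valid for a plane wave). The test momentum, the mass and all function names are illustrative choices, not part of the lecture material. The orbital part $[\hat H_D,\hat{\mathbf L}]$ involves the position operator and is not covered by this matrix check.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac-Pauli representation: alpha^i = [[0, s^i], [s^i, 0]], beta = diag(I, -I)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])
# Spin operator S^i = (1/2) diag(s^i, s^i)
S = [0.5 * np.block([[s, Z2], [Z2, s]]) for s in sigma]


def dirac_hamiltonian(p, m):
    """H_D = alpha . p + beta m for a c-number momentum p (c = hbar = 1)."""
    return sum(p[i] * alpha[i] for i in range(3)) + m * beta


def commutator(A, B):
    return A @ B - B @ A


p = np.array([0.3, -1.2, 0.7])  # arbitrary test momentum (assumption)
m = 1.0                         # arbitrary test mass (assumption)
H = dirac_hamiltonian(p, m)

# Components of alpha x p, written out explicitly
alpha_cross_p = [
    alpha[1] * p[2] - alpha[2] * p[1],
    alpha[2] * p[0] - alpha[0] * p[2],
    alpha[0] * p[1] - alpha[1] * p[0],
]

# Check [H_D, S^i] = i (alpha x p)^i component by component
spin_commutator_ok = all(
    np.allclose(commutator(H, S[i]), 1j * alpha_cross_p[i]) for i in range(3)
)
print("[H_D, S] = i alpha x p holds:", spin_commutator_ok)
```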

However, now the problem is that we have added too many quantum numbers to describe just a two-fold degeneracy. Thus $\mathbf p$, $j$ and $m_j$ are not all independent of each other.

It turns out that we only need to consider helicity, the projection of the angular momentum along the momentum direction (in effect, we are choosing the quantization axis of $\hat{\mathbf J}$ along the particle motion): \[ \hat h = \frac{\hat{\mathbf J}\cdot\hat{\mathbf p}}{|\hat{\mathbf p}|}= \frac{\hat{\mathbf S}\cdot\hat{\mathbf p}}{|\hat{\mathbf p}|}, \] where the orbital contribution vanishes since $\hat{\mathbf L}\cdot\hat{\mathbf p} = 0$. Helicity can thus equally be defined as the projection of the spin along $\mathbf p$. The helicity operator has the same eigenvalues as a spin component: $h=\pm1/2$.

The complete set of eigenstates of Dirac Hamiltonian can be described by momentum $\mathbf p$, helicity $h$ and the sign of energy $\alpha$. (Exercise: write down the plane wave solutions with quantum numbers $\mathbf p$, $h$ and $\alpha$.)
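As a sanity check of this labelling, the sketch below builds the $4\times4$ matrices $H_D(\mathbf p)$ and $\hat h(\mathbf p)$ for one fixed momentum and confirms that they commute, that the energies are $\pm E_{\mathbf p}$ (each twice) and that the helicities are $\pm\frac12$ (each twice). It assumes $c=\hbar=1$; the numerical momentum and mass are arbitrary illustrative values.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])
S = [0.5 * np.block([[s, Z2], [Z2, s]]) for s in sigma]


def hamiltonian_and_helicity(p, m):
    """Return H_D(p) = alpha.p + beta m and h(p) = S.p/|p| as 4x4 matrices."""
    p = np.asarray(p, dtype=float)
    H = sum(p[i] * alpha[i] for i in range(3)) + m * beta
    h = sum(p[i] * S[i] for i in range(3)) / np.linalg.norm(p)
    return H, h


p = [0.3, -1.2, 0.7]  # arbitrary test momentum (assumption)
m = 1.0               # arbitrary test mass (assumption)
H, h = hamiltonian_and_helicity(p, m)

E_p = np.sqrt(np.dot(p, p) + m**2)
energies = np.linalg.eigvalsh(H)    # ascending order
helicities = np.linalg.eigvalsh(h)  # ascending order
commute = np.allclose(H @ h, h @ H)

print("[H_D, h] = 0:", commute)
print("energies / E_p:", np.round(energies / E_p, 12))
print("helicities:", np.round(helicities, 12))
```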

$\begin{tikzpicture}[scale=2] \draw [->,x=1mm,y=4mm,semithick,>=latex] (-2,0) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (-0.05,0.03) -- + (.1,0); \draw [line width=1.5mm, white] (-0.05,0.11) -- + (.1,0); \draw [->,semithick,>=latex] (-0.125,0.03) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$}; \draw [->,very thick,>=latex] (-0.125,0.11) -- + (.5,0) node[above] {$\mkern-50mu\vec S$}; % \draw [<-,x=1mm,y=4mm,semithick,>=latex] (26,0) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (2.75,0.03) -- + (.1,0); \draw [->,semithick,>=latex] (2.67,0.03) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$}; \draw [->,very thick,>=latex] (2.67,0.11) -- + (-.5,0) node[above] {$\mkern+50mu\vec S$}; % \node at (-0.9,0.4){\begin{tabular}{r l}a) &$E>0$\\&$h=+\frac 1 2$\end{tabular}}; \node at (1.5,0.4){\begin{tabular}{r l}b) &$E>0$\\&$h=-\frac 1 2$\end{tabular}}; % \draw [->,x=1mm,y=4mm,semithick,>=latex] (-2,-3) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (-0.05,-1.15) -- + (.1,0); \draw [line width=1.5mm, white] (-0.05,-1.07) -- + (.1,0); \draw [->,semithick,>=latex] (-0.125,-1.15) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$}; \draw [->,very thick,>=latex] (-0.125,-1.07) -- + (.5,0) node[above] {$\mkern-50mu\vec S$}; % \draw [<-,x=1mm,y=4mm,semithick,>=latex] (26,-3) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (2.75,-1.15) -- + (.1,0); \draw [->,semithick,>=latex] (2.67,-1.15) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$}; \draw [->,very thick,>=latex] (2.67,-1.07) -- + (-.5,0) node[above] {$\mkern+50mu\vec S$}; % \node at (-0.9,-.7){\begin{tabular}{r l}c) &$E<0$\\&$h=+\frac 1 2$\end{tabular}}; \node at (1.5,-.7){\begin{tabular}{r l}d) &$E<0$\\&$h=-\frac 1 2$\end{tabular}}; \end{tikzpicture}$

Fig: Eigenstates of the single-particle Dirac equation. The rotating arrows represent helicity/spin according to a right-hand rule.

Nonrelativistic limit

Note: In this section, we temporarily restore $c$ to the equations.

The non-relativistic limit is obtained when $|\mathbf p|^2 \ll m^2c^2$. At that limit, we can expand the dispersion as \[ E_{\mathbf p} = c\sqrt{|\mathbf p|^2 + m^2c^2} = mc^2\sqrt{1 + \frac{|\mathbf p|^2}{m^2c^2}} \approx mc^2 + \frac{|\mathbf p|^2}{2m} \]

Let us now consider the block matrix form of the Dirac equation \[ \begin{pmatrix} E-mc^2 & -c\bm \sigma\cdot\hat{\mathbf p} \\ -c\bm\sigma\cdot\hat{\mathbf p} & E + mc^2 \end{pmatrix} \begin{pmatrix} \psi_A(\mathbf x) \\ \psi_B(\mathbf x) \end{pmatrix} =0. \] The lower component can be solved in terms of the upper component as \[ \psi_B = \frac{c \bm\sigma\cdot\hat{\mathbf p}}{E+mc^2}\psi_A \approx \frac{c \bm\sigma\cdot\hat{\mathbf p}}{2mc^2}\psi_A. \] The upper component obeys the equation \[ \begin{aligned} 0&=(E-mc^2)\psi_A -c \bm\sigma\cdot\hat{\mathbf p}\psi_B\\ &\approx (E-mc^2)\psi_A -\frac{c^2 (\bm\sigma\cdot\hat{\mathbf p})^2}{2mc^2}\psi_A. \end{aligned} \] By defining the non-relativistic energy $E_{\rm nr} = E-mc^2$ and noticing that $(\bm\sigma\cdot\hat{\mathbf p})^2 = \sum_{i,j=1}^3 \hat p_i\hat p_j \sigma^i \sigma^j = \hat{\mathbf p}^2$ (the cross terms cancel because $\sigma^i\sigma^j+\sigma^j\sigma^i=2\delta^{ij}$), we obtain the Schrödinger equation for a spin-½ particle: \[ \frac{\hat{\mathbf p}^2}{2m} \psi_A(\mathbf x) = E_{\rm nr} \psi_A(\mathbf x). \] The lower component vanishes in the non-relativistic limit: \[ \psi_B = \frac{\bm\sigma\cdot\hat{\mathbf p}}{2mc}\psi_A = \mathcal O (v/c). \] We conclude that in the non-relativistic limit and in the Dirac-Pauli representation, the upper components of the Dirac spinor describe spin-½ particles such as electrons.
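To get a feeling for how good the approximation is, the following sketch compares the exact kinetic energy $E_{\mathbf p}-mc^2$ with $|\mathbf p|^2/2m$, and the exact plane-wave ratio $c|\mathbf p|/(E_{\mathbf p}+mc^2)$ of lower to upper component with its limit $|\mathbf p|/2mc$. The function names and the choice $m=c=1$ are illustrative assumptions.

```python
import math


def exact_kinetic_energy(p, m, c):
    """E_p - m c^2 with E_p = c sqrt(p^2 + m^2 c^2)."""
    return c * math.sqrt(p**2 + (m * c) ** 2) - m * c**2


def nonrelativistic_kinetic_energy(p, m):
    """Leading term of the expansion, p^2 / 2m."""
    return p**2 / (2 * m)


def lower_to_upper_ratio(p, m, c):
    """Exact plane-wave ratio |psi_B| / |psi_A| = c p / (E_p + m c^2)."""
    E = c * math.sqrt(p**2 + (m * c) ** 2)
    return c * p / (E + m * c**2)


m, c = 1.0, 1.0  # illustrative units (assumption)
for p_over_mc in (0.01, 0.1, 0.5):
    p = p_over_mc * m * c
    exact = exact_kinetic_energy(p, m, c)
    approx = nonrelativistic_kinetic_energy(p, m)
    rel_err = (approx - exact) / exact
    ratio = lower_to_upper_ratio(p, m, c)
    print(f"p/mc = {p_over_mc:4.2f}  relative error of p^2/2m = {rel_err:.2e}  "
          f"|psi_B|/|psi_A| = {ratio:.5f}  (p/2mc = {p / (2 * m * c):.5f})")
```

The relative error of the non-relativistic energy grows roughly as $|\mathbf p|^2/4m^2c^2$, which is the size of the first relativistic correction.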

Existence of antiparticles and the many-body viewpoint

The Dirac equation has paradoxical properties when viewed as a single particle equation. A many-particle viewpoint (second quantization) is needed to resolve these issues.

In the many-body part of the course we saw how, given a single-particle Hamiltonian, to move from a single-particle Hilbert space to a many-body Fock space. Here we do exactly the same. We find that the negative-energy states can then be handled in the same way as the filled states of a Fermi gas, in terms of particle and hole excitations.

The Fock space Dirac Hamiltonian in position basis is \[ H_D = \int{\rm d}^3\mathbf x\;\hat\psi^\dagger(\mathbf x) \left( -i\bm\alpha\cdot\nabla +\beta m\right) \hat\psi(\mathbf x), \] where $\hat\psi(\mathbf x)$ is a 4-component column vector \[ \hat \psi(\mathbf x) = (\hat\psi_1(\mathbf x), \hat\psi_2(\mathbf x),\hat\psi_3(\mathbf x),\hat\psi_4(\mathbf x))^\intercal, \] and the field operators $\hat \psi^\dagger_s(\mathbf x)$ create a particle at $\mathbf x$ with a spinorial index $s$, associated with particle/antiparticle and spin degrees of freedom. For example, $\hat\psi_1^\dagger(\mathbf y)$ creates a particle with a wavefunction $\psi(\mathbf x) = (1,0,0,0)^\intercal\sqrt{\delta(\mathbf x-\mathbf y)}$, where the square root should be understood via some limit representation of the delta function.

The operators $\hat \psi_s(\mathbf x)$ obey the anticommutation relations \[ \begin{aligned} \{\hat\psi_r(\mathbf x),\hat\psi_s^\dagger(\mathbf y)\} &=\delta(\mathbf x-\mathbf y)\delta_{rs},\\ \{\hat\psi_r(\mathbf x),\hat\psi_s(\mathbf y)\} &= \{\hat\psi_r^\dagger(\mathbf x),\hat\psi_s^\dagger(\mathbf y)\} = 0. \end{aligned} \]

For reference, see Nazarov and Danon

— 09 Apr 20

We have in effect already diagonalized the many-particle Dirac Hamiltonian, as we have found the full set of single-particle eigensolutions. For a given momentum $\mathbf p$, we have two positive energy solutions and two negative energy solutions, with helicities $h=\pm 1/2$ in each case. The next subsection contains some details of the diagonalization.

Diagonalization of the many-body Hamiltonian

By making a Fourier transform, we can re-express the Hamiltonian as \[ \begin{aligned} H_D %&= \psi^\dagger(\mathbf x) (-i\bm\alpha\cdot\nabla+\beta m)\psi(\mathbf x)\\ &= \sum_{\mathbf p} \hat d_{\mathbf p}^\dagger ( \mathbf p\cdot\bm\alpha+\beta m ) \hat d_{\mathbf p}^{\phantom{\dagger}}, \end{aligned} \] where $d$'s are the Fourier transforms of the field operator spinors \[ d_{\mathbf p} = \frac{1}{\sqrt{V}}\int{\rm d}^3\mathbf x\;\hat\psi(\mathbf x) e^{i\mathbf p\cdot\mathbf x}. \] These are not yet the annihilation operators for the eigenstates, as we still have the $4\times 4$ matrix to diagonalize. In terms of the eigenspinors $u^s(\mathbf p)$, $s\in\{1,2,3,4\}$, we have \[ \begin{aligned} \mathbf p\cdot\bm\alpha+\beta m %&= \frac\lambda 2 \sum_{s=1}^4 \left[ u^s(\mathbf p)u^s(\mathbf p)^\dagger-v^s(-\mathbf p)v^s(-\mathbf p)^\dagger\right]\\ &= \sum_{s=1}^4 E_{\mathbf p s} \frac{\lambda}{2E_{\mathbf p}}u^s(\mathbf p) u^s(\mathbf p)^\dagger, \end{aligned} \] where $E_{\mathbf ps} = E_{\mathbf p}$ for $s\in\{1,2\}$ and $E_{\mathbf ps} = -E_{\mathbf p}$ for $s\in\{3,4\}$. Here we choose to label the states by index $s$ instead of helicity $h$, because of a more straightforward notation. We see that the Hamiltonian is diagonalized by the operators \[ \begin{aligned} \hat a_{\mathbf ps} &\equiv \left(\frac{\lambda}{2E_{\mathbf p}}\right)^{1/2} u^s(\mathbf p)^\dagger d_{\mathbf p} = \left(\frac{\lambda}{2E_{\mathbf p}}\right)^{1/2}\frac{1}{\sqrt{V}}\int{\rm d}^3\mathbf x\;u^s(\mathbf p)^\dagger\hat\psi(\mathbf x) e^{i\mathbf p\cdot\mathbf x}, \end{aligned} \] with $s\in\{1,2,3,4\}$. For convenience, we also define the operators $c_{\mathbf p 1}= a_{\mathbf p 3}$ and $c_{\mathbf p 2}= a_{\mathbf p 4}$.

In terms of the eigenstates and their energies $\pm E_{\mathbf p}$, the Fock space Hamiltonian is \[ H_D = \sum_{\mathbf p}\sum_{s=1,2} E_{\mathbf p}^{\phantom{\dagger}} \left(a^\dagger_{\mathbf p s} a_{\mathbf p s}^{\phantom{\dagger}}-c^\dagger_{\mathbf p s} c_{\mathbf p s}^{\phantom{\dagger}}\right). \] The annihilation and creation operators $a_\mu, a^\dagger_\mu$ (for positive energy solutions) and $c_\mu,c^\dagger_\mu$ (for negative energy solutions) obey the anticommutation relations \[ \begin{aligned} \{ a_{\mathbf p s}^{\phantom{\dagger}}, a^\dagger_{\mathbf p's'} \} &= \{ c_{\mathbf p s}^{\phantom{\dagger}}, c^\dagger_{\mathbf p's'} \} = \delta_{\mathbf p\mathbf p'}\delta_{ss'},\\ \{ a_{\mathbf p s}^{\phantom{\dagger}}, a_{\mathbf p' s'}^{\phantom{\dagger}} \} &= \{ c_{\mathbf p s}^{\phantom{\dagger}}, c_{\mathbf p' s'}^{\phantom{\dagger}} \} = 0,\\ \{ a_{\mathbf p s}^{\phantom{\dagger}}, c_{\mathbf p' s'}^{\phantom{\dagger}} \} &= \{ a_{\mathbf p s}^{\phantom{\dagger}}, c_{\mathbf p' s'}^{\dagger} \}= 0. \end{aligned} \] The correspondence between the spinor solutions and the creation operators is \[ \begin{aligned} a_{\mathbf p 1}^\dagger &\leftrightarrow u_1(\mathbf p),\quad &a_{\mathbf p 2}^\dagger \leftrightarrow u_2(\mathbf p),\\ c_{\mathbf p 1}^\dagger &\leftrightarrow u_3(\mathbf p),\quad &c_{\mathbf p 2}^\dagger \leftrightarrow u_4(\mathbf p). \end{aligned} \]

What is the ground state of this Hamiltonian? We can lower the energy by creating "negative-energy particles", so the ground state must be the one in which all the negative energy states are filled \[ |\nu\rangle = \prod_{\mathbf p,s} c^\dagger_{\mathbf ps}|0\rangle, \] where $|0\rangle$ is the state annihilated by all the annihilation operators: $a_{\mu} |0\rangle=c_{\mu} |0\rangle=0$. This state is known as the Dirac sea. It has the unpleasant property that an annihilation operator applied to it, $c_\mu|\nu\rangle$, no longer gives zero. In fact, applying $c_{\mathbf p s}$ to the Dirac sea creates a hole, so that the energy of the system increases by $E_\mathbf p$, the charge decreases by $q$, the momentum decreases by $\mathbf p$ and the angular momentum decreases by $\frac 1 2 \sigma$ (in some direction). (Exercise: derive the total momentum and total angular momentum operators according to the prescription given in the many-body part of the course and verify these claims!)

$\begin{tikzpicture} \begin{axis}[ axis x line=middle, axis y line=middle, xlabel = $p_x/mc$, ylabel = {$E/mc^2$}, ] \addplot [ domain=-4:4, samples=100, color=blue, style={ultra thick} ] {-sqrt(x^2+1)}; \addplot [ domain=-4:4, samples=100, color=blue, ] {sqrt(x^2+1)}; \node[circle,draw=blue, fill=white, inner sep=0pt,minimum size=5pt] (a) at (axis cs:2,-2.2360679774) {}; \node[circle,draw=blue, fill=blue, inner sep=0pt,minimum size=4pt] (b) at (axis cs:2,2.2360679774) {}; \path[blue,dashed,semithick,->,shorten >=2pt,shorten <=3pt,>=latex](a)edge [bend right=15] node[below left]{}(b); %\draw[->,semithick, LL, segment length=3mm,>=latex] (axis cs: 3.9,0) -- (axis cs: 2.3,-1.8); \end{axis} \end{tikzpicture}$

Fig: Charge conserving excitation of the Dirac sea. The thin/thick line denotes the empty/filled states. When a particle excitation is created, a negative energy hole is left behind. The minimum energy cost of this process is $2mc^2$.

We can interpret this as a creation of an antiparticle with energy $E_{\mathbf p}$, momentum $-\mathbf p$, spin $-\frac 1 2 \sigma$ and charge $-q$. Let us define a new set of operators,

\[ \begin{aligned} b_{\mathbf p 1}^{\phantom{\dagger}}&=c_{-\mathbf p2}^\dagger,\\ b_{\mathbf p 2}^{\phantom{\dagger}}&=c_{-\mathbf p1}^\dagger, \end{aligned} \] where both the spin and the momentum of the negative energy state are inverted. Since both factors change sign, the helicity is unchanged. The new operators obey the same anticommutation relations as the $c$-operators: \[ \begin{aligned} \{ b_{\mathbf p s}^{\phantom{\dagger}}, b^\dagger_{\mathbf p' s'} \} &= \delta_{\mathbf p\mathbf p'}\delta_{ss'},\\ \{ a_{\mathbf p s}^{\phantom{\dagger}}, b_{\mathbf p's'}^{\phantom{\dagger}} \} &= \{a_{\mathbf p s }^\dagger, b_{\mathbf p's'}^{\phantom{\dagger}} \} = 0. \end{aligned} \] The inverse transformation from momentum space operators back to the field operators is \[ \hat\psi(\mathbf x) = \frac 1 {\sqrt V}\sum_{\mathbf p} \left(\frac{\lambda}{2E_{\mathbf p}}\right)^{\mkern-7mu 1/2}\sum_{s=1,2} \left[ u^s(\mathbf p) e^{-i\mathbf p\cdot\mathbf x}a^{\phantom{\dagger}}_{\mathbf ps} + v^s(\mathbf p) e^{i\mathbf p\cdot\mathbf x} b^\dagger_{\mathbf ps}\right]. \] This is an example of a Bogoliubov transformation, which we first encountered in this course when diagonalizing the superconducting Bogoliubov-de Gennes Hamiltonian.

We rearrange the hole part using the anticommutation relations: \[ -E_{\mathbf p}^{\phantom{\dagger}} c^\dagger_{\mathbf p s} c_{\mathbf p s}^{\phantom{\dagger}} = -E_{-\mathbf p}^{\phantom{\dagger}} b^{\phantom{\dagger}}_{-\mathbf p s} b_{-\mathbf p s}^{{\dagger}} = E_{-\mathbf p}^{\phantom{\dagger}} (b_{-\mathbf p s}^\dagger b^{\phantom{\dagger}}_{-\mathbf p s}-1), \] where the relabelling of the spin index is left implicit, since $s$ is summed over. Making a change of variables $\mathbf p\rightarrow -\mathbf p$ in the sum, we get the Hamiltonian \[ H_D = \sum_\mathbf p \sum_{s=1,2}E_{\mathbf p}^{\phantom{\dagger}} \left(a^\dagger_{\mathbf p s} a_{\mathbf p s}^{\phantom{\dagger}}+b^\dagger_{\mathbf p s} b_{\mathbf p s}^{\phantom{\dagger}} \right)+ E_0, \] with only positive energy excitations. $E_0 = - \sum_{\mathbf ps}E_{\mathbf p}$ is an (infinite) constant energy shift, which does not affect the dynamics. With only positive energy excitations, the ground state $|0\rangle$ can be defined in the usual way as the state which is annihilated by all the annihilation operators: \[ a_{\mathbf ps}|0\rangle = b_{\mathbf ps}|0\rangle = 0. \]
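The operator identity behind this rearrangement can be illustrated on a single fermionic mode, where the creation and annihilation operators are $2\times2$ matrices. This is only a toy sketch: momentum and spin labels are dropped, the energy value is arbitrary, and the variable names are illustrative assumptions.

```python
import numpy as np

# One fermionic mode in the basis (|empty>, |filled>)
c = np.array([[0.0, 1.0],
              [0.0, 0.0]])  # annihilation operator of the negative-energy state
c_dag = c.T
identity = np.eye(2)

# Particle-hole transformation: annihilating a hole = creating the particle
b = c_dag
b_dag = b.T

E = 1.7  # arbitrary positive energy E_p (assumption)

anticommutator_b = b @ b_dag + b_dag @ b      # should equal the identity
hole_energy = -E * c_dag @ c                  # original negative-energy term
antiparticle_energy = E * (b_dag @ b - identity)

print("{b, b^dagger} = 1:", np.allclose(anticommutator_b, identity))
print("-E c^dagger c = E (b^dagger b - 1):",
      np.allclose(hole_energy, antiparticle_energy))

# The filled negative-energy state is the vacuum of the b operators
filled = np.array([0.0, 1.0])
print("b |filled> = 0:", np.allclose(b @ filled, 0))
```

In this toy model the filled state plays the role of the Dirac sea: it has the lowest energy, $-E$, and it is annihilated by $b$.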

The fermionic nature of the particles appears when we choose the anticommutation relations instead of commutation relations. The reason for the choice is simply that with bosonic operators we cannot invert the sign of antiparticle energies and consequently we are not able to define a proper ground state, i.e. with bosons, there is no Pauli repulsion and the Dirac sea cannot be filled.

Dirac published his equation in 1928 and proposed the existence of the antielectron in 1931. The positron, the antiparticle of the electron, was discovered in 1932 by C. D. Anderson in cosmic radiation. As all the additive quantum numbers, such as charge, are inverted for the antiparticle, no conservation law (e.g. charge conservation) forbids the process in which a particle and an antiparticle annihilate each other, or the inverse one in which a particle-antiparticle pair pops into existence, as long as enough energy is provided.

$\begin{tikzpicture}[scale=2] \draw [->,x=1mm,y=4mm,semithick,>=latex] (-2,0) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (-0.05,0.03) -- + (.1,0); \draw [line width=1.5mm, white] (-0.05,0.11) -- + (.1,0); \draw [->,semithick,>=latex] (-0.125,0.03) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$} node[right] {$e^-$}; \draw [->,very thick,>=latex] (-0.125,0.11) -- + (.5,0) node[above] {$\mkern-50mu\vec S$}; % \draw [<-,x=1mm,y=4mm,semithick,>=latex] (26,0) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (2.75,0.03) -- + (.1,0); \draw [->,semithick,>=latex] (2.67,0.03) -- + (1.0,0) node[below] {$\mkern-50mu\vec p$} node[right] {$e^-$}; \draw [->,very thick,>=latex] (2.67,0.11) -- + (-.5,0) node[above] {$\mkern+50mu\vec S$}; % \node at (-0.9,0.4){a) $h=+\frac 1 2$}; \node at (1.5,0.4){b) $h=-\frac 1 2$}; % \draw [<-,x=1mm,y=4mm,semithick,>=latex] (4,-3) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [->,semithick,>=latex] (0.5,-1.15) -- + (-1.0,0) node[below] {$\mkern+50mu\vec p$} node[left] {$e^+$}; \draw [->,very thick,>=latex] (0.5,-1.07) -- + (-.5,0) node[above] {$\mkern+50mu\vec S$}; % \draw [->,x=1mm,y=4mm,semithick,>=latex] (30,-3) arc (-165:165:1 and 1);% node[midway, below right] {$\sigma$}; \draw [line width=1.5mm, white] (3.15,-1.07) -- + (.1,0); \draw [->,semithick,>=latex] (3.07,-1.15) -- + (-1.0,0) node[below] {$\mkern+50mu\vec p$} node[left] {$e^+$}; \draw [->,very thick,>=latex] (3.07,-1.07) -- + (.5,0) node[above] {$\mkern+50mu\vec S$}; % \node at (-0.9,-.7){c) $h=+\frac 1 2$}; \node at (1.5,-.7){d) $h=-\frac 1 2$}; \end{tikzpicture}$

Fig: Particle/antiparticle excitations of the many-body Dirac Hamiltonian. The rotating arrows represent helicity/spin according to a right-hand rule.

Apart from the negative energy solutions, there is also a more subtle reason to reject the Dirac equation as a single-particle equation. In relativity, we can associate mass with energy, and consequently mass can be created. Furthermore, the Heisenberg uncertainty relation $\Delta E \Delta t \geq \frac 1 2$ is not violated by the creation of a particle-antiparticle pair out of the vacuum, if that pair exists for a short enough time. The relativistic vacuum is then a highly dynamic entity, and can be pictured as a boiling sea of particle-antiparticle pairs appearing and disappearing all the time and interacting with each other. For this reason a relativistic theory cannot exist within a Hilbert space $\mathcal H_N$ with a fixed particle number $N$.

A consequence of this is that the expectation value $\langle 0 | \hat\psi(\mathbf x_1,t)\hat\psi^\dagger(\mathbf x_2,t) |0\rangle$ for a space-like separation is non-zero (proportional to $e^{-m|\mathbf x_1-\mathbf x_2|}$). In single-particle picture it would seem as if the particle can travel faster than light. In many-body picture the effect can be interpreted as an entanglement of the vacuum. Relativity is not violated as the above correlation is of the Einstein-Podolsky-Rosen type and cannot transmit any information.

Weyl fermions and chirality

Let us define the "fifth $\gamma$-matrix" as \[ \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3, \] where the naming convention dates back to a time when $\gamma^0$ was called $\gamma^4$. The matrix $\gamma^5$ can be used to define the chirality or handedness of spinors.

Let us use $\gamma^5$ to construct two complementary subspaces of spinors. The eigenvalues of $\gamma^5$ are $\pm1$, so the projection operators to these subspaces are \[ L = \frac{1-\gamma^5}{2},\quad R = \frac{1+\gamma^5}{2}. \] As projection operators, they satisfy the relations \[L+R = 1,\quad L^2=L \quad \text{and}\quad R^2=R. \] Some nomenclature: The 4-component spinors $\psi$ we have been dealing with so far are called Dirac spinors. The 2-component spinors we obtain by using the projection operators on Dirac spinors, $\psi_L = L \psi$ and $\psi_R = R \psi$, are left-handed and right-handed Weyl spinors, respectively.

Transformations of $\psi_L$ and $\psi_R$ are most easily discussed in the Weyl basis, in which the $\gamma$-matrices are \[ \gamma^0= \begin{pmatrix} 0& I\\ I&0 \end{pmatrix}, \quad\gamma^i= \begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix},\quad \gamma^5 = \begin{pmatrix} -I & 0\\ 0 & I \end{pmatrix} \] By defining the "four-vectors" $\sigma^\mu = (I,\bm\sigma)$ and $\bar\sigma^\mu=(I,-\bm\sigma)$, the $\gamma^\mu$'s can be described with one expression \[ \gamma^\mu = \begin{pmatrix} 0& \sigma^\mu\\ \bar\sigma^\mu & 0 \end{pmatrix}. \]

In Weyl basis, the chiral projection operators are \[ L = \begin{pmatrix} I & 0\\ 0 & 0 \end{pmatrix}, \quad R = \begin{pmatrix} 0 & 0\\ 0 & I \end{pmatrix}, \] thus, in this basis, the upper and lower components of a Dirac spinor are the left-handed and right-handed projection of $\psi$, respectively: \[ \psi = \begin{pmatrix} \psi_L\\ \psi_R \end{pmatrix} \]
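The Weyl-basis expressions above are easy to verify by direct matrix multiplication. The sketch below builds the $\gamma$-matrices as given, checks the Clifford algebra $\{\gamma^\mu,\gamma^\nu\}=2\eta^{\mu\nu}$ with the metric signature $(+,-,-,-)$, and confirms the block forms of $\gamma^5$, $L$ and $R$. Variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Weyl basis: gamma^0 = [[0, I], [I, 0]], gamma^i = [[0, s^i], [-s^i, 0]]
gamma = [np.block([[Z2, I2], [I2, Z2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

I4 = np.eye(4, dtype=complex)
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu}
clifford_ok = all(
    np.allclose(gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu], 2 * eta[mu, nu] * I4)
    for mu in range(4)
    for nu in range(4)
)

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
L = (I4 - gamma5) / 2
R = (I4 + gamma5) / 2

print("Clifford algebra holds:", clifford_ok)
print("gamma^5 diagonal:", np.real(np.diag(gamma5)))
print("L diagonal:", np.real(np.diag(L)), " R diagonal:", np.real(np.diag(R)))
```

Besides $L^2=L$ and $R^2=R$, the same matrices also satisfy $LR=RL=0$, so the two subspaces are orthogonal.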

The fundamental difference which separates the left-handed and right-handed Weyl fermions from each other is that they belong to different representations of the Lorentz group. In other words, under the proper Lorentz transformations, they transform among themselves, but in different ways. In Weyl basis, they transform under rotation of angle $\theta$ around the axis $\mathbf n$ the same way, \[ \psi_{L} \rightarrow S_{L}(\Lambda)\psi_{L}=\exp(i\theta\mathbf n\cdot\bm\sigma/2)\psi_{L},\\ \psi_{R} \rightarrow S_{R}(\Lambda)\psi_{R}=\exp(i\theta\mathbf n\cdot\bm\sigma/2)\psi_{R},\\ \] but under boost of rapidity $\xi$ to direction $\mathbf n$, there is a sign difference: \[ \psi_L \rightarrow S_L(\Lambda)\psi_L=\exp(+\xi\mathbf n\cdot\bm\sigma/2)\psi_L,\\ \psi_R \rightarrow S_R(\Lambda)\psi_R=\exp(-\xi\mathbf n\cdot\bm\sigma/2)\psi_R. \] In Weyl basis, the Dirac spinor transforms block-diagonally as \[ \psi \rightarrow S(\Lambda)\psi = \begin{pmatrix} S_L(\Lambda) & 0\\ 0 & S_R(\Lambda) \end{pmatrix} \begin{pmatrix} \psi_L\\ \psi_R \end{pmatrix}. \]

Let us now consider a parity transformation \[ x^\mu\rightarrow x'^\mu= \Pi^\mu{}_{\nu}x^\nu=(t,-\mathbf x)^\intercal, \] which inverts the sign of all the spatial coordinates. As $\det \Pi=-1$, this transformation is not part of the proper Lorentz group. The corresponding spinor transformation matrix commutes with $\gamma^0$, but anticommutes with $\gamma^i$'s. Thus, it can be chosen as \[ S(\Pi) = \gamma^0 = \begin{pmatrix} 0 & I \\ I & 0\end{pmatrix}, \] and the wave function transforms as \[ \psi(\mathbf x,t) \rightarrow \psi'(\mathbf x,t)=\gamma^0 \psi(-\mathbf x,t). \] Above, the explicit matrix form of $\gamma^0$ is given in the Weyl basis. In this basis, we easily see that the parity transformation mixes the left-handed and right-handed subspaces. If parity is a symmetry of the system, both left-handed and right-handed fermions must exist, which is of course suggested by the left/right nomenclature.
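The statement that parity exchanges the two chiral subspaces can be made concrete with a few matrix products. This is a small sketch in the Weyl basis; the test spinor is an arbitrary illustrative choice.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

gamma0 = np.block([[Z2, I2], [I2, Z2]])                 # S(Pi) in the Weyl basis
gamma_spatial = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
L = np.block([[I2, Z2], [Z2, Z2]])                      # left-handed projector
R = np.block([[Z2, Z2], [Z2, I2]])                      # right-handed projector

# gamma^0 anticommutes with every spatial gamma matrix
anticommutes = all(np.allclose(gamma0 @ g + g @ gamma0, 0) for g in gamma_spatial)

# Conjugating with S(Pi) = gamma^0 swaps the projectors
swaps_projectors = np.allclose(gamma0 @ L @ gamma0, R)

# A purely left-handed spinor becomes purely right-handed
psi_left = np.array([1.0, 2.0j, 0.0, 0.0])  # arbitrary test spinor (assumption)
psi_parity = gamma0 @ psi_left

print("gamma^0 anticommutes with gamma^i:", anticommutes)
print("gamma^0 L gamma^0 = R:", swaps_projectors)
print("parity image of a left-handed spinor:", psi_parity)
```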

The Dirac equation can be written in terms of $\psi_L$ and $\psi_R$ as \[ \begin{aligned} i\sigma^\mu\partial_\mu\psi_R - m \psi_L&=0,\\ i\bar\sigma^\mu\partial_\mu\psi_L - m \psi_R&=0. \end{aligned} \] The mass $m$ now obtains a new interpretation: it is the coupling constant between the two fields. The energy gap between the particles and antiparticles is then another example of an avoided crossing. If $m=0$ the fields do not couple and we obtain the Weyl equations \[ \begin{aligned} \sigma^\mu\partial_\mu\psi_{R}&=0,\\ \bar\sigma^\mu\partial_\mu\psi_{L}&=0. \end{aligned} \] It used to be thought that neutrinos, which are spin-½ fermions, could be described by the Weyl equation. But neutrino oscillations show that neutrinos do have mass, so this leaves the Weyl equations without a realization in particle physics. In condensed matter, they do find a realization in the effective theory of Weyl semimetals. Even so, in the standard model, the Weyl fermions are, in some sense, more fundamental than Dirac fermions. In the electroweak part of the theory, there are interactions which only couple to the left-handed particles. The corresponding interactions for the right-handed particles are absent from the theory. The parity symmetry is thus broken, and the standard model is naturally expressed in terms of Weyl fermions.

Above, we defined helicity as the projection of spin along the momentum direction, which is one way of defining handedness. Does helicity then have some connection to chirality, which we also associate with some kind of handedness? The answer is that for massive particles, the two concepts are not the same: for example, chirality is Lorentz invariant, whereas helicity depends on the frame. Massive particles have velocity $|\mathbf v|<c$, so for them helicity can be inverted by a Lorentz boost which inverts the direction of momentum.

For massless particles, on the other hand, chirality and helicity coincide. The Weyl fermions are massless, so their states can also be described in terms of helicity. The left-handed Weyl fermions (particles) have chirality $-1$ and helicity $-1/2$, which means that their spin is always antiparallel to momentum. Correspondingly, the right-handed Weyl fermions have chirality $+1$ and helicity $+1/2$, so their spin is parallel to momentum. The antiparticles of left-handed and right-handed particles have their spin parallel and antiparallel to momentum, respectively, i.e. just the opposite of the particles.
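The relation between chirality and spin direction follows directly from the Weyl equations. Writing them as $i\partial_t\psi_{L} = -\bm\sigma\cdot\hat{\mathbf p}\,\psi_{L}$ and $i\partial_t\psi_{R} = +\bm\sigma\cdot\hat{\mathbf p}\,\psi_{R}$ (with $\hat{\mathbf p}=-i\nabla$ and $c=1$), the sketch below diagonalizes both $2\times2$ Hamiltonians for one arbitrary momentum and evaluates the helicity of the positive-energy (particle) state. The momentum value and function names are illustrative assumptions.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), sigma)


def positive_energy_helicity(H, p):
    """Helicity <sigma.p>/(2|p|) of the positive-energy eigenstate of H."""
    energies, states = np.linalg.eigh(H)
    state = states[:, np.argmax(energies)]
    helicity_op = sigma_dot(p) / (2 * np.linalg.norm(p))
    return float(np.real(np.vdot(state, helicity_op @ state)))


p = np.array([0.3, -1.2, 0.7])  # arbitrary test momentum (assumption)
H_left = -sigma_dot(p)          # from  bar-sigma^mu d_mu psi_L = 0
H_right = sigma_dot(p)          # from  sigma^mu d_mu psi_R = 0

helicity_left = positive_energy_helicity(H_left, p)
helicity_right = positive_energy_helicity(H_right, p)
print("left-handed particle helicity: ", round(helicity_left, 12))
print("right-handed particle helicity:", round(helicity_right, 12))
```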

Dirac equation with classical electromagnetic field

Note: Here we again restore $c$ to the equations.

How does the Dirac equation work in the presence of a classical electromagnetic field? In particular, we are interested in the relativistic corrections to the Pauli equation. Let us first derive the Pauli equation from the Dirac equation.

Covariant formulation of Maxwell's equations

Maxwell's equations (with source terms) are \[ \begin{aligned} \nabla\cdot\mathbf E &= \rho/\varepsilon_0,\\ \nabla\cdot\mathbf B &= 0,\\ \nabla\times\mathbf E &= -\frac{\partial \mathbf B}{\partial t},\\ \nabla\times\mathbf B &= \mu_0 \mathbf j + \frac{1}{c^2} \frac{\partial \mathbf E}{\partial t}. \end{aligned} \] The second and third equations are equivalent to the statement that $\mathbf E$ and $\mathbf B$ can be expressed in terms of a scalar potential $\varphi$ and a vector potential $\mathbf A$ as \[ \begin{aligned} \mathbf B &= \nabla\times\mathbf A,\\ \mathbf E &= -\nabla \varphi - \frac{\partial \mathbf A}{\partial t}. \end{aligned} \]

In relativistic formulations of the electromagnetic field, the scalar and vector potentials are collected to a four-potential \[ A^\mu = \begin{pmatrix} \varphi/c \\ \mathbf A\end{pmatrix}. \] The field strength tensor for the EM field is \[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \] We note that $F^{\mu\nu}=-F^{\nu\mu}$. The components are \[ \begin{aligned} F^{\mu\mu} &= -F^{\mu\mu} = 0,\\ F^{0i} &= \partial^0 A^i-\partial^iA^0 = \partial_0 A^i+\partial_iA^0\\ &= \frac 1 c \frac{\partial A^i}{\partial t} + \frac 1 c \frac{\partial \varphi}{\partial x^i} = - \frac 1 c E_i,\\ F^{12} &= \partial^1 A^2 - \partial^2 A^1 = -\partial_1 A^2 + \partial_2 A^1 = -B_z\\ F^{23} &= -B_x,\quad F^{31} = -B_y, \end{aligned} \] where we used Maxwell's second and third equations. \[ F^{\mu\nu} = \begin{pmatrix} \phantom{+}0 & -\frac 1 c E_x & -\frac 1 c E_y & -\frac 1 c E_z\\ \frac 1 c E_x & \phantom{+}0 & -B_z & \phantom{+}B_y\\ \frac 1 c E_y & \phantom{+}B_z & \phantom{+}0 & -B_x\\ \frac 1 c E_z & -B_y & \phantom{+}B_x & \phantom{+}0 \end{pmatrix} \] With the electric current four-vector \[ j^\mu = \begin{pmatrix} c \rho \\ \mathbf j \end{pmatrix} \] Maxwell's first and fourth equations are expressed simply as \[ \partial_{\mu} F^{\mu\nu} = \mu_0 j^\nu. \] The proof is left as an exercise. In the derivation, the relation between the speed of light, vacuum permeability and the electric constant is needed: $c=1/\sqrt{\varepsilon_0 \mu_0}$.
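The component table can be turned into a small helper that assembles $F^{\mu\nu}$ from given field vectors. The sketch below checks the antisymmetry and evaluates the Lorentz invariant $F_{\mu\nu}F^{\mu\nu}=2(\mathbf B^2-\mathbf E^2/c^2)$, which is a standard result not derived in these notes. The metric signature $(+,-,-,-)$ matches the sign conventions used above; the field values and the value of $c$ are arbitrary illustrative numbers.

```python
import numpy as np


def field_strength(E, B, c):
    """Contravariant field strength tensor F^{mu nu} from the fields E and B."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0,    -Ex / c, -Ey / c, -Ez / c],
        [Ex / c,  0.0,    -Bz,      By],
        [Ey / c,  Bz,      0.0,    -Bx],
        [Ez / c, -By,      Bx,      0.0],
    ])


eta = np.diag([1.0, -1.0, -1.0, -1.0])  # metric, signature (+,-,-,-)


def invariant_FF(F):
    """F_{mu nu} F^{mu nu}, with indices lowered by the metric."""
    F_lower = eta @ F @ eta
    return float(np.sum(F_lower * F))


c = 2.0                         # arbitrary value to keep factors of c visible
E = np.array([0.5, -1.0, 2.0])  # arbitrary test field (assumption)
B = np.array([0.3, 0.1, -0.7])  # arbitrary test field (assumption)

F = field_strength(E, B, c)
print("antisymmetric:", np.allclose(F, -F.T))
print("F_{mu nu} F^{mu nu} =", invariant_FF(F))
print("2 (B^2 - E^2/c^2)   =", 2 * (B @ B - E @ E / c**2))
```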

Pauli equation

Let us apply the minimal substitution to the Dirac equation, \[ \begin{aligned} \hat{\mathbf p} &\rightarrow \hat{\mathbf p}-q \mathbf A(\hat{\mathbf x},t),\\ \hat H &\rightarrow \hat H + q\varphi(\hat{\mathbf x},t), \end{aligned} \] where $\mathbf A$ is the electromagnetic vector potential, $\varphi$ is the scalar potential and $q$ is the charge of the particle. Note that $\hat{\mathbf x}$ and consequently $\hat \varphi=\varphi(\hat{\mathbf{x}})$ and $\hat{\mathbf A}=\mathbf A(\hat{\mathbf{x}})$ are operators which do not commute with the momentum operator $\hat{\mathbf p}$.

For the electron $q=-e$, where $e>0$ is the elementary charge. The Dirac Hamiltonian transforms from $\hat H_D = c\bm\alpha\cdot\hat{\mathbf p}+\beta mc^2$ to \[ \hat H_D = c\bm\alpha\cdot\left[\hat{\mathbf p}-q \mathbf A(\hat{\mathbf x},t)\right]+\beta mc^2 + q\varphi(\hat{\mathbf x},t). \]

We would like to derive a generalization of the Schrödinger equation with the correct coupling to a classical electromagnetic field. The derivation is similar to the one that we did above in deriving the Schrödinger equation, now only a little more involved due to electromagnetic potentials.

After the minimal substitution, with $q$ the charge of the particle (for the electron $q=-e$), $\mathbf A$ the vector potential and $\varphi$ the scalar potential, the Dirac Hamiltonian is \[ \hat H_D = c\bm\alpha\cdot\left[\hat{\mathbf p}-q\mathbf A(\hat{\mathbf x},t)\right]+\beta mc^2+q\varphi(\hat{\mathbf x},t). \] Notice that both the upper (particle) and lower (antiparticle) blocks couple to the EM field with the same charge $+q$. But from the many-particle Hamiltonian we saw that the lower block actually describes an absence of an antiparticle (double negation), which suggests that the charge of the antiparticle must be $-q$.

Covariant formulation of the minimal substitution

The covariant form of the minimal substitution is $\hat p^\mu \rightarrow \hat p^\mu -q A^\mu$, where $A^\mu = (\varphi/c,\mathbf A)$ is the four-potential. The Dirac equation then reads \[ \left[ \gamma^\mu(i\partial_\mu-qA_\mu)-mc\right]\psi =0. \]

Let us consider a case in which $\varphi$ and $\mathbf A$ are time-independent. The stationary Dirac equation in the position basis, $\hat H_D \psi(\mathbf x) = E\psi(\mathbf x)$, can be written in the Dirac-Pauli basis as two coupled equations for the 2-component spinors: \[ \begin{aligned} c\bm\sigma\cdot\left[\hat{\mathbf p}-q\mathbf A(\mathbf x)\right]\psi_B(\mathbf x) &= \left[E-mc^2-q\varphi(\mathbf x)\right]\psi_A(\mathbf x), \\ c\bm\sigma\cdot\left[\hat{\mathbf p}-q\mathbf A(\mathbf x)\right]\psi_A(\mathbf x) &= \left[E + mc^2-q\varphi(\mathbf x)\right]\psi_B(\mathbf x). \end{aligned} \] Let us introduce the kinetic momentum operator $\hat{\bm\pi} = \hat{\mathbf p}-q \mathbf A(\hat{\mathbf x})$, as opposed to the canonical momentum $\hat{\mathbf p}$. The B-component can be solved as \[ \psi_B(\mathbf x) = \frac c {E+mc^2-q\varphi} (\bm\sigma\cdot\hat{\bm\pi})\psi_A(\mathbf x). \] Again $\psi_B$ is of the order $\mathcal O(v/c)$. Substituting this into the upper equation, we find an equation for $\psi_A$ \[ \frac{1}{2m}(\bm\sigma\cdot\hat{\bm\pi}) K(\mathbf x) (\bm\sigma\cdot\hat{\bm\pi})\psi_A(\mathbf x) = \left(E'-q\varphi\right)\psi_A(\mathbf x), \] where $E'=E-mc^2$ and \[ K(\mathbf x) = \frac{2mc^2}{E'+2mc^2-q\varphi(\mathbf x)} = \left(1+\frac{E'-q\varphi(\mathbf x)}{2mc^2}\right)^{-1}. \]

Note: Assuming time-independence of $\mathbf A$ and $\varphi$ breaks the full gauge invariance.

— 20 May 20

We are interested in the non-relativistic limit $|E'-q\varphi(\mathbf x)|\ll 2mc^2$, in which we can expand $K$ as \[ K(\mathbf x) = 1 - \frac{E'-q\varphi(\mathbf x)}{2mc^2} + \dots \] Let us first consider the lowest order term, for which $K\approx 1$. We then need to evaluate \[ \begin{aligned} (\bm\sigma\cdot\hat{\bm\pi})^2 &= \hat{\bm\pi}\cdot\hat{\bm\pi} + i\bm\sigma\cdot(\hat{\bm\pi}\times\hat{\bm\pi})\\ &= \left[\hat{\mathbf p}-q \mathbf A\right]^2 + i\bm\sigma\cdot\left[(\hat{\mathbf p}-q \mathbf A)\times(\hat{\mathbf p}-q \mathbf A)\right]\\ &= \left[\hat{\mathbf p}-q \mathbf A\right]^2 - q \bm\sigma\cdot(\nabla\times \mathbf A)\\ &= \left[\hat{\mathbf p}-q \mathbf A\right]^2 - q \bm\sigma\cdot\mathbf B, \end{aligned} \] where we used the Pauli matrix identity $(\bm\sigma\cdot\mathbf a)(\bm\sigma\cdot\mathbf b) = \mathbf a\cdot\mathbf b + i\bm\sigma\cdot(\mathbf a\times\mathbf b)$ to expand the square. The cross product does not vanish because the components of $\hat{\bm\pi}$ do not commute: $\hat{\bm\pi}\times\hat{\bm\pi} = -q(\hat{\mathbf p}\times\mathbf A+\mathbf A\times\hat{\mathbf p}) = iq\nabla\times\mathbf A$. Thus, in the non-relativistic limit, we obtain the Pauli equation: \[ \left[ \frac{(\hat{\mathbf p}-q\mathbf A)^2}{2m} -\frac{q}{2m} \bm\sigma\cdot \mathbf B+q\varphi\right] \psi(\mathbf x) = E \psi(\mathbf x), \] where we changed the notation so that the non-relativistic energy $E'$ is written as $E$ and the 2-component spinor $\psi_A\rightarrow \psi$. The Pauli equation was originally formulated on phenomenological grounds by Wolfgang Pauli in 1927, before the discovery of the Dirac equation.
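The Pauli matrix identity behind the expansion of the square can be checked numerically. The following is a minimal sketch using NumPy; the helper names `sigma_dot` and `pauli_identity_residual` are introduced here for illustration only. It tests $(\bm\sigma\cdot\mathbf a)(\bm\sigma\cdot\mathbf b) = \mathbf a\cdot\mathbf b + i\bm\sigma\cdot(\mathbf a\times\mathbf b)$ for ordinary (commuting) vectors; for the operator $\hat{\bm\pi}$ the same identity holds as long as the operator ordering is kept, which is why the cross term survives.

```python
import numpy as np

# The three Pauli matrices, stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-component vector v."""
    return np.einsum("i,ijk->jk", v, SIGMA)

def pauli_identity_residual(a, b):
    """Largest entry of (sigma.a)(sigma.b) - [a.b 1 + i sigma.(a x b)]."""
    lhs = sigma_dot(a) @ sigma_dot(b)
    rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
    return np.max(np.abs(lhs - rhs))

# Two arbitrary real vectors (fixed seed for reproducibility).
rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)
residual = pauli_identity_residual(a, b)
print("residual of the identity:", residual)
```

The residual should vanish up to floating-point rounding for any pair of vectors.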

We should also verify that the wavefunction is correctly normalized up to a term of order $\mathcal O(v^2/c^2)$: \[ \begin{aligned} 1 &= \int{\rm d}^3\mathbf x\; \psi^\dagger(\mathbf x) \psi(\mathbf x) = \int{\rm d}^3\mathbf x \left[ \psi_A^\dagger(\mathbf x) \psi_A(\mathbf x) + \psi_B^\dagger(\mathbf x) \psi_B(\mathbf x)\right]\\ &= \int{\rm d}^3\mathbf x\; \psi_A^\dagger(\mathbf x) \psi_A(\mathbf x) + \mathcal O\left(\frac{v^2}{c^2}\right). \end{aligned} \]

The Zeeman term \[ -\frac{q}{2m}\bm\sigma\cdot\mathbf B \equiv -\bm\mu\cdot\mathbf B \] is the interaction energy of the magnetic dipole moment of a spin-½ particle with a magnetic field. The magnetic dipole moment operator can be written in the alternative forms: \[ \bm\mu = \frac{q}{2m}\bm\sigma = \frac{q}{m}\mathbf S = g \frac{q}{2m}\mathbf S, \] the last of which contains the $g$-factor, the dimensionless number that relates the magnetic moment $\bm\mu$ to the spin $\mathbf S$ in units of $q/2m$. The minimally substituted Dirac equation predicts that the electron has $g_e=2$. The deviation $g_e-2$ is known as the anomalous magnetic moment. It is related to vertex corrections due to the electron's interaction with virtual photons, and can be calculated using quantum electrodynamics. The lowest order correction was found by Julian Schwinger in 1948: \[g_e =2 + \frac{\alpha}{\pi} + \dots,\] where $\alpha$ is the fine-structure constant. Experimentally, \[ g_e = 2.00231930436182\pm0.00000000000052. \] The electron magnetic moment is one of the most precisely measured quantities in nature, and also a great success for quantum electrodynamics; the theory and the experiments agree up to 10 digits!
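To get a feeling for the numbers, the Dirac value and Schwinger's lowest order correction can be compared with the experimental value quoted above. This is a small sketch; the numerical value of $\alpha$ is the CODATA 2018 recommended value and is an input assumed here, not something given in these notes.

```python
import math

ALPHA = 7.2973525693e-3           # fine-structure constant (CODATA 2018, assumed input)
G_EXPERIMENT = 2.00231930436182   # experimental value quoted in the text

def g_schwinger(alpha=ALPHA):
    """Electron g-factor including only the lowest order QED correction."""
    return 2.0 + alpha / math.pi

g_dirac = 2.0
g_one_loop = g_schwinger()

print("Dirac:     ", g_dirac, " deviation:", abs(g_dirac - G_EXPERIMENT))
print("Schwinger: ", g_one_loop, " deviation:", abs(g_one_loop - G_EXPERIMENT))
```

The single correction term already removes most of the discrepancy; the remaining difference is accounted for by higher order terms, which are not included in this sketch.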

Relativistic corrections to the Pauli equation

Let us then find relativistic corrections of the order $\mathcal O(v^2/c^2)$ to the Pauli equation. Relativistic effects give the fine structure of the hydrogen spectrum. They are even more important in the description of the heavy elements in the periodic table, in which the electrons feel strong potentials (relativistic quantum chemistry). The color of gold is one example: without relativistic effects, gold would have a silvery color like most of the other metals.

To find all the corrections, we need to figure out the corrections both to the wavefunction-spinor and to the Hamiltonian $\hat H_{\rm Pauli}$. One correction arises from the normalization, another from the next-order term in the expansion of $K(\mathbf x)$.

We look for a non-relativistic 2-component spinor $\psi_{\rm nr}$ which satisfies the normalization condition \[ 1 = \int{\rm d}^3\mathbf x\; \psi_{\rm nr}^\dagger(\mathbf x)\psi_{\rm nr}(\mathbf x) + \mathcal O\left( \frac{v^4}{c^4}\right). \] The upper component alone does not qualify at this order since there is a contribution from the lower component: \[ \begin{aligned} 1 &= \int{\rm d}^3\mathbf x\; \psi^\dagger(\mathbf x)\psi(\mathbf x)\\ &= \int{\rm d}^3\mathbf x \left[\psi_A^\dagger(\mathbf x)\psi_A^{\vphantom{\dagger}}(\mathbf x)+\psi_B^\dagger(\mathbf x)\psi_B^{\vphantom{\dagger}}(\mathbf x)\right]\\ &= \int{\rm d}^3\mathbf x\; \psi_A^\dagger(\mathbf x)\left[1 + \frac{(\bm\sigma\cdot\bm\pi)^2}{(2mc)^2}\right]\psi_A^{\vphantom{\dagger}}(\mathbf x) + \mathcal O\left( \frac{v^4}{c^4}\right). \end{aligned} \] However, we can choose \[ \psi_{\rm nr} = \Omega \psi_A,\quad \text{with}\quad \Omega = 1 + \frac{(\bm\sigma\cdot\bm\pi)^2}{8m^2c^2}, \] as our non-relativistic spinor.
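The factor $1/8$ in $\Omega$ can be checked with a short expansion. As a sketch, assume a Hermitian ansatz $\Omega = 1+\lambda(\bm\sigma\cdot\bm\pi)^2$, where the real constant $\lambda$ is introduced only for this check. Then \[ \begin{aligned} \int{\rm d}^3\mathbf x\; \psi_{\rm nr}^\dagger\psi_{\rm nr}^{\vphantom{\dagger}} &= \int{\rm d}^3\mathbf x\; \psi_A^\dagger\,\Omega^\dagger\Omega\,\psi_A^{\vphantom{\dagger}}\\ &= \int{\rm d}^3\mathbf x\; \psi_A^\dagger\left[1 + 2\lambda(\bm\sigma\cdot\bm\pi)^2\right]\psi_A^{\vphantom{\dagger}} + \mathcal O\left( \frac{v^4}{c^4}\right). \end{aligned} \] Comparing with the normalization integral above requires $2\lambda = 1/(2mc)^2$, that is $\lambda = 1/(8m^2c^2)$, which is the coefficient appearing in $\Omega$.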

The equation of motion for $\psi_{\rm nr}$ can be found from the equation for $\psi_A$ obtained in the previous section. To use it, we invert the above equation, \[ \psi_A = \Omega^{-1} \psi_{\rm nr} = \left( 1 - \frac{(\bm\sigma\cdot\bm\pi)^2}{8m^2c^2}\right)\psi_{\rm nr} + \mathcal O\left( \frac{v^4}{c^4}\right), \] and substitute into the equation of motion (taking only the terms up to order $v^2/c^2$): \[ \frac{1}{2m}(\bm\sigma\cdot\hat{\bm\pi}) \left( 1 - \frac{E'-q\varphi}{2mc^2}\right) (\bm\sigma\cdot\hat{\bm\pi})\Omega^{-1}\psi_{\rm nr}(\mathbf x) = \left(E'-q\varphi\right)\Omega^{-1}\psi_{\rm nr}(\mathbf x). \] This is essentially the equation of motion, but it still needs some rearranging to bring it into the form of a Schrödinger equation. The details are worked out below.

Details of the derivation

Multiplying the above equation by $\Omega^{-1}$ from the left, we obtain \[ \frac{1}{2m}\left[ \Omega^{-1}(\bm\sigma\cdot\hat{\bm\pi})^2\Omega^{-1} - \Omega^{-1}(\bm\sigma\cdot\hat{\bm\pi})\frac{E'-q\varphi}{2mc^2} (\bm\sigma\cdot\hat{\bm\pi})\Omega^{-1}\right]\psi_{\rm nr}(\mathbf x) = \Omega^{-1}\left(E'-q\varphi\right)\Omega^{-1} \psi_{\rm nr}(\mathbf x), \] where there are three terms of the form $\Omega^{-1}C \Omega^{-1}$. They also contain some higher order contributions, which are to be neglected. Let us adopt the shorthand $A=\bm\sigma\cdot\hat{\bm\pi}$ and $B=E'-q\varphi(\mathbf x)$ and write these terms as \[ \begin{aligned} \Omega^{-1}(\bm\sigma\cdot\hat{\bm\pi})^2\Omega^{-1} &= A^2 - \frac{A^4}{4m^2c^2} + A^2\mathcal O\left(\frac{v^4}{c^4}\right),\\ \Omega^{-1}\left(E'{-}q\varphi\right)\Omega^{-1} &= B - \frac{1}{8m^2c^2}\left(A^2 B+B A^2\right) + B \mathcal O\left(\frac{v^4}{c^4}\right),\\ \Omega^{-1}(\bm\sigma\cdot\hat{\bm\pi})\frac{E'{-}q\varphi}{2mc^2} (\bm\sigma\cdot\hat{\bm\pi})\Omega^{-1} &= \frac{1}{2mc^2}ABA\left[ 1 + \mathcal O\left(\frac{v^2}{c^2}\right)\right]. \end{aligned} \] Ignoring all the higher order terms and moving everything except $B\psi_{\rm nr}$ to the left-hand side, we arrive at \[ \frac 1 {2m}\left( A^2 - \frac{1}{4m^2c^2} A^4 + \frac{1}{4mc^2}\left(A^2B-2ABA+ BA^2\right)\right) \psi_{\rm nr} = B\psi_{\rm nr}, \] where we notice the structure \[ A^2B-2ABA+BA^2 = [A, AB-BA] = [A,[A,B]]. \] We need to compute these commutators. Because these are operators, we make them operate on a test function $f=f(\mathbf x)$, so that the scope of the derivative is easier to see. 
\[ \begin{aligned} [A,B]f &= [\bm\pi\cdot\bm\sigma, E'-q\varphi(\mathbf x)]f = [\mathbf p\cdot\bm\sigma, -q\varphi(\mathbf x)]f\\ &= iq\bm\sigma\cdot[\nabla,\varphi(\mathbf x)]f(\mathbf x) = iq \bm\sigma\cdot\left\{\nabla[\varphi(\mathbf x)f(\mathbf x)] - \varphi(\mathbf x) \nabla f(\mathbf x)\right\}\\ &= iq \bm\sigma\cdot\left\{(\nabla\varphi)f + \varphi(\nabla f) - \varphi (\nabla f)\right\}\\ &= iq \bm\sigma\cdot(\nabla\varphi)f. \end{aligned} \]

The outer commutator is \[ \begin{aligned} [A,[A,B]] f &= iq [\bm\pi\cdot\bm\sigma,\; \bm\sigma{\cdot}(\nabla\varphi)]f = iq\sum_{i,j=1}^3 [\sigma_i\pi_i,\; \sigma_j(\partial_j\varphi)]f\\ & = iq\sum_{i,j=1}^3 \left\{\sigma_i[\pi_i, \sigma_j(\partial_j\varphi)]+ [\sigma_i, \sigma_j(\partial_j\varphi)]\pi_i\right\}f\\ & = iq\sum_{i,j=1}^3 \left\{\sigma_i\sigma_j[\pi_i, (\partial_j\varphi)]+ [\sigma_i, \sigma_j](\partial_j\varphi)\pi_i\right\}f, \end{aligned} \] where the commutators evaluate to \[ \begin{aligned} [\pi_i,(\partial_j\varphi)]f&=[-i\partial_i-q A_i,(\partial_j\varphi)]f = -i[\partial_i,(\partial_j\varphi)]f\\ &= -i\partial_i[(\partial_j\varphi)f]+i(\partial_j\varphi)(\partial_i f) = -i(\partial_i\partial_j\varphi)f, \end{aligned} \] and \[ [\sigma_i,\sigma_j]=2i\sum_{k=1}^3\epsilon_{ijk} \sigma_k. \] Finally, the outer commutator evaluates to \[ \begin{aligned} [A,[A,B]] &= iq \sum_{i,j,k=1}^3\left( -i\sigma_i\sigma_j\partial_i\partial_j\varphi + 2i \epsilon_{ijk}\sigma_k(\partial_j\varphi)\pi_i\right)\\ &= q(\nabla^2\varphi) + 2q\bm\sigma\cdot (\nabla\varphi\times\bm\pi). \end{aligned} \] The latter term can also be written as $-2q\bm\sigma\cdot(\bm\pi\times\nabla\varphi)$, since \[ \begin{aligned} (\mathbf p\times\nabla\varphi)_i f &= -i \sum_{jk}\epsilon_{ijk} \partial_j (\partial_k\varphi)f\\ &= -i\sum_{jk}\epsilon_{ijk} \left[(\partial_j\partial_k\varphi)f+(\partial_k\varphi)\partial_j f\right]\\ &= +i\sum_{jk}\epsilon_{ikj} (\partial_k\varphi)\partial_j f = -(\nabla\varphi\times \mathbf p)_if, \end{aligned} \] where the term $\epsilon_{ijk} (\partial_j\partial_k\varphi)$ vanishes since $\epsilon_{ijk}$ is antisymmetric and $\partial_j\partial_k\varphi$ is symmetric in the swap $j\leftrightarrow k$.
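One step in the computation above is the contraction $\sum_{ij}\sigma_i\sigma_j\,\partial_i\partial_j\varphi = \nabla^2\varphi$. It follows from $\sigma_i\sigma_j = \delta_{ij} + i\sum_k\epsilon_{ijk}\sigma_k$: the antisymmetric part drops out against the symmetric matrix of second derivatives. A minimal NumPy sketch of this step is given below, where a random symmetric $3\times3$ matrix stands in for the Hessian $\partial_i\partial_j\varphi$ (the stand-in and the function name are assumptions made for illustration).

```python
import numpy as np

# The three Pauli matrices, stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def contract_with_pauli(M):
    """Return sum_ij sigma_i sigma_j M_ij as a 2x2 matrix."""
    return np.einsum("iab,jbc,ij->ac", SIGMA, SIGMA, M)

# Random symmetric matrix playing the role of the Hessian of phi.
rng = np.random.default_rng(1)
R = rng.normal(size=(3, 3))
hessian = R + R.T

result = contract_with_pauli(hessian)
expected = np.trace(hessian) * np.eye(2)  # trace of the Hessian = Laplacian
print("max deviation:", np.max(np.abs(result - expected)))
```

For a matrix that is not symmetric the contraction would pick up an extra spin-dependent piece, so the symmetry of the second derivatives is essential here.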

After all the algebra, we end up with \[ \left[ \frac{(\bm\sigma\cdot\bm\pi)^2}{2m} - \frac{(\bm\sigma\cdot\bm\pi)^4}{8m^3 c^2} - \frac{q}{4m^2c^2}\bm\sigma\cdot(\bm\pi\times\nabla\varphi)+\frac{q}{8m^2c^2}\nabla^2\varphi \right] \psi_{\rm nr}(\mathbf x) = \left(E'-q\varphi\right) \psi_{\rm nr}(\mathbf x). \] Comparing with the Pauli equation above, we have \[ \left(\hat H_{\rm Pauli} + \hat H_{\rm Rel}\right)\psi_{\rm nr}(\mathbf x)= E'\psi_{\rm nr}(\mathbf x), \] where the relativistic corrections to the Hamiltonian are \[ \begin{aligned} \hat H_{\rm Rel} &= - \frac{(\bm\sigma\cdot\bm\pi)^4}{8m^3 c^2} - \frac{q}{4m^2c^2}\bm\sigma\cdot(\bm\pi\times\nabla\varphi)+\frac{q}{8m^2c^2}\nabla^2\varphi\\ &\equiv \hat H_{\rm kinetic} + \hat H_{\rm SO} + \hat H_{\rm Darwin}. \end{aligned} \] Let us now try to understand the physics of these various terms in a situation with only an electric potential. We set $\mathbf A=0$, so that $\mathbf B=0$, $\bm\pi=\mathbf p$ and $\mathbf E=-\nabla\varphi$.

Relativistic correction to the kinetic energy

The first term, \[ \hat H_{\rm kinetic} \equiv -\frac{(\bm\sigma\cdot\bm\pi)^4}{8m^3c^2} \stackrel{\mathbf A=0}{=} - \frac{|\hat{\mathbf p}|^4}{8m^3c^2}, \] is a correction to the kinetic energy, consistent with the series expansion of the relativistic dispersion relation \[ E=c\sqrt{|\mathbf p|^2+m^2c^2} = mc^2 + \frac{|\mathbf p|^2}{2m}-\frac{|\mathbf p|^4}{8m^3c^2} + \dots \]
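The quality of this expansion is easy to check numerically. The sketch below works in units with $m=c=1$ (a choice made here for convenience) and compares the exact relativistic kinetic energy $E-mc^2$ with the two-term expansion; the function names are illustrative.

```python
import math

def kinetic_exact(p, m=1.0, c=1.0):
    """Exact relativistic kinetic energy E - m c^2."""
    return c * math.sqrt(p**2 + (m * c)**2) - m * c**2

def kinetic_expanded(p, m=1.0, c=1.0):
    """Non-relativistic kinetic energy plus the first relativistic correction."""
    return p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)

for p in (0.01, 0.1, 0.3):
    exact = kinetic_exact(p)
    error_nr = abs(exact - p**2 / 2)              # error of p^2/2m alone
    error_corr = abs(exact - kinetic_expanded(p)) # error with the p^4 term
    print(f"p/mc = {p}: error without correction {error_nr:.3e}, "
          f"with correction {error_corr:.3e}")
```

Including the $|\mathbf p|^4$ term reduces the error from order $(p/mc)^4$ to order $(p/mc)^6$, as expected from the next term of the series.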

Spin-orbit interaction

The second term of the Hamiltonian is arguably the most interesting: \[ \begin{aligned} \hat H_{\rm SO} &= -\frac{q}{4m^2c^2} \bm\sigma\cdot(\bm\pi\times\nabla\varphi)\\ &\mkern-7mu\stackrel{\mathbf A=0}{=}\mkern-7mu +\frac{q}{4m^2c^2} \bm\sigma\cdot(\nabla\varphi\times\mathbf p). \end{aligned} \] The Hamiltonian $\hat H_{\rm SO}$ is known as the spin-orbit interaction. To understand the origin of this term, we go to the special case of a central field $\varphi(\mathbf x) = \varphi(r)$. The gradient is \[ \nabla\varphi=\frac{d\varphi(r)}{d r} \frac{\mathbf x}{r}, \] and the cross product above becomes \[ \nabla\varphi\times\hat{\mathbf p} = \frac 1 r \varphi'(r)\left(\mathbf x\times\hat{\mathbf p}\right) = \frac 1 r \varphi'(r) \hat{\mathbf L}, \] where $\hat{\mathbf L}$ is the orbital angular momentum operator. In terms of the spin angular momentum $\mathbf S = {\bm\sigma}/2$, this part of the Hamiltonian can be written as \[ \hat H_{\rm SO,\text{central field}} = + \frac{q}{2m^2c^2} \frac 1 r \frac{d\varphi}{dr} \hat{\mathbf L}\cdot\mathbf{S}. \] This coupling hybridizes spin and orbital angular momentum states, hence the name. The coupling between ${\mathbf L}$ and $\mathbf{S}$ has the important effect of breaking degeneracies in many systems, e.g. hydrogen. Because $[\hat H_{\rm SO},\hat{\mathbf L}]\neq0$ and $[\hat H_{\rm SO},\hat{\mathbf S}]\neq0$, neither $m_l$ nor $m_s$ is a good quantum number any more. Spin-orbit interaction also breaks degeneracies in crystals and enables many interesting condensed matter phenomena. As with other relativistic effects, spin-orbit interaction is stronger in the heavier elements of the periodic table.
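Although $m_l$ and $m_s$ are no longer good quantum numbers, $\hat{\mathbf L}\cdot\mathbf S$ is diagonal in the basis of total angular momentum. Squaring $\hat{\mathbf J}=\hat{\mathbf L}+\mathbf S$ gives the standard relation $\hat{\mathbf L}\cdot\mathbf S = \tfrac12(\hat{\mathbf J}^2-\hat{\mathbf L}^2-\mathbf S^2)$, which is not derived in these notes but follows directly from the definition. The sketch below evaluates its eigenvalues for $s=1/2$ with exact fractions; the function name `l_dot_s` is an illustrative choice.

```python
from fractions import Fraction

def l_dot_s(l, j, s=Fraction(1, 2)):
    """Eigenvalue of L.S (hbar = 1) in a state with quantum numbers l, s, j."""
    l, j, s = Fraction(l), Fraction(j), Fraction(s)
    return (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2

# For each l > 0 the two allowed values j = l +/- 1/2 give different energies.
for l in (1, 2):
    for j in (l + Fraction(1, 2), l - Fraction(1, 2)):
        print(f"l = {l}, j = {j}: <L.S> = {l_dot_s(l, j)}")
```

The two values, $l/2$ for $j=l+1/2$ and $-(l+1)/2$ for $j=l-1/2$, show how the spin-orbit term splits a level with given $l>0$ into a doublet, while $l=0$ states are not shifted by it.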

Darwin term

The third term is known as the Darwin term: \[ \hat H_{\rm Darwin} \equiv + \frac{q}{8m^2c^2}\nabla^2\varphi \stackrel{\mathbf A=0}{=} - \frac{q}{8m^2c^2}\nabla\cdot\mathbf E \]

For a point charge $Ze>0$ located at $\mathbf r=0$, we can determine the potential $\varphi$ from the Poisson equation \[ \begin{aligned} \nabla^2 \varphi &= +\nabla\cdot(\nabla\varphi) = -\nabla\cdot\mathbf E = -\rho/\varepsilon_0 = -\frac{Ze}{\varepsilon_0} \delta^{(3)}(\mathbf r), \end{aligned} \] which has the solution \[ \varphi(r) = \frac{Ze}{4\pi \varepsilon_0 r}. \] For an electron we set $q=-e$, so that the particle feels the attractive potential \[ V(r)= -e\varphi(r) = -\frac{Ze^2}{4\pi \varepsilon_0 r}. \]

The Darwin term in this Coulomb potential is \[ \hat H_{\rm Darwin} = \frac{1}{8m^2c^2}\nabla^2 (-e\varphi) = \frac{1}{8m^2c^2}\frac{Ze^2}{\varepsilon_0} \delta^{(3)}(\mathbf r). \] The Darwin term only affects the energy of the s-orbitals, since the wave functions of all other orbitals vanish at the origin.
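As a rough numerical illustration, the first-order energy shift of the 1s level is $\langle \hat H_{\rm Darwin}\rangle$, i.e. the coefficient of the delta function times $|\psi_{1s}(0)|^2$. The sketch below rests on several assumptions that are not part of these notes: SI units with $\hbar$ restored (the Darwin term then carries a factor $\hbar^2$), CODATA 2018 values for the constants, and the standard hydrogen-like result $|\psi_{1s}(0)|^2 = Z^3/(\pi a_0^3)$ with $a_0$ the Bohr radius.

```python
import math

# CODATA 2018 values in SI units (assumed inputs).
HBAR = 1.054571817e-34      # J s
M_E = 9.1093837015e-31      # kg
C = 299792458.0             # m / s
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12     # F / m

def darwin_shift_1s(Z=1):
    """First-order Darwin shift of the hydrogen-like 1s level, in eV."""
    a0 = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)  # Bohr radius
    psi0_sq = Z**3 / (math.pi * a0**3)                       # |psi_1s(0)|^2
    prefactor = HBAR**2 * Z * E_CHARGE**2 / (8 * M_E**2 * C**2 * EPS0)
    return prefactor * psi0_sq / E_CHARGE

def darwin_shift_closed_form(Z=1):
    """Same shift written as (1/2) m c^2 (Z alpha)^4, in eV."""
    alpha = E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C)
    return 0.5 * M_E * C**2 * (Z * alpha)**4 / E_CHARGE

print("Darwin shift of hydrogen 1s (eV):", darwin_shift_1s())
print("closed form (eV):                ", darwin_shift_closed_form())
```

The shift is positive and of order $mc^2(Z\alpha)^4$, the same order as the other fine-structure corrections, and it grows as $Z^4$ for heavier hydrogen-like ions.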