6 The Cauchy-Schwarz Inequality
A very important corollary of the axioms of \(\mathcal{I}\) is the Cauchy-Schwarz inequality (CS):
\[\left|\langle a|b\rangle\right|\le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert.\]
Assuming the norm induced by the inner product, this takes the form:
\[\left|\langle a|b\rangle\right|\le\sqrt{\langle a|a\rangle\langle b|b\rangle}.\]
Proof:
There are various proofs of this statement. I will provide one example here.
Define a function
\[p\left(t\right)\triangleq\left\Vert t\alpha|a\rangle+|b\rangle\right\Vert ^{2}.\]
Here,
\[p:\mathbb{R}\rightarrow\mathbb{R}\]
meaning the function takes \(t\in\mathbb{R}\) as an argument and returns \(p\left(t\right)\in\mathbb{R}\).
Define \(\alpha\in\mathbb{C}\) as follows:
\[\begin{aligned} \alpha^{*}\langle a|b\rangle\triangleq\left|\langle a|b\rangle\right|, & \quad\langle a|b\rangle\ne0\\ \alpha\triangleq1, & \quad\langle a|b\rangle=0. \end{aligned}\]
Furthermore, we’ll take the norm induced by the inner product. Therefore, expanding \(p\left(t\right)\), and using \(\langle b|a\rangle=\langle a|b\rangle^{*}\) in the third line, we have
\[\begin{aligned} p\left(t\right) & =\left(t\alpha^{*}\langle a|+\langle b|\right)\left(t\alpha|a\rangle+|b\rangle\right)\\ & =t^{2}\left|\alpha\right|^{2}\langle a|a\rangle+\langle b|b\rangle+t\left(\alpha^{*}\langle a|b\rangle+\alpha\langle b|a\rangle\right)\\ & =t^{2}\left|\alpha\right|^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+t\left(\alpha^{*}\langle a|b\rangle+\left(\alpha^{*}\langle a|b\rangle\right)^{*}\right)\\ & =t^{2}\left|\alpha\right|^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+2t\mathfrak{Re}\left(\alpha^{*}\langle a|b\rangle\right). \end{aligned}\]
Now use the definition of \(\alpha\) to rewrite the first and last terms, noting that the definition implies \(\left|\alpha\right|=1\):
\[p\left(t\right)=t^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+2t\left|\langle a|b\rangle\right|.\]
First let’s check the case \(|a\rangle=|0\rangle\). In this case, CS states
\[\left|\langle0|b\rangle\right|\le\left\Vert |0\rangle\right\Vert \left\Vert |b\rangle\right\Vert .\]
But
\[\begin{aligned} \text{[\textbf{I1}]} & \implies\langle0|b\rangle=0\\ \text{[\textbf{I3}]} & \implies\left\Vert |0\rangle\right\Vert =0 \end{aligned}\]
and therefore CS is trivially satisfied, with exact equality.
Next let’s check \(|a\rangle\ne|0\rangle\). In this case, \(p\in\mathbb{P}_{2}\) (meaning \(p\) is a 2nd degree polynomial in \(t\)). Since \(p\) is defined to be the square of a norm, [I3] implies that \(p\left(t\right)\ge0\) for all \(t\in\mathbb{R}\), with \(p\left(t\right)=0\) only if \(t\alpha|a\rangle+|b\rangle=|0\rangle\). Hence \(p\) cannot change sign, and so it has at most one (repeated) real root. By the theory of quadratic equations, this means that the discriminant
\[\Delta\triangleq4\left(\left|\langle a|b\rangle\right|^{2}-\left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2}\right)\le0.\]
Hence,
\[\begin{aligned} \left|\langle a|b\rangle\right|^{2} & \le\left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2}\\ \left|\langle a|b\rangle\right| & \le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert \end{aligned}\]
in the case that \(|a\rangle\ne|0\rangle\), with equality possible only when \(|b\rangle\) is a multiple of \(|a\rangle\). Together with the special case \(|a\rangle=|0\rangle\), this gives the final result:
\[\left|\langle a|b\rangle\right|\le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert\]
which is CS.
QED
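As a concrete check, here is a small worked example. It assumes the standard inner product on \(\mathbb{C}^{2}\), \(\langle a|b\rangle=\sum_{i}a_{i}^{*}b_{i}\), which is not fixed anywhere above; the two vectors are chosen purely for illustration.
\[\begin{aligned} |a\rangle & =\begin{pmatrix}1\\ i\end{pmatrix},\qquad|b\rangle=\begin{pmatrix}1\\ 1\end{pmatrix}\\ \langle a|b\rangle & =1^{*}\cdot1+i^{*}\cdot1=1-i\\ \left|\langle a|b\rangle\right| & =\sqrt{2}\\ \left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert  & =\sqrt{2}\cdot\sqrt{2}=2 \end{aligned}\]
so \(\sqrt{2}\le2\), as CS requires. For these vectors the polynomial used in the proof is \(p\left(t\right)=2t^{2}+2\sqrt{2}\,t+2\), whose discriminant is \(4\left(2-4\right)=-8\), so this particular \(p\) has no real roots.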
6.1 Application of Cauchy Schwarz: The Heisenberg Uncertainty Principle
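We now use CS to prove the Heisenberg uncertainty principle, which bounds the product of the uncertainties of two Hermitian operators \(\hat{A}\) and \(\hat{B}\) from below. The proof proceeds as follows.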
A. Define a few key objects used generally in quantum mechanics.
B. Define a few objects required for this specific proof.
C. Establish results \(\left(i\right)-\left(vi\right)\) required for the proof.
D. Carry out the proof.
A. Define a few key objects used in quantum mechanics
Consider a Hermitian operator \(\hat{A}\). Define the expectation value of \(\hat{A}\) for the state \(|\psi\rangle\) to be
\[\langle\hat{A}\rangle\triangleq\langle\psi|\hat{A}|\psi\rangle\]
and the ‘uncertainty’ in \(\hat{A}\) to be its standard deviation:
\[\Delta A\triangleq\sqrt{\langle\hat{A}^{2}\rangle-\langle\hat{A}\rangle^{2}}.\]
Finally, define the
\[\begin{aligned} \text{commutator:}\quad\left[\hat{A},\hat{B}\right] & \triangleq\hat{A}\hat{B}-\hat{B}\hat{A}\\ \text{anticommutator:}\quad\left\{ \hat{A},\hat{B}\right\} & \triangleq\hat{A}\hat{B}+\hat{B}\hat{A}. \end{aligned}\]
B. Define a few objects required for this specific proof
Let
\[\delta\hat{A}\triangleq\hat{A}-\langle\hat{A}\rangle\]
such that
\[\left(\Delta A\right)^{2}=\langle\left(\delta\hat{A}\right)^{2}\rangle. \tag{6.1}\]
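Equation 6.1 follows by expanding the square. The short derivation below assumes that the state is normalised, \(\langle\psi|\psi\rangle=1\), so that the expectation value of a number is that number; this assumption is not stated explicitly above.
\[\begin{aligned} \langle\left(\delta\hat{A}\right)^{2}\rangle & =\langle\hat{A}^{2}-2\langle\hat{A}\rangle\hat{A}+\langle\hat{A}\rangle^{2}\rangle\\ & =\langle\hat{A}^{2}\rangle-2\langle\hat{A}\rangle\langle\hat{A}\rangle+\langle\hat{A}\rangle^{2}\\ & =\langle\hat{A}^{2}\rangle-\langle\hat{A}\rangle^{2}\\ & =\left(\Delta A\right)^{2}. \end{aligned}\]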
We will also define
\[\begin{aligned} |a\rangle & \triangleq\delta\hat{A}|\psi\rangle\\ |b\rangle & \triangleq\delta\hat{B}|\psi\rangle. \end{aligned}\]
C. Establish results \(\left(i\right)-\left(vi\right)\) required for the proof.
\[\left(i\right)\qquad\delta\hat{A}\delta\hat{B}=\frac{1}{2}\left\{ \delta\hat{A},\delta\hat{B}\right\} +\frac{1}{2}\left[\delta\hat{A},\delta\hat{B}\right]\]
as can be seen by expanding the right hand side;
\[\left(ii\right)\qquad\left[\delta\hat{A},\delta\hat{B}\right]=\left[\hat{A},\hat{B}\right]\]
as can be seen by expanding the left hand side and noting that the numbers \(\langle\hat{A}\rangle\) and \(\langle\hat{B}\rangle\) commute with every operator.
\[\left(iii\right)\qquad\left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}=\left\{ \delta\hat{A},\delta\hat{B}\right\}\]
i.e. the anticommutator is Hermitian, and
\[\left(iv\right)\qquad\left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger}=-\left[\delta\hat{A},\delta\hat{B}\right]\]
i.e. the commutator is antihermitian.
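For completeness, results \(\left(i\right)-\left(iv\right)\) can be checked explicitly. The sketch below uses only the definitions above, together with two standard facts that are assumed rather than proved here: \(\langle\hat{A}\rangle\) and \(\langle\hat{B}\rangle\) are real (since \(\hat{A}\) and \(\hat{B}\) are Hermitian), so that \(\delta\hat{A}\) and \(\delta\hat{B}\) are themselves Hermitian, and \(\left(\hat{X}\hat{Y}\right)^{\dagger}=\hat{Y}^{\dagger}\hat{X}^{\dagger}\).
\[\begin{aligned} \frac{1}{2}\left\{ \delta\hat{A},\delta\hat{B}\right\} +\frac{1}{2}\left[\delta\hat{A},\delta\hat{B}\right] & =\frac{1}{2}\left(\delta\hat{A}\delta\hat{B}+\delta\hat{B}\delta\hat{A}\right)+\frac{1}{2}\left(\delta\hat{A}\delta\hat{B}-\delta\hat{B}\delta\hat{A}\right)=\delta\hat{A}\delta\hat{B}\\ \left[\delta\hat{A},\delta\hat{B}\right] & =\left(\hat{A}-\langle\hat{A}\rangle\right)\left(\hat{B}-\langle\hat{B}\rangle\right)-\left(\hat{B}-\langle\hat{B}\rangle\right)\left(\hat{A}-\langle\hat{A}\rangle\right)=\hat{A}\hat{B}-\hat{B}\hat{A}\\ \left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger} & =\left(\delta\hat{A}\delta\hat{B}\right)^{\dagger}+\left(\delta\hat{B}\delta\hat{A}\right)^{\dagger}=\delta\hat{B}\delta\hat{A}+\delta\hat{A}\delta\hat{B}=\left\{ \delta\hat{A},\delta\hat{B}\right\} \\ \left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger} & =\left(\delta\hat{A}\delta\hat{B}\right)^{\dagger}-\left(\delta\hat{B}\delta\hat{A}\right)^{\dagger}=\delta\hat{B}\delta\hat{A}-\delta\hat{A}\delta\hat{B}=-\left[\delta\hat{A},\delta\hat{B}\right]. \end{aligned}\]
In the second line, every term containing \(\langle\hat{A}\rangle\) or \(\langle\hat{B}\rangle\) cancels between the two products.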
Now we establish two important corollaries.
From \(\left(iii\right)\) it follows that the expectation value of the anticommutator is always purely real:
\[\begin{aligned} \left\{ \delta\hat{A},\delta\hat{B}\right\} & =\left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}\\ \left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle & =\left\langle \left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}\right\rangle \\ \left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle & =\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle ^{*} \end{aligned}\]
which gives us
\[\left(v\right)\qquad\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \in\mathbb{R}.\]
Similarly, from \(\left(iv\right)\), the expectation value of the commutator is always purely imaginary:
\[\left(vi\right)\qquad\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle \in i\mathbb{R}.\]
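The step from \(\left(iv\right)\) to \(\left(vi\right)\) mirrors the derivation of \(\left(v\right)\). Written out, using \(\langle\hat{X}^{\dagger}\rangle=\langle\hat{X}\rangle^{*}\) as before:
\[\begin{aligned} \left[\delta\hat{A},\delta\hat{B}\right] & =-\left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger}\\ \left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle  & =-\left\langle \left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger}\right\rangle \\ \left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle  & =-\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle ^{*}. \end{aligned}\]
A complex number \(z\) satisfying \(z=-z^{*}\) has \(\mathfrak{Re}\left(z\right)=0\), which is the statement \(\left(vi\right)\).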
D. Carry out the proof
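From the definitions in B, and since \(\delta\hat{A}\) is Hermitian (\(\hat{A}\) is Hermitian and \(\langle\hat{A}\rangle\) is real), we have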
\[\begin{aligned} \langle a|a\rangle & =\langle\psi|\delta\hat{A}\delta\hat{A}|\psi\rangle\\ & =\langle\psi|\left(\delta\hat{A}\right)^{2}|\psi\rangle\\ & =\left(\Delta A\right)^{2} \end{aligned}\]
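where the last line follows from Equation 6.1. In the same way, \(\langle b|b\rangle=\left(\Delta B\right)^{2}\) and \(\langle a|b\rangle=\langle\psi|\delta\hat{A}\delta\hat{B}|\psi\rangle\).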
Now invoke CS:
\[\begin{aligned} \left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2} & \ge\left|\langle a|b\rangle\right|^{2}\\ & \downarrow\\ \langle a|a\rangle\langle b|b\rangle & \ge\left|\langle a|b\rangle\right|^{2}\\ & \downarrow\\ \left(\Delta A\right)^{2}\left(\Delta B\right)^{2} & \ge\left|\langle\psi|\delta\hat{A}\delta\hat{B}|\psi\rangle\right|^{2}. \end{aligned}\]
Using \(\left(i\right)\) and \(\left(ii\right)\):
\[\left(i\right)\implies\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle +\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle \right|^{2}\]
and
\[\left(ii\right)\implies\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle +\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]
But \(\left(v\right)\) and \(\left(vi\right)\) tell us that the two terms within the modulus are purely real, and purely imaginary, respectively. Recalling that
\[\left|x+iy\right|^{2}=x^{2}+y^{2},\qquad x,y\in\mathbb{R}\]
leads us to conclude that
\[\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \right|^{2}+\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]
But it is also the case that
\[x^{2}+y^{2}\ge y^{2}\]
and so
\[\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \right|^{2}+\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}\ge\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]
Hence,
\[\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]
Taking the square root of both sides gives the final result, the Heisenberg uncertainty principle:
\[\Delta A\Delta B\ge\frac{1}{2}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|.\]
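As an illustration, consider position and momentum. Assuming the canonical commutation relation \(\left[\hat{x},\hat{p}\right]=i\hbar\), which is not derived in these notes, and a normalised state \(|\psi\rangle\), the result above gives
\[\Delta x\Delta p\ge\frac{1}{2}\left|\left\langle i\hbar\right\rangle \right|=\frac{\hbar}{2}.\]
A second familiar case, assuming the spin commutation relation \(\left[\hat{S}_{x},\hat{S}_{y}\right]=i\hbar\hat{S}_{z}\), is
\[\Delta S_{x}\Delta S_{y}\ge\frac{\hbar}{2}\left|\langle\hat{S}_{z}\rangle\right|\]
where the lower bound now depends on the state through \(\langle\hat{S}_{z}\rangle\).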
6.2 The Triangle Inequality
Using the axioms of inner product spaces we proved CS. In turn, assuming CS we can prove [N4], the triangle inequality, for the norm induced by the inner product.
Proof:
Consider
\[\begin{aligned} \left\Vert |x\rangle+|y\rangle\right\Vert ^{2} & =\left(\langle x|+\langle y|\right)\left(|x\rangle+|y\rangle\right)\\ & =\langle x|x\rangle+\langle y|y\rangle+\langle x|y\rangle+\langle y|x\rangle\\ & =\langle x|x\rangle+\langle y|y\rangle+2\mathfrak{Re}\left(\langle x|y\rangle\right). \end{aligned}\]
Note that \(\forall z\in\mathbb{C}\), \(\left|z\right|\ge\mathfrak{Re}\left(z\right)\): writing \(z=u+iv\) with \(u,v\in\mathbb{R}\), we have \(\left|z\right|=\sqrt{u^{2}+v^{2}}\ge\left|u\right|\ge u\). Therefore
\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\langle x|x\rangle+\langle y|y\rangle+2\left|\langle x|y\rangle\right|.\]
Now use CS, which says that
\[\left|\langle x|y\rangle\right|\le\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert\]
to give
\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\langle x|x\rangle+\langle y|y\rangle+2\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert .\]
Therefore
\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\left\Vert |x\rangle\right\Vert ^{2}+\left\Vert |y\rangle\right\Vert ^{2}+2\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert\]
and so
\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\left(\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert \right)^{2}\]
or, taking the square root of both sides (both are non-negative),
\[\left\Vert |x\rangle+|y\rangle\right\Vert \le\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert\]
which is the triangle inequality, [N4].
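A quick numerical illustration, again assuming the standard inner product on \(\mathbb{C}^{2}\), \(\langle x|y\rangle=\sum_{i}x_{i}^{*}y_{i}\), with vectors chosen only as an example:
\[\begin{aligned} |x\rangle & =\begin{pmatrix}1\\ i\end{pmatrix},\qquad|y\rangle=\begin{pmatrix}1\\ 1\end{pmatrix},\qquad|x\rangle+|y\rangle=\begin{pmatrix}2\\ 1+i\end{pmatrix}\\ \left\Vert |x\rangle+|y\rangle\right\Vert ^{2} & =\left|2\right|^{2}+\left|1+i\right|^{2}=4+2=6\\ \langle x|x\rangle+\langle y|y\rangle+2\mathfrak{Re}\left(\langle x|y\rangle\right) & =2+2+2\mathfrak{Re}\left(1-i\right)=6\\ \left(\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert \right)^{2} & =\left(\sqrt{2}+\sqrt{2}\right)^{2}=8. \end{aligned}\]
The second and third lines agree, as the expansion in the proof requires, and \(6\le8\) gives \(\left\Vert |x\rangle+|y\rangle\right\Vert =\sqrt{6}\le2\sqrt{2}=\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert\).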