6  The Cauchy Schwarz Inequality

A very important corollary of the axioms of \(\mathcal{I}\) is the Cauchy-Schwarz inequality (CS):

\[\left|\langle a|b\rangle\right|\le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert.\]

Assuming the norm induced by the inner product, this takes the form:

\[\left|\langle a|b\rangle\right|\le\sqrt{\langle a|a\rangle\langle b|b\rangle}.\]
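Before the proof, the statement can be sanity-checked numerically. The following is a minimal sketch, assuming kets in \(\mathbb{C}^{4}\) with the standard inner product (conjugate-linear in the first argument, matching bra-ket notation); the helper names `inner`, `norm` and `random_ket` are illustrative and not part of the text.

```python
import math
import random

def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm(a):
    """Norm induced by the inner product: sqrt(<a|a>)."""
    return math.sqrt(inner(a, a).real)

def random_ket(dim, rng):
    """A ket in C^dim with Gaussian real and imaginary parts."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

rng = random.Random(0)  # fixed seed so the run is reproducible
checks = []
for _ in range(1000):
    a, b = random_ket(4, rng), random_ket(4, rng)
    lhs = abs(inner(a, b))
    rhs = norm(a) * norm(b)
    checks.append(lhs <= rhs + 1e-12)  # small tolerance for rounding

print(all(checks))
```

A passing check on random samples is evidence, not a proof; the proof follows.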

Proof:

There are various proofs of this statement. I will provide one example here.

Define a function

\[p\left(t\right)\triangleq\left\Vert t\alpha|a\rangle+|b\rangle\right\Vert ^{2}.\]

Here,

\[p:\mathbb{R}\rightarrow\mathbb{R}\]

meaning the function takes \(t\in\mathbb{R}\) as an argument and returns \(p\left(t\right)\in\mathbb{R}\).

Define \(\alpha\in\mathbb{C}\) as follows:

\[\begin{aligned} \alpha\langle b|a\rangle\triangleq\left|\langle a|b\rangle\right|, & \quad\langle a|b\rangle\ne0\\ \alpha\triangleq1, & \quad\langle a|b\rangle=0. \end{aligned}\]

Equivalently, \(\alpha=\langle a|b\rangle/\left|\langle a|b\rangle\right|\) when \(\langle a|b\rangle\ne0\).

Furthermore, we’ll take the norm induced by the inner product. Therefore, expanding \(p\left(t\right)\), and using \(\langle a|b\rangle=\langle b|a\rangle^{*}\) in the third line, we have

\[\begin{aligned} p\left(t\right) & =\left(t\alpha^{*}\langle a|+\langle b|\right)\left(t\alpha|a\rangle+|b\rangle\right)\\ & =t^{2}\left|\alpha\right|^{2}\langle a|a\rangle+\langle b|b\rangle+t\left(\alpha^{*}\langle a|b\rangle+\alpha\langle b|a\rangle\right)\\ & =t^{2}\left|\alpha\right|^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+t\left(\left(\alpha\langle b|a\rangle\right)^{*}+\alpha\langle b|a\rangle\right)\\ & =t^{2}\left|\alpha\right|^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+2t\mathfrak{Re}\left(\alpha\langle b|a\rangle\right). \end{aligned}\]

Now use the definition of \(\alpha\) to rewrite the first and last terms, noting that the definition implies \(\left|\alpha\right|=1\):

\[p\left(t\right)=t^{2}\left\Vert |a\rangle\right\Vert ^{2}+\left\Vert |b\rangle\right\Vert ^{2}+2t\left|\langle a|b\rangle\right|.\]
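The quadratic form of \(p\left(t\right)\) can be compared against the definition numerically. This is a sketch under stated assumptions: the two kets in \(\mathbb{C}^{3}\) are arbitrary illustrative values, and \(\alpha\) is taken to be the unit phase for which the cross term reduces to \(2t\left|\langle a|b\rangle\right|\). The names `p_direct` and `p_quadratic` are hypothetical helpers.

```python
def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm_sq(a):
    """Squared norm <a|a>."""
    return inner(a, a).real

# Arbitrary illustrative kets (an assumption, not taken from the text).
a = [1 + 2j, 0.5j, -1.0]
b = [2 - 1j, 3.0, 1j]

ab = inner(a, b)
# Assumption: alpha is the unit phase that makes the cross term equal
# 2 t |<a|b>|; alpha = 1 when <a|b> = 0.
alpha = ab / abs(ab) if ab != 0 else 1.0

def p_direct(t):
    """p(t) = || t*alpha|a> + |b> ||^2, computed from the definition."""
    v = [t * alpha * x + y for x, y in zip(a, b)]
    return norm_sq(v)

def p_quadratic(t):
    """p(t) written as a quadratic in t."""
    return t ** 2 * norm_sq(a) + norm_sq(b) + 2 * t * abs(ab)

samples = [-3, -1, -0.25, 0, 0.5, 2]
max_gap = max(abs(p_direct(t) - p_quadratic(t)) for t in samples)
discriminant = 4 * (abs(ab) ** 2 - norm_sq(a) * norm_sq(b))
print(max_gap, discriminant)
```

For these kets the two expressions agree to rounding error and the discriminant of the quadratic is negative, which is the quantity the rest of the proof examines.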

First let’s check the case \(|a\rangle=|0\rangle\). In this case, CS states

\[\left|\langle0|b\rangle\right|\le\left\Vert |0\rangle\right\Vert \left\Vert |b\rangle\right\Vert .\]

\[\begin{aligned} \text{[\textbf{I1}]} & \implies\langle0|b\rangle=0\\ \text{[\textbf{I3}]} & \implies\left\Vert |0\rangle\right\Vert =0 \end{aligned}\]

therefore CS is trivially satisfied, with exact equality.

Next let’s check \(|a\rangle\ne|0\rangle\). In this case \(\left\Vert |a\rangle\right\Vert ^{2}\ne0\), so \(p\in\mathbb{P}_{2}\) (meaning \(p\) is a 2nd degree polynomial). Since \(p\) is defined to be the square of a norm, [I3] implies that \(p\left(t\right)\ge0\) for all \(t\in\mathbb{R}\). Hence \(p\) cannot change sign, so it has at most one real root (a repeated one). By the theory of quadratic equations, this means that the discriminant

\[\Delta\triangleq4\left(\left|\langle a|b\rangle\right|^{2}-\left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2}\right)\le0.\]

Hence,

\[\begin{aligned} \left|\langle a|b\rangle\right|^{2} & \le\left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2}\\ \left|\langle a|b\rangle\right| & \le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert \end{aligned}\]

in the case that \(|a\rangle\ne|0\rangle\). Combining with the special case that \(|a\rangle=|0\rangle\) gives the final result:

\[\left|\langle a|b\rangle\right|\le\left\Vert |a\rangle\right\Vert \left\Vert |b\rangle\right\Vert\]

which is CS.

QED

6.1 Application of Cauchy Schwarz: The Heisenberg Uncertainty Principle

The proof proceeds as follows.

A. Define a few key objects used generally in quantum mechanics.

B. Define a few objects required for this specific proof.

C. Establish results \(\left(i\right)-\left(vi\right)\) required for the proof.

D. Carry out the proof.

A. Define a few key objects used in quantum mechanics

Consider a Hermitian operator \(\hat{A}\). Define the expectation value of \(\hat{A}\) for the state \(|\psi\rangle\) to be

\[\langle\hat{A}\rangle\triangleq\langle\psi|\hat{A}|\psi\rangle\]

and the ‘uncertainty’ in \(\hat{A}\) to be its standard deviation:

\[\Delta A\triangleq\sqrt{\langle\hat{A}^{2}\rangle-\langle\hat{A}\rangle^{2}}.\]

Finally, define the

\[\begin{aligned} \text{commutator:}\quad\left[\hat{A},\hat{B}\right] & \triangleq\hat{A}\hat{B}-\hat{B}\hat{A}\\ \text{anticommutator:}\quad\left\{ \hat{A},\hat{B}\right\} & \triangleq\hat{A}\hat{B}+\hat{B}\hat{A}. \end{aligned}\]

B. Define a few objects required for this specific proof

Let

\[\delta\hat{A}\triangleq\hat{A}-\langle\hat{A}\rangle\]

such that

\[\left(\Delta A\right)^{2}=\langle\left(\delta\hat{A}\right)^{2}\rangle. \tag{6.1}\]
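Equation 6.1 can be checked by expanding the square. As a sketch, this treats \(\langle\hat{A}\rangle\) as a real number (multiplying the identity operator, which the notation leaves implicit) and assumes that \(|\psi\rangle\) is normalised, \(\langle\psi|\psi\rangle=1\):

\[\begin{aligned} \langle\left(\delta\hat{A}\right)^{2}\rangle & =\langle\hat{A}^{2}-2\langle\hat{A}\rangle\hat{A}+\langle\hat{A}\rangle^{2}\rangle\\ & =\langle\hat{A}^{2}\rangle-2\langle\hat{A}\rangle^{2}+\langle\hat{A}\rangle^{2}\\ & =\langle\hat{A}^{2}\rangle-\langle\hat{A}\rangle^{2}\\ & =\left(\Delta A\right)^{2}. \end{aligned}\]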

We will also define

\[\begin{aligned} |a\rangle & \triangleq\delta\hat{A}|\psi\rangle\\ |b\rangle & \triangleq\delta\hat{B}|\psi\rangle. \end{aligned}\]

C. Establish results \(\left(i\right)-\left(vi\right)\) required for the proof.

\[\left(i\right)\qquad\delta\hat{A}\delta\hat{B}=\frac{1}{2}\left\{ \delta\hat{A},\delta\hat{B}\right\} +\frac{1}{2}\left[\delta\hat{A},\delta\hat{B}\right]\]

as can be seen by expanding the right hand side;

\[\left(ii\right)\qquad\left[\delta\hat{A},\delta\hat{B}\right]=\left[\hat{A},\hat{B}\right]\]

similarly seen by expanding the left-hand side.
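Both identities can be checked by writing out the products. The sketch below uses only the definitions of the commutator, the anticommutator and \(\delta\hat{A}\), together with the fact that \(\langle\hat{A}\rangle\) and \(\langle\hat{B}\rangle\) are numbers and so commute with every operator:

\[\begin{aligned} \frac{1}{2}\left\{ \delta\hat{A},\delta\hat{B}\right\} +\frac{1}{2}\left[\delta\hat{A},\delta\hat{B}\right] & =\frac{1}{2}\left(\delta\hat{A}\delta\hat{B}+\delta\hat{B}\delta\hat{A}\right)+\frac{1}{2}\left(\delta\hat{A}\delta\hat{B}-\delta\hat{B}\delta\hat{A}\right)\\ & =\delta\hat{A}\delta\hat{B},\\ \left[\delta\hat{A},\delta\hat{B}\right] & =\left(\hat{A}-\langle\hat{A}\rangle\right)\left(\hat{B}-\langle\hat{B}\rangle\right)-\left(\hat{B}-\langle\hat{B}\rangle\right)\left(\hat{A}-\langle\hat{A}\rangle\right)\\ & =\hat{A}\hat{B}-\hat{B}\hat{A}\\ & =\left[\hat{A},\hat{B}\right]. \end{aligned}\]

In the second identity the terms linear in \(\langle\hat{A}\rangle\) and \(\langle\hat{B}\rangle\), and the product \(\langle\hat{A}\rangle\langle\hat{B}\rangle\), appear in both products and cancel.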

\[\left(iii\right)\qquad\left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}=\left\{ \delta\hat{A},\delta\hat{B}\right\}\]

i.e. the anticommutator is Hermitian, and

\[\left(iv\right)\qquad\left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger}=-\left[\delta\hat{A},\delta\hat{B}\right]\]

i.e. the commutator is antihermitian.
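Results \(\left(iii\right)\) and \(\left(iv\right)\) can be verified directly. As a sketch, this assumes that \(\hat{B}\) is Hermitian as well as \(\hat{A}\), so that \(\delta\hat{A}\) and \(\delta\hat{B}\) are both Hermitian (expectation values of Hermitian operators are real), and it uses \(\left(\hat{X}\hat{Y}\right)^{\dagger}=\hat{Y}^{\dagger}\hat{X}^{\dagger}\):

\[\begin{aligned} \left(\delta\hat{A}\delta\hat{B}\pm\delta\hat{B}\delta\hat{A}\right)^{\dagger} & =\delta\hat{B}^{\dagger}\delta\hat{A}^{\dagger}\pm\delta\hat{A}^{\dagger}\delta\hat{B}^{\dagger}\\ & =\delta\hat{B}\delta\hat{A}\pm\delta\hat{A}\delta\hat{B}\\ & =\pm\left(\delta\hat{A}\delta\hat{B}\pm\delta\hat{B}\delta\hat{A}\right). \end{aligned}\]

The upper sign gives \(\left(iii\right)\) and the lower sign gives \(\left(iv\right)\).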

Now we establish two important corollaries.

From \(\left(iii\right)\) it follows that the expectation value of the anticommutator is always purely real:

\[\begin{aligned} \left\{ \delta\hat{A},\delta\hat{B}\right\} & =\left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}\\ \left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle & =\left\langle \left(\left\{ \delta\hat{A},\delta\hat{B}\right\} \right)^{\dagger}\right\rangle \\ \left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle & =\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle ^{*} \end{aligned}\]

which gives us

\[\left(v\right)\qquad\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \in\mathbb{R}.\]

Similarly, from \(\left(iv\right)\), the expectation value of the commutator is always purely imaginary:

\[\left(vi\right)\qquad\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle \in i\mathbb{R}.\]
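The step from \(\left(iv\right)\) to \(\left(vi\right)\) mirrors the one shown for the anticommutator. As a sketch, using \(\langle\hat{X}^{\dagger}\rangle=\langle\hat{X}\rangle^{*}\) as before:

\[\begin{aligned} \left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle ^{*} & =\left\langle \left(\left[\delta\hat{A},\delta\hat{B}\right]\right)^{\dagger}\right\rangle \\ & =-\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle . \end{aligned}\]

A complex number equal to minus its own conjugate has zero real part, which is the statement \(\left(vi\right)\).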

D. Carry out the proof

From the definitions in B we have

\[\begin{aligned} \langle a|a\rangle & =\langle\psi|\delta\hat{A}\delta\hat{A}|\psi\rangle\\ & =\langle\psi|\left(\delta\hat{A}\right)^{2}|\psi\rangle\\ & =\left(\Delta A\right)^{2} \end{aligned}\]

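where the first line uses the Hermiticity of \(\delta\hat{A}\) and the last line follows from Equation 6.1. In the same way, \(\langle b|b\rangle=\left(\Delta B\right)^{2}\) and \(\langle a|b\rangle=\langle\psi|\delta\hat{A}\delta\hat{B}|\psi\rangle\).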

Now invoke CS:

\[\begin{aligned} \left\Vert |a\rangle\right\Vert ^{2}\left\Vert |b\rangle\right\Vert ^{2} & \ge\left|\langle a|b\rangle\right|^{2}\\ & \downarrow\\ \langle a|a\rangle\langle b|b\rangle & \ge\left|\langle a|b\rangle\right|^{2}\\ & \downarrow\\ \left(\Delta A\right)^{2}\left(\Delta B\right)^{2} & \ge\left|\langle\psi|\delta\hat{A}\delta\hat{B}|\psi\rangle\right|^{2}. \end{aligned}\]

Using \(\left(i\right)\) and \(\left(ii\right)\):

\[\left(i\right)\implies\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle +\left\langle \left[\delta\hat{A},\delta\hat{B}\right]\right\rangle \right|^{2}\]

and

\[\left(ii\right)\implies\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle +\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]

But \(\left(v\right)\) and \(\left(vi\right)\) tell us that the two terms within the modulus are purely real, and purely imaginary, respectively. Recalling that

\[\left|x+iy\right|^{2}=x^{2}+y^{2},\quad x,y\in\mathbb{R}\]

leads us to conclude that

\[\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \right|^{2}+\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]

But it is also the case that

\[x^{2}+y^{2}\ge y^{2}\]

and so

\[\frac{1}{4}\left|\left\langle \left\{ \delta\hat{A},\delta\hat{B}\right\} \right\rangle \right|^{2}+\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}\ge\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]

Hence,

\[\left(\Delta A\right)^{2}\left(\Delta B\right)^{2}\ge\frac{1}{4}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|^{2}.\]

Taking the square root of both sides (both are non-negative) gives the final result, the Heisenberg uncertainty principle:

\[\Delta A\Delta B\ge\frac{1}{2}\left|\left\langle \left[\hat{A},\hat{B}\right]\right\rangle \right|.\]
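The bound can be illustrated with a concrete two-level example. The sketch below is an assumption-laden check, not part of the derivation: it takes \(\hat{A}\) and \(\hat{B}\) to be the Pauli matrices \(\sigma_{x}\) and \(\sigma_{y}\), picks an arbitrary normalised state, and compares the two sides. The helper names `matmul`, `expectation` and `uncertainty` are illustrative.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(A, psi):
    """<psi|A|psi> for a normalised state psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * A[i][j] * psi[j]
               for i in range(n) for j in range(n))

def uncertainty(A, psi):
    """Standard deviation sqrt(<A^2> - <A>^2) of a Hermitian A."""
    mean = expectation(A, psi).real
    mean_sq = expectation(matmul(A, A), psi).real
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))  # clamp rounding noise

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

# Arbitrary normalised state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>.
theta, phi = 1.0, 0.3
psi = [complex(math.cos(theta / 2)),
       cmath.exp(1j * phi) * math.sin(theta / 2)]

AB = matmul(sigma_x, sigma_y)
BA = matmul(sigma_y, sigma_x)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

lhs = uncertainty(sigma_x, psi) * uncertainty(sigma_y, psi)
rhs = 0.5 * abs(expectation(commutator, psi))
print(lhs, rhs, lhs >= rhs - 1e-12)
```

Here \(\left[\sigma_{x},\sigma_{y}\right]=2i\sigma_{z}\), so the right-hand side evaluates to \(\left|\cos\theta\right|\) for this state, and the product of uncertainties is at least that large.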

6.2 The Triangle Inequality

Using the axioms of inner product spaces we proved CS. In turn, assuming CS we can prove [N4], the triangle inequality, for the norm induced by the inner product.

Proof:

Consider

\[\begin{aligned} \left\Vert |x\rangle+|y\rangle\right\Vert ^{2} & =\left(\langle x|+\langle y|\right)\left(|x\rangle+|y\rangle\right)\\ & =\langle x|x\rangle+\langle y|y\rangle+\langle x|y\rangle+\langle y|x\rangle\\ & =\langle x|x\rangle+\langle y|y\rangle+2\mathfrak{Re}\left(\langle x|y\rangle\right). \end{aligned}\]

Note that \(\forall z\in\mathbb{C}\), \(\left|z\right|\ge\mathfrak{Re}\left(z\right)\): writing \(z=u+iv\) with \(u,v\in\mathbb{R}\), we have \(\left|z\right|=\sqrt{u^{2}+v^{2}}\ge\left|u\right|\ge u\). Therefore

\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\langle x|x\rangle+\langle y|y\rangle+2\left|\langle x|y\rangle\right|.\]

Now use CS, which says that

\[\left|\langle x|y\rangle\right|\le\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert\]

to give

\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\langle x|x\rangle+\langle y|y\rangle+2\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert .\]

Therefore

\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\left\Vert |x\rangle\right\Vert ^{2}+\left\Vert |y\rangle\right\Vert ^{2}+2\left\Vert |x\rangle\right\Vert \left\Vert |y\rangle\right\Vert\]

and so

\[\left\Vert |x\rangle+|y\rangle\right\Vert ^{2}\le\left(\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert \right)^{2}\]

or, taking the square root of both (non-negative) sides,

\[\left\Vert |x\rangle+|y\rangle\right\Vert \le\left\Vert |x\rangle\right\Vert +\left\Vert |y\rangle\right\Vert\]

which is the triangle inequality, [N4].
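As with CS, the result can be spot-checked numerically. This is a minimal sketch, assuming kets in \(\mathbb{C}^{3}\) with the norm induced by the standard inner product; `norm`, `add` and `random_ket` are illustrative helper names. It also checks one case where the bound is saturated, namely two parallel kets.

```python
import math
import random

def norm(v):
    """Norm induced by the standard inner product on C^n."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))

def add(x, y):
    """Componentwise sum of two kets."""
    return [xi + yi for xi, yi in zip(x, y)]

rng = random.Random(1)  # fixed seed so the run is reproducible

def random_ket(dim):
    """A ket in C^dim with Gaussian real and imaginary parts."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

violations = 0
for _ in range(1000):
    x, y = random_ket(3), random_ket(3)
    if norm(add(x, y)) > norm(x) + norm(y) + 1e-12:
        violations += 1

# Parallel kets saturate the bound: ||x + 2x|| = ||x|| + ||2x||.
x = random_ket(3)
y = [2 * xi for xi in x]
gap = norm(x) + norm(y) - norm(add(x, y))
print(violations, abs(gap) < 1e-9)
```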