17  Sturm Liouville Theory

It is natural to wonder why so many different orthogonal polynomials exist, with such similar underlying properties. The fact that they live in Hilbert spaces is a characterisation of this similarity, rather than a proof that it must be the case. The underlying connection is provided by Sturm Liouville theory. It turns out that all the orthogonal polynomials we have considered are special cases of this more general form.

17.1 The Regular Sturm Liouville Problem statement

Given functions \(p\left(x\right)\), \(q\left(x\right)\), the weight function \(w\left(x\right)\), a real domain \(x\in\left[a,b\right]\), and ‘separated’ boundary conditions, the regular Sturm Liouville problem is the second order linear ODE

\[\begin{aligned} \left(p\left(x\right)y'\left(x\right)\right)'+q\left(x\right)y\left(x\right) & =-\lambda w\left(x\right)y\left(x\right),\quad x\in\left[a,b\right]\\ c_{1}y\left(a\right)+c_{2}y'\left(a\right) & =0\quad\text{(boundary condition 1)}\\ d_{1}y\left(b\right)+d_{2}y'\left(b\right) & =0\quad\text{(boundary condition 2)} \end{aligned} \tag{17.1}\]

where:

(i)

\[\begin{aligned} \left(c_{1},c_{2}\right) & \ne\left(0,0\right)\\ \left(d_{1},d_{2}\right) & \ne\left(0,0\right) \end{aligned}\]

(ii) \(p\), \(p'\), \(q\), and \(w\) are real-valued and continuous on \(\left[a,b\right]\)

(iii) \(p\) and \(w\) are strictly positive on \(\left[a,b\right]\).

Together, these conditions guarantee that solutions to Equation 17.1 exist (according to the standard theory of linear ODEs).

A solution to the regular Sturm Liouville problem is defined to be a non-trivial, continuously differentiable function \(y\left(x\right)\) which solves Equation 17.1 on the interval \(\left(a,b\right)\), and which matches the boundary conditions. Such solutions are called eigenfunctions. These will only exist for certain values of \(\lambda\), which are called the eigenvalues.

It can be shown that the Hermite, Legendre, and Laguerre polynomials are special cases of the Sturm Liouville problem, as is Fourier theory.
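As a sketch of what this looks like for the polynomial families (quoting the standard differential equation of each family, in the physicists' convention for the Hermite polynomials, rather than anything derived in this chapter), each equation can be put in the form of Equation 17.1 with \(q\left(x\right)=0\):

\[\begin{aligned} \left(\left(1-x^{2}\right)P_{l}'\left(x\right)\right)' & =-l\left(l+1\right)P_{l}\left(x\right),\quad x\in\left[-1,1\right]\quad\text{(Legendre)}\\ \left(e^{-x^{2}}H_{n}'\left(x\right)\right)' & =-2n\,e^{-x^{2}}H_{n}\left(x\right),\quad x\in\left(-\infty,\infty\right)\quad\text{(Hermite)}\\ \left(xe^{-x}L_{n}'\left(x\right)\right)' & =-n\,e^{-x}L_{n}\left(x\right),\quad x\in\left[0,\infty\right)\quad\text{(Laguerre)} \end{aligned}\]

so that \(p\), \(w\), and \(\lambda\) can be read off by comparison: for example \(p=1-x^{2}\), \(w=1\), \(\lambda=l\left(l+1\right)\) for Legendre. Strictly speaking these are usually classed as singular rather than regular problems, because \(p\) vanishes at an endpoint or the interval is infinite; the separated boundary conditions are then replaced by regularity requirements on the solutions.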

17.2 Example: simple harmonic oscillator

Consider the simple harmonic oscillator

\[\begin{aligned} y''+\lambda y & =0,\quad x\in\left[0,\pi\right]\\ y\left(0\right) & =0\\ y\left(\pi\right) & =0. \end{aligned}\]

The solutions satisfying the boundary conditions are

\[y_{n}\left(x\right)=A\sin\left(nx\right),\quad n\in\mathbb{N}\]

with the eigenvalues

\[\lambda_{n}=n^{2}.\]
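As a numerical sanity check on these eigenvalues, the sketch below discretises \(-y''=\lambda y\) with a central finite difference and diagonalises the resulting matrix. The function name `sho_eigenvalues` and the grid size are illustrative choices, not part of the theory, and the lowest eigenvalues should only be expected to agree with \(n^{2}\) up to the discretisation error.

```python
import numpy as np

def sho_eigenvalues(num_interior=500):
    """Approximate the eigenvalues of -y'' = lambda * y on [0, pi], y(0) = y(pi) = 0."""
    h = np.pi / (num_interior + 1)  # grid spacing
    # Second-order central difference for -d^2/dx^2 on the interior points;
    # the Dirichlet conditions are imposed by leaving out the two boundary nodes.
    main = 2.0 * np.ones(num_interior)
    off = -1.0 * np.ones(num_interior - 1)
    matrix = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    # The matrix is real symmetric, so eigvalsh returns real, ascending eigenvalues.
    return np.linalg.eigvalsh(matrix)

eigenvalues = sho_eigenvalues()
print(eigenvalues[:4])  # close to 1, 4, 9, 16
```

The matrix is symmetric, which is the discrete counterpart of the Hermiticity discussed in Section 17.4.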

17.3 Properties of Sturm Liouville solutions

17.3.1 Real Eigenvalues

The eigenvalues of the regular Sturm Liouville problem are real:

\[\lambda_{n}\in\mathbb{R}.\]

17.3.2 Distinct Eigenvalues

The eigenvalues of the regular Sturm Liouville problem are non-degenerate:

\[\lambda_{1}<\lambda_{2}<\lambda_{3}<\ldots\]

17.3.3 Infinite number of eigenvalues

There exists a countably infinite number of eigenvalues, which can therefore be labelled by \(n\in\mathbb{N}\).

17.3.4 Infinite extent of eigenvalues

The eigenvalues are unbounded above, so there is no largest eigenvalue:

\[\lim_{n\rightarrow\infty}\lambda_{n}=\infty.\]

17.3.5 Uniqueness of (normalised) Eigenfunctions

The eigenfunction \(y_{n}\left(x\right)\) corresponding to eigenvalue \(\lambda_{n}\) is unique, up to a multiplicative constant. Equivalently, the normalised eigenfunctions are unique.

17.3.6 Number of zeroes

\(y_{n}\left(x\right)\) has precisely \(n-1\) zeroes in the open interval \(\left(a,b\right)\).
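For the oscillator of Section 17.2 this can be checked directly: \(\sin\left(nx\right)\) should change sign \(n-1\) times strictly inside \(\left(0,\pi\right)\), the endpoints (where the boundary conditions force \(y=0\)) being excluded from the count. The sketch below counts sign changes on a grid; the helper name `count_interior_zeros` and the grid size are illustrative assumptions.

```python
import numpy as np

def count_interior_zeros(n, num_points=1000):
    """Count the sign changes of sin(n x) strictly inside (0, pi)."""
    # Grid points are j * pi / 1001 for j = 1..1000. Since 1001 = 7 * 11 * 13,
    # none of them coincides with a zero k * pi / n when n <= 6, so every
    # zero shows up as a clean sign change between neighbouring samples.
    x = np.linspace(0.0, np.pi, num_points + 2)[1:-1]
    signs = np.sign(np.sin(n * x))
    return int(np.sum(signs[:-1] * signs[1:] < 0))

zero_counts = [count_interior_zeros(n) for n in range(1, 7)]
print(zero_counts)  # [0, 1, 2, 3, 4, 5]
```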

17.3.7 Orthogonal eigenfunctions

Distinct eigenfunctions are orthogonal:

\[\langle y_{n}|y_{m}\rangle=\int^{b}_{a}y^{*}_{n}\left(x\right)y_{m}\left(x\right)w\left(x\right)\text{d}x=0\quad\textrm{if }n\ne m.\]

17.3.8 Completeness

The normalised eigenfunctions \(y_{n}\left(x\right)\) form a complete orthonormal basis for the Hilbert space \(\mathcal{L}^{2}\left(\left[a,b\right],w\left(x\right)\text{d}x\right)\):

\[\langle y_{n}|y_{m}\rangle=\int^{b}_{a}y^{*}_{n}\left(x\right)y_{m}\left(x\right)w\left(x\right)\text{d}x=\delta_{nm}.\]

Equivalently, given any function \(c\left(x\right)\) in this space (that is, any function which is square-integrable on the interval with respect to the weight \(w\)), we can write

\[c\left(x\right)=\sum^{\infty}_{n=1}c_{n}y_{n}\left(x\right) \tag{17.2}\]

where

\[c_{n}=\frac{\langle y_{n}|c\rangle}{\langle y_{n}|y_{n}\rangle}=\frac{\int^{b}_{a}y^{*}_{n}\left(x\right)c\left(x\right)w\left(x\right)\text{d}x}{\int^{b}_{a}y^{*}_{n}\left(x\right)y_{n}\left(x\right)w\left(x\right)\text{d}x}.\]
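A minimal numerical sketch of orthogonality and of this coefficient formula, again for the oscillator of Section 17.2 (so \(w=1\) and \(y_{n}=\sin\left(nx\right)\) on \(\left[0,\pi\right]\)). The test function \(c\left(x\right)=x\left(\pi-x\right)\), the trapezoidal helper `inner`, and the truncation at 50 terms are all illustrative assumptions.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
h = x[1] - x[0]

def inner(u, v):
    """Trapezoidal approximation to <u|v> on [0, pi] with weight w(x) = 1."""
    integrand = np.conj(u) * v
    return h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

def y(n):
    """Unnormalised eigenfunction sin(n x) sampled on the grid."""
    return np.sin(n * x)

# Orthogonality: the Gram matrix of the first few eigenfunctions is diagonal,
# with <y_n|y_n> = pi / 2 on the diagonal.
gram = np.array([[inner(y(n), y(m)) for m in range(1, 5)] for n in range(1, 5)])

# Expansion of c(x) = x (pi - x): c_n = <y_n|c> / <y_n|y_n>.
c = x * (np.pi - x)
coeffs = np.array([inner(y(n), c) / inner(y(n), y(n)) for n in range(1, 51)])
partial_sum = sum(cn * y(n) for n, cn in enumerate(coeffs, start=1))

print(np.round(gram, 6))
print(np.max(np.abs(partial_sum - c)))  # small truncation error
```

Because the eigenfunctions here are not normalised, the denominator \(\langle y_{n}|y_{n}\rangle\) is kept explicitly. For this particular \(c\left(x\right)\) the coefficients with even \(n\) vanish and those with odd \(n\) equal \(8/\left(\pi n^{3}\right)\), which the computed `coeffs` reproduce.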

The discrete sum in Equation 17.2 is at first sight remarkable, as nothing about the setup of the Sturm Liouville problem appeared to involve discreteness. At this point in the course it is probably expected! It derives from the constraints of the boundary conditions.

17.4 The Sturm Liouville Operator

One way to understand these properties is to notice that the Sturm Liouville problem can be written in terms of a Hermitian operator.

Define the differential operator

\[\hat{L}\triangleq-\frac{1}{w\left(x\right)}\left(\frac{\text{d}}{\text{d}x}\left(p\left(x\right)\frac{\text{d}}{\text{d}x}\right)+q\left(x\right)\right).\]

Then the Sturm Liouville problem reduces to

\[\hat{L}y_{n}\left(x\right)=\lambda_{n}y_{n}\left(x\right)\]

(with appropriate boundary conditions, and other conditions as specified in Section 17.1). Various properties outlined in Section 17.3 then follow immediately from the fact that \(\hat{L}\) is Hermitian: for example, that the eigenvalues are real, and that eigenfunctions with distinct eigenvalues are orthogonal.
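To spell out that last step, here is a sketch of the standard argument, using only the Hermiticity condition \(\langle\hat{L}f|g\rangle=\langle f|\hat{L}g\rangle\) and the eigenvalue equation. Taking \(f=y_{n}\) and \(g=y_{m}\):

\[\begin{aligned} \lambda^{*}_{n}\langle y_{n}|y_{m}\rangle=\langle\hat{L}y_{n}|y_{m}\rangle & =\langle y_{n}|\hat{L}y_{m}\rangle=\lambda_{m}\langle y_{n}|y_{m}\rangle\\ \Rightarrow\quad\left(\lambda^{*}_{n}-\lambda_{m}\right)\langle y_{n}|y_{m}\rangle & =0. \end{aligned}\]

Setting \(m=n\) and using \(\langle y_{n}|y_{n}\rangle>0\) gives \(\lambda^{*}_{n}=\lambda_{n}\), so the eigenvalues are real. With that established, \(\lambda_{n}\ne\lambda_{m}\) forces \(\langle y_{n}|y_{m}\rangle=0\), which is the orthogonality property.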

17.5 Proof of Hermiticity of the Sturm Liouville Operator \(\hat{L}\)

We are required to show that

\[\left(\hat{L}|f\rangle\right)^{\dagger}|g\rangle=\langle f|\left(\hat{L}|g\rangle\right)\]

or equivalently

\[\int^{b}_{a}\left(\hat{L}f\left(x\right)\right)^{*}g\left(x\right)w\left(x\right)\text{d}x=\int^{b}_{a}f^{*}\left(x\right)\left(\hat{L}g\left(x\right)\right)w\left(x\right)\text{d}x.\]

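For simplicity, let’s consider the case \(w\left(x\right)=1\). (In fact this is without loss of generality: the factor \(1/w\left(x\right)\) in the definition of \(\hat{L}\) cancels the weight \(w\left(x\right)\) in the inner product, so the integrals below are unchanged for general \(w\left(x\right)\).)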

Inserting the definition of \(\hat{L}\) gives

\[\begin{aligned} \int^{b}_{a}\left(\hat{L}f\left(x\right)\right)^{*}g\left(x\right)\text{d}x= & \int^{b}_{a}\left(-\left(\frac{\text{d}}{\text{d}x}\left[p\left(x\right)f'\left(x\right)\right]+q\left(x\right)f\left(x\right)\right)\right)^{*}g\left(x\right)\text{d}x\\ = & -\int^{b}_{a}\left(\frac{\text{d}}{\text{d}x}\left[p\left(x\right)f'^{*}\left(x\right)\right]+q\left(x\right)f^{*}\left(x\right)\right)g\left(x\right)\text{d}x\\ = & -\int^{b}_{a}\frac{\text{d}}{\text{d}x}\left[p\left(x\right)f'^{*}\left(x\right)\right]g\left(x\right)\text{d}x-\int^{b}_{a}q\left(x\right)f^{*}\left(x\right)g\left(x\right)\text{d}x \end{aligned} \tag{17.3}\]

where we used the fact that \(p\) and \(q\) are specified to be real on the interval. Now integrate the first term of Equation 17.3 by parts:

\[\int^{b}_{a}\left(\hat{L}f\left(x\right)\right)^{*}g\left(x\right)\text{d}x=-\left[p\left(x\right)f'^{*}\left(x\right)g\left(x\right)\right]^{b}_{a}+\int^{b}_{a}p\left(x\right)f'^{*}\left(x\right)g'\left(x\right)\text{d}x-\int^{b}_{a}q\left(x\right)f^{*}\left(x\right)g\left(x\right)\text{d}x\]

and integrate the second term (the one containing \(f'^{*}g'\)) by parts once more:

\[\int^{b}_{a}\left(\hat{L}f\left(x\right)\right)^{*}g\left(x\right)\text{d}x=-\left[p\left(x\right)f'^{*}\left(x\right)g\left(x\right)\right]^{b}_{a}+\left[p\left(x\right)f^{*}\left(x\right)g'\left(x\right)\right]^{b}_{a}-\int^{b}_{a}f^{*}\left(x\right)\frac{\text{d}}{\text{d}x}\left[p\left(x\right)g'\left(x\right)\right]\text{d}x-\int^{b}_{a}q\left(x\right)f^{*}\left(x\right)g\left(x\right)\text{d}x\]

and combine terms:

\[\begin{aligned} \int^{b}_{a}\left(\hat{L}f\left(x\right)\right)^{*}g\left(x\right)\text{d}x & =\left[p\left(x\right)\left(f^{*}\left(x\right)g'\left(x\right)-f'^{*}\left(x\right)g\left(x\right)\right)\right]^{b}_{a}-\int^{b}_{a}f{}^{*}\left(x\right)\left(\frac{\text{d}}{\text{d}x}\left[p\left(x\right)g'\left(x\right)\right]+q\left(x\right)g\left(x\right)\right)\text{d}x\\ & =\left[p\left(x\right)\left(f^{*}\left(x\right)g'\left(x\right)-f'^{*}\left(x\right)g\left(x\right)\right)\right]^{b}_{a}+\int^{b}_{a}f{}^{*}\left(x\right)\left(\hat{L}g\left(x\right)\right)\text{d}x. \end{aligned}\]

Therefore the operator \(\hat{L}\) is Hermitian provided the boundary term vanishes:

\[\left[p\left(x\right)\left(f^{*}\left(x\right)g'\left(x\right)-f'^{*}\left(x\right)g\left(x\right)\right)\right]^{b}_{a}=0.\]

This can be guaranteed by mandating that all functions under consideration obey the separated boundary conditions

\[\begin{aligned} c_{1}y\left(a\right)+c_{2}y'\left(a\right) & =0\\ d_{1}y\left(b\right)+d_{2}y'\left(b\right) & =0 \end{aligned} \tag{17.4}\]

where

\[\begin{aligned} \left(c_{1},c_{2}\right) & \ne\left(0,0\right)\\ \left(d_{1},d_{2}\right) & \ne\left(0,0\right). \end{aligned}\]

To check this, suppose first that \(c_{2}\ne0\) and \(d_{2}\ne0\). Then Equation 17.4 can be rewritten

\[\begin{aligned} y'\left(a\right) & =-\frac{c_{1}}{c_{2}}y\left(a\right)\\ y'\left(b\right) & =-\frac{d_{1}}{d_{2}}y\left(b\right). \end{aligned}\]

Taking the coefficients to be real, so that \(f^{*}\) obeys the same conditions as \(f\), the boundary term becomes

\[\begin{aligned} & \left[p\left(b\right)\left(f^{*}\left(b\right)g'\left(b\right)-f'^{*}\left(b\right)g\left(b\right)\right)\right]-\left[p\left(a\right)\left(f^{*}\left(a\right)g'\left(a\right)-f'^{*}\left(a\right)g\left(a\right)\right)\right]\\ = & \left[\frac{d_{1}}{d_{2}}p\left(b\right)\left(-f^{*}\left(b\right)g\left(b\right)+f^{*}\left(b\right)g\left(b\right)\right)\right]-\left[\frac{c_{1}}{c_{2}}p\left(a\right)\left(-f^{*}\left(a\right)g\left(a\right)+f^{*}\left(a\right)g\left(a\right)\right)\right] \end{aligned}\]

and so the contribution from each boundary vanishes separately. If instead \(c_{2}=0\), then \(c_{1}\ne0\) (the two coefficients cannot vanish simultaneously), so \(f\left(a\right)=g\left(a\right)=0\) and the contribution at \(a\) vanishes directly; likewise at \(b\) if \(d_{2}=0\).
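The role of the boundary term can be illustrated with polynomials, for which every integral is exact. In the sketch below the choices \(p\left(x\right)=1+x\), \(q\left(x\right)=x\), \(w\left(x\right)=1\) on \(\left[0,1\right]\), the test functions, and the helper names are all assumptions made purely for illustration. The functions `f` and `g` vanish at both ends (the separated conditions with \(c_{2}=d_{2}=0\)), while `h` does not satisfy them.

```python
from numpy.polynomial import Polynomial

# Illustrative choices (not from the text): p(x) = 1 + x, q(x) = x, w(x) = 1 on [0, 1].
a, b = 0.0, 1.0
p = Polynomial([1.0, 1.0])
q = Polynomial([0.0, 1.0])

def L(y):
    """Apply the Sturm Liouville operator with w = 1: L y = -((p y')' + q y)."""
    return -((p * y.deriv()).deriv() + q * y)

def integrate(poly):
    """Exact definite integral of a polynomial over [a, b]."""
    antiderivative = poly.integ()
    return antiderivative(b) - antiderivative(a)

def boundary_term(f, g):
    """[p (f g' - f' g)] between a and b (real functions, so f* = f)."""
    expr = p * (f * g.deriv() - f.deriv() * g)
    return expr(b) - expr(a)

f = Polynomial([0.0, 1.0, -1.0])        # x (1 - x): vanishes at both ends
g = Polynomial([0.0, 0.0, 1.0, -1.0])   # x^2 (1 - x): vanishes at both ends
h = Polynomial([1.0, 1.0])              # 1 + x: violates the boundary conditions

lhs, rhs = integrate(L(f) * g), integrate(f * L(g))
lhs_bad, rhs_bad = integrate(L(f) * h), integrate(f * L(h))
print(lhs - rhs)                               # zero up to rounding
print(lhs_bad - rhs_bad, boundary_term(f, h))  # equal and non-zero
```

When one of the functions violates the boundary conditions, the two integrals differ by exactly the boundary term, as in the combined expression above.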

If \(p\left(a\right)=p\left(b\right)\), one can instead impose periodic boundary conditions, under which the contributions from the two endpoints cancel one another:

\[\begin{aligned} f\left(a\right) & =f\left(b\right)\\ f'\left(a\right) & =f'\left(b\right). \end{aligned}\]