If you take a square matrix M, subtract x from the elements on the diagonal, and take the determinant, you get a polynomial in x called the characteristic polynomial of M. For example, let
Then
The characteristic equation is the equation that sets the characteristic polynomial to zero. The roots of this polynomial are eigenvalues of the matrix.
The Cayley-Hamilton theorem says that if you take the original matrix and stick it into the polynomial, you’ll get the zero matrix.
In brief, a matrix satisfies its own characteristic equation. Note that for this to hold we interpret constants, like 12 and 0, as corresponding multiples of the identity matrix.
You could verify the Cayley-Hamilton theorem in Python using scipy.linalg.funm
to compute a polynomial function of a matrix.
>>> from scipy import array >>> from scipy.linalg import funm >>> m = array([[5, -2], [1, 2]]) >>> funm(m, lambda x: x**2 - 7*x + 12)
This returns a zero matrix.
I imagine funm
is factoring M into something like PDP−1 where D is a diagonal matrix. Then
f(M) = P f(D) P−1.
This is because f can be applied to a diagonal matrix by simply applying f to each diagonal entry independently. You could use this to prove the Cayley-Hamilton theorem for diagonalizable matrices.
Nice…
The dot product function seems to use integers when they can (you get zero exactly).
>>> np.dot(m,m)-7*m+12*np.eye(2,dtype=int)
If needed, one could rewrite a version of funm for polynomials using dot products.
“I imagine funm is factoring M into PDP^{-1} where D is a diagonal matrix”
A good guess. The documentation for funm (https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.funm.html) says it uses Schur decomposition, with a triangular matrix in place of your D. Further cleverness is required to efficiently apply f to the triangular matrix. The details are in Golub and Van Loan “Matrix Computations”, section 9.1.4 (4th ed).
Also possible: m@m – 7*m + 12*np.eye(2). The function evaluating a polynomial P = [a0, …, an] = a0 + a1*X + …+an*X**n for a matrix could be
def matpol(P,M):
Mk=np.eye(np.ndim(M));
S=P[0]*Mk
for k in range(1,len(P)):
Mk = Mk @ M
S+=P[k]*Mk
return S
With this, matpol([12,-7,1],m) yields the zero array.
oops, “np.ndim(M)” should rather be M.shape[0].