Someone asked me on Twitter
Is there a trick to make a singular (non-invertible) matrix invertible?
The only response I could think of in less than 140 characters was
Depends on what you’re trying to accomplish.
Here I’ll give a longer explanation.
So, can you change a singular matrix just a little to make it non-singular? Yes, and in fact you can change it by as little as you’d like. Singular square matrices are an infinitely thin subset of the space of all square matrices, and any tiny random perturbation is almost certain to take you out of that thin subset. (“Almost certain” is actually a technical term here, meaning with probability 1.)
Adding a tiny bit of noise to a singular matrix makes it non-singular. But why did you want a non-singular matrix? Presumably to solve some system of equations. A little noise makes the system go from theoretically impossible to solve to theoretically possible to solve, but that may not be very useful in practice. It will let you compute a solution, but that solution may be meaningless.
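To make this concrete, here is a minimal NumPy sketch with an arbitrary 2×2 singular matrix: the noise makes the determinant nonzero, but the computed solution is enormous and driven entirely by the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary singular matrix: the second row is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(A))           # zero: the matrix is singular

# A tiny random perturbation almost surely makes it non-singular.
A_eps = A + 1e-12 * rng.standard_normal((2, 2))
print(np.linalg.det(A_eps))       # tiny but nonzero

# The perturbed system is solvable, but the "solution" has huge,
# noise-dependent entries: numerically meaningless.
b = np.array([1.0, 1.0])          # inconsistent with the exact singular A
print(np.linalg.solve(A_eps, b))  # entries on the order of 1e11 or more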
A little noise makes the determinant go from zero to near zero. But here is an important rule in numerical computation:
If something is impossible at 0, it will be hard near zero.
“Hard” could mean difficult to do or it could mean the solution is ill-behaved.
Instead of asking whether a matrix is invertible, let’s ask how easy it is to invert. That is, let’s change a yes/no question into a question with a continuum of answers. The condition number of a matrix measures how easy the matrix is to invert. A matrix that is easy to invert has a small condition number. The harder it is to invert a matrix, the larger its condition number. A singular matrix is infinitely hard to invert, and so it has infinite condition number. A small perturbation of a singular matrix is non-singular, but the condition number will be large.
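For example, NumPy's np.linalg.cond makes this continuum visible, here applied to an arbitrary singular example and its perturbation:

```python
import numpy as np

I2 = np.eye(2)
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])              # singular
A_eps = A + 1e-12 * np.eye(2)           # tiny perturbation

print(np.linalg.cond(I2))     # 1.0, as easy to invert as it gets
print(np.linalg.cond(A))      # inf (or ~1e16, depending on rounding)
print(np.linalg.cond(A_eps))  # roughly 5e12: invertible, but ill-conditioned
```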
So what exactly is a condition number? And what do I mean by saying a matrix is “hard” to invert?
The condition number of a matrix is the norm of the matrix times the norm of its inverse. Defining matrix norms would take too long to go into here, but intuitively it is a way of sizing up a matrix. Note that you take the norm of both the matrix and its inverse. A small change to a matrix might not change its norm much, but it might change the norm of its inverse a great deal. If that is the case, the matrix is called ill-conditioned because it has a large condition number.
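Here is a small sketch of that definition, using the 2-norm and a made-up pair of matrices: both matrices have the same norm, but the norm of the second one's inverse is enormous.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1e-10]])   # a small change to a single entry

for M in (A, B):
    norm_M = np.linalg.norm(M, 2)
    norm_Minv = np.linalg.norm(np.linalg.inv(M), 2)
    # The product of the two norms is the condition number.
    print(norm_M, norm_Minv, norm_M * norm_Minv)

# Both matrices have norm 1, but the norm of B's inverse is 1e10,
# so B's condition number is 1e10: B is ill-conditioned.
```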
When I say a matrix is hard to invert, I mean it is hard to do accurately. You can use the same number of steps to invert a matrix by Gaussian elimination whether the matrix has a small condition number or a large condition number. But if the matrix has a large condition number, the result may be very inaccurate. If the condition number is large enough, the solution may be so inaccurate as to be completely useless.
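A classic illustration is the Hilbert matrix. The sketch below (outputs are approximate) solves a 12×12 Hilbert system whose exact solution is all ones; elimination runs the same steps as for any 12×12 system, yet little accuracy survives.

```python
import numpy as np

# Hilbert matrices are a classic family of ill-conditioned matrices.
n = 12
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
print(np.linalg.cond(H))   # around 1e16, at the edge of double precision

x_true = np.ones(n)
b = H @ x_true
x = np.linalg.solve(H, b)  # the same elimination steps as any 12x12 solve

# The computed solution should be all ones, but only a digit or two
# of accuracy (if that) survives the ill-conditioning.
print(np.max(np.abs(x - x_true)))
```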
You can think of condition number as an error multiplier. When you solve the linear system Ax = b, you might expect that a little error in b would result in only a little error in x. And that’s true if A is well-conditioned. But a small change in b could result in a large change in x if A is ill-conditioned. If the condition number of A is 1000, for example, the relative error in x can be up to 1000 times the relative error in b. So you could lose as many as three significant figures in going from b to x.
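A sketch of the error-multiplier effect, using a smaller Hilbert matrix as an arbitrary ill-conditioned example:

```python
import numpy as np

rng = np.random.default_rng(1)

# An 8x8 Hilbert matrix with condition number around 1.5e10.
n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
print(np.linalg.cond(H))

x_true = np.ones(n)
b = H @ x_true

# Perturb b by a relative error of about 1e-10 ...
db = 1e-10 * np.linalg.norm(b) * rng.standard_normal(n)
x = np.linalg.solve(H, b + db)

# ... and the relative error in x can be amplified by up to cond(H),
# here landing on the order of 1: almost no significant figures left.
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```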
Note that condition number isn’t limited to loss of precision due to floating point arithmetic. If you have some error in your input b, say due to measurement error, then you will have some corresponding error in your solution to Ax = b, even if you solve the system Ax = b exactly. If the matrix A is ill-conditioned, any error in b (rounding error, measurement error, statistical uncertainty, etc.) will result in a correspondingly much larger error in x.
So back to the original question: can you make a singular matrix non-singular? Sure. In fact, it’s hard not to. Breathe on a singular matrix and it becomes non-singular. But the resulting non-singular matrix may be useless. The perturbed matrix may be theoretically invertible but practically non-invertible.
So here’s a more refined question: can you change a singular matrix into a useful non-singular matrix? Yes you can, sometimes, but the answer depends very much on the problem you’re trying to solve. This is generally known as regularization, though it goes by different names in different contexts. You use knowledge of your domain beyond the specific data at hand to guide how you change your matrix.
Another way to look at the condition number: for a symmetric (more generally, normal) matrix it is the ratio of the largest-magnitude eigenvalue to the smallest-magnitude eigenvalue (in general, the ratio of the largest to smallest singular value). For a singular matrix, the latter is zero: this means that a perturbation that increases the smallest-magnitude eigenvalue will reduce the condition number, yielding an equation system that’s easier to solve, per John’s discussion.
How can you change the smallest-magnitude eigenvalue without changing the other eigenvalues too much? Well, let us assume that the magnitudes of the eigenvalues of the matrix cover a range of multiple orders of magnitude; in other words, the largest-magnitude eigenvalue is many orders of magnitude larger than the lambda we are about to add. Applying a diagonal perturbation corresponding to lambda times the identity increases ALL eigenvalues by lambda. If the largest-magnitude eigenvalue is as large as assumed, then that eigenvalue is not significantly affected by the change; on the other hand, the smallest-magnitude eigenvalue has become lambda, so the condition number of the perturbed matrix is (largest eigenvalue / lambda). Now, the effect of this perturbation depends on the distribution of eigenvalues: if the majority of the eigenvalues are similar in size to the largest, the system is not much disturbed. On the other hand, if most eigenvalues are close to zero, the system is not quite what it was anymore.
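Here is a small NumPy sketch of that shift (the matrix is an arbitrary symmetric singular example):

```python
import numpy as np

# An arbitrary symmetric singular matrix with a wide eigenvalue range.
A = np.array([[4.0, 0.0, 0.0],
              [0.0, 1e-3, 0.0],
              [0.0, 0.0, 0.0]])
lam = 1e-6

print(np.linalg.eigvalsh(A))                    # [0, 1e-3, 4]
print(np.linalg.eigvalsh(A + lam * np.eye(3)))  # every eigenvalue shifted up by lam
print(np.linalg.cond(A + lam * np.eye(3)))      # about 4 / lam = 4e6, finite
```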
In the context of symmetric positive semi-definite matrices (those that can be written as the product of the transpose of some matrix with itself, $M^T M$), this multiple-of-the-identity perturbation is called Tikhonov regularization.
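A minimal sketch of Tikhonov regularization, assuming a made-up rank-deficient least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(2)

# A rank-deficient least-squares problem: column 2 of M duplicates
# column 1, so the normal-equations matrix M^T M is singular.
M = rng.standard_normal((20, 3))
M[:, 2] = M[:, 1]
b = rng.standard_normal(20)

MtM, Mtb = M.T @ M, M.T @ b
print(np.linalg.cond(MtM))                     # effectively infinite

# Tikhonov regularization: shift every eigenvalue of M^T M up by lam.
lam = 1e-6
x_reg = np.linalg.solve(MtM + lam * np.eye(3), Mtb)
print(x_reg)                                   # a modest, usable solution
print(np.linalg.cond(MtM + lam * np.eye(3)))   # large but finite
```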
Interesting. So if you could (let’s say with a theorem) perturb a singular matrix in such a way as to minimally perturb the norm of the inverse — would that be “good” in some way? Is that in fact possible?
Benoit, great comment.
Re: human mathematics
Your question actually raises an unbounded problem :-). The “minimal” perturbation of the matrix is zero. John’s point is that there are many ways to perturb a matrix to take it from singular to non-singular, but this is often not enough. You have to make it “sufficiently non-singular” to compute, at least, the product of its inverse with some right-hand-side vector; otherwise the computation becomes imprecise.
In other words, if you ask what $A^{-1} y$ is for singular $A$, it makes as much sense as asking what 5/0 is. Now, we can agree that we can nudge that zero by as little as you want to get a question that makes sense: 5/x, for non-zero x. The answer to that question then depends on x, and you can examine how the answer varies as you make x ever smaller. The same goes when dealing with a singular matrix: how does the result of $(A+P)^{-1} y$ vary as you make $P$ ever smaller? In terms of my eigenvalue analysis, how does $(A+\lambda I)^{-1} y$ vary as you make $\lambda$ ever smaller? Can you extrapolate something useful? That is as close as you can get to answering the original question about the singular matrix.
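One way to run that limiting experiment in NumPy (the symmetric singular $A$ and the vector $y$ are arbitrary; $y$ is chosen in the range of $A$ so that the limit exists):

```python
import numpy as np

# An arbitrary symmetric singular matrix, with y chosen in its range
# so that the limit below exists.
A = np.array([[2.0, 0.0],
              [0.0, 0.0]])
y = A @ np.array([3.0, 7.0])   # y = (6, 0)

for lam in (1e-2, 1e-4, 1e-6, 1e-8):
    x = np.linalg.solve(A + lam * np.eye(2), y)
    print(lam, x)              # converges to (3, 0) as lam shrinks

# The limit agrees with the pseudoinverse solution.
print(np.linalg.pinv(A) @ y)   # (3, 0)
```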
Great post.