The transpose of a matrix turns the matrix sideways. Suppose A is an m × n matrix with real number entries. Then the transpose Aᵀ is an n × m matrix, and the (i, j) element of A is the (j, i) element of Aᵀ. Very concrete.
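For instance, here is a quick NumPy check of the definition; the particular matrix is just an arbitrary example.

```python
import numpy as np

# An arbitrary 2 × 3 example: the transpose is 3 × 2,
# and entry (i, j) of A is entry (j, i) of A.T.
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.T.shape)             # (3, 2)
print(A[0, 2] == A.T[2, 0])  # True
```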
The adjoint of a linear operator is a more abstract concept, though it’s closely related. The matrix Aᵀ is sometimes called the adjoint of A. That may be fine, or it may cause confusion. This post will define the adjoint in a more general context, then come back to the context of matrices.
This post and the next will be more abstract than usual. After indulging in a little pure math, I’ll return soon to more tangible topics such as Morse code and barbershop quartets.
Dual spaces
Before we can define adjoints, we need to define dual spaces.
Let V be a vector space over a field F. You can think of F as ℝ or ℂ. Then V* is the dual space of V, the space of linear functionals on V, i.e. the vector space of linear functions from V to F.
The distinction between a vector space and its dual seems artificial when the vector space is ℝⁿ. The dual space of ℝⁿ is isomorphic to ℝⁿ, and so the distinction between them can seem pedantic. It’s easier to appreciate the distinction between V and V* when the two spaces are not isomorphic.
For example, let V be L³(ℝ), the space of functions f such that |f|³ has a finite Lebesgue integral. Then the dual space is L^(3/2)(ℝ). The difference between these spaces is not simply a matter of designation. There are functions f such that the integral of |f|³ is finite but the integral of |f|^(3/2) is not, and vice versa. For instance, f(x) = x^(−1/2) on (1, ∞) has a finite integral of |f|³ but not of |f|^(3/2), while the same function on (0, 1) is just the reverse.
Adjoints
The adjoint of a linear operator T: V → W is a linear operator T*: W* → V*, where V* and W* are the dual spaces of V and W respectively. So T* takes a linear function from W to the field F and returns a linear function from V to F. How does T* do this?
Given an element w* of W*, T*w* takes a vector v in V and maps it to F by
(T*w*)(v) = w*(Tv).
In other words, T* takes a functional w* on W and turns it into a functional on V by mapping elements of V over to W and then letting w* act on them.
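Here is a minimal Python sketch of this definition. The names adjoint, T, and w_star are made up for illustration: the point is just that the adjoint precomposes a functional with T.

```python
# A sketch of the definition: T* precomposes a functional with T.
# The operator T and functional w_star below are made-up examples.

def adjoint(T):
    """Return T*: given w* in W*, return the functional v ↦ w*(Tv)."""
    def T_star(w_star):
        return lambda v: w_star(T(v))
    return T_star

# Example: T maps R^2 to R^3, and w* sums the coordinates of a vector in R^3.
T = lambda v: (v[0], v[1], v[0] + v[1])
w_star = lambda w: w[0] + w[1] + w[2]

print(adjoint(T)(w_star)((2.0, 3.0)))  # w*(Tv) = 2 + 3 + 5 = 10.0
```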
Note what this definition does not contain. There is no mention of inner products or bases or matrices.
The definition is valid over vector spaces that might not have an inner product or a basis. And this is not just a matter of perspective. It’s not as if our space has an inner product but we choose to ignore it; we might be working over spaces where it is not possible to define an inner product or a basis, such as ℓ∞, the space of bounded sequences.
Since a matrix represents a linear operator with respect to some basis, you can’t speak of a matrix representation of an operator on a space with no basis.
Bracket notation
For a vector space V over a field F, denote a function ⟨ ·, · ⟩ that takes an element from V and an element from V* and returns an element of F by applying the latter to the former. That is, ⟨ v, v* ⟩ is defined to be the action of v* on v. This is not an inner product, but the notation is meant to suggest a connection to inner products.
With this notation, we have
⟨ Tv, w* ⟩W = ⟨ v, T*w* ⟩V
for all v in V and for all w* in W* by definition. This is the definition of T* in different notation. The subscripts on the brackets are meant to remind us that the left side of the equation is an element of F obtained by applying an element of W* to an element of W, while the right side is an element of F obtained by applying an element of V* to an element of V.
Inner products
The development of the adjoint above emphasized that there is not necessarily an inner product in sight. But if there are inner products on V and W, then we can turn an element v of V into an element of V* by associating v with ⟨ ·, v ⟩, where now the brackets do denote an inner product.
Now we can write the definition of adjoint as
⟨ Tv, w ⟩W = ⟨ v, T*w ⟩V
for all v in V and for all w in W. This definition is legitimate, but it’s not natural in the technical sense: it depends on our choices of inner products and not just on the operator T and the spaces V and W. If we choose different inner products on V and W, then the definition of T* changes as well.
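To see this dependence concretely, here is a NumPy sketch; the weight matrices M and N below are arbitrary symmetric positive definite matrices, not anything canonical. With ⟨ x, y ⟩V = xᵀMy and ⟨ u, w ⟩W = uᵀNw, the adjoint of A works out to M⁻¹AᵀN, which reduces to Aᵀ only when M and N are identity matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))

def random_spd(k):
    """Random symmetric positive definite matrix to serve as an inner product."""
    B = rng.standard_normal((k, k))
    return B @ B.T + k * np.eye(k)

M, N = random_spd(n), random_spd(m)   # weights for the inner products on V and W

A_star = np.linalg.solve(M, A.T @ N)  # adjoint M⁻¹AᵀN w.r.t. these inner products

v = rng.standard_normal(n)
w = rng.standard_normal(m)

lhs = (A @ v) @ N @ w        # ⟨ Av, w ⟩W
rhs = v @ M @ (A_star @ w)   # ⟨ v, A*w ⟩V
print(np.isclose(lhs, rhs))  # True
```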
Back to matrices
We have defined the adjoint of a linear operator in a very general setting where there may be no matrices in sight. But now let’s look at the case of T: V → W where V and W are finite dimensional vector spaces, either over ℝ or ℂ. (The difference between ℝ and ℂ will matter.) And let’s define inner products on V and W. This is always possible because they are finite dimensional.
How does a matrix representation of T* correspond to a matrix representation of T?
Real vector spaces
Suppose V and W are real vector spaces and A is a matrix representation of T: V → W with respect to some choice of basis on each space. Suppose also that the bases for V* and W* are given by the duals of the bases for V and W. Then the matrix representation of T* is the transpose of A. You can verify this by showing that
⟨ Av, w ⟩W = ⟨ v, Aᵀw ⟩V
for all v in V and for all w in W.
The adjoint of A is simply the transpose of A, subject to our choice of bases and inner products.
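As a sanity check, here is a NumPy verification of this identity using the standard dot products, which corresponds to treating the chosen bases as orthonormal; the random matrix and vectors are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))  # stand-in for the matrix of T
v = rng.standard_normal(4)
w = rng.standard_normal(3)

# ⟨ Av, w ⟩W versus ⟨ v, Aᵀw ⟩V with the standard dot products.
print(np.isclose(np.dot(A @ v, w), np.dot(v, A.T @ w)))  # True
```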
Complex vector spaces
Now consider the case where V and W are vector spaces over the complex numbers. Everything works as above, with one wrinkle. If A is the representation of T: V → W with respect to a given basis, and we choose bases for V* and W* as above, then the conjugate of Aᵀ is the matrix representation of T*. The adjoint of A is A*, the conjugate transpose of A. As before, you can verify this by showing that
⟨ Av, w ⟩W = ⟨ v, A*w ⟩V
for all v in V and for all w in W.
We have to take the conjugate of Aᵀ because an inner product on a complex vector space requires taking the conjugate of one side.
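And a corresponding NumPy check for the complex case. Note that np.vdot conjugates its first argument, which makes it a complex inner product; the last line shows that the plain transpose no longer works.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

A_star = A.conj().T  # conjugate transpose

# np.vdot conjugates its first argument, so it is a complex inner product.
print(np.isclose(np.vdot(A @ v, w), np.vdot(v, A_star @ w)))  # True

# The plain transpose alone fails over ℂ.
print(np.isclose(np.vdot(A @ v, w), np.vdot(v, A.T @ w)))     # False (generically)
```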