Test functions are how you can make sense of functions that aren’t really functions.
The canonical example is the Dirac delta “function” that is infinite at the origin, zero everywhere else, and integrates to 1. That description is contradictory: a function that is 0 almost everywhere integrates to 0, even if you work in extended real numbers where a function can take on the value ∞.
You can make things like the delta function rigorous by saying they’re not functions of real numbers, but functions that operate on other functions, i.e. test functions. More on that here. These functions acting on test functions are called generalized functions or distributions. (This this post for how this kind of distribution differs from a probability distribution, but is analogous.)
Analogy with test charges
To say rigorously how these generalized functions behave you show how they act on test functions φ. Test functions are analogous to test charges: you can describe an electric field by saying what force it would exert on a test charge.
A test charge is somewhat idealized. It has to be so small that it tests a field without effecting it. This isn’t really possible, but you can think of it as a limit. You’re looking at the limit of the force per unit charge as the charge goes to zero.
Similarly, a test function is ideal in that it very well behaved so the generalized functions that act on it can be badly behaved. Test functions are infinitely differentiable, and they either have compact support or have extremely thin tails, depending on context.
Analogy with category theory
While writing the previous post I thought about an analogy between distribution theory and category theory. I worked with distribution theory a lot in grad school, and I found it natural that the definition of a distribution depended on how it acted in relation to something else, i.e. how it acted on all test functions.
But I found category definitions that involved extraneous objects puzzling. For example, the product of two objects is a third object such that for any fourth object (!) a certain diagram commutes. Why is this superfluous object doing injecting itself into the definition? If I’d thought of it as a test object then I would have found the definition more palatable.
As with distribution theory, you’re defining something by how it relates to all elements of some collection. But in distribution theory, your distributions and your test functions are very distinct things. In category theory, your test objects are peers of the thing you’re testing.