This post brings together several themes I’ve been writing about lately: caching function evaluations, error estimation, and Runge-Kutta methods.
A few days ago I wrote about how Runge-Kutta methods can all be summarized by a set of numbers called the Butcher tableau. These methods solve the initial value problem

y′ = f(t, y), y(t₀) = y₀

by evaluating f at some partial step, then evaluating f again at some new partial step and at a linear combination of values from previous steps, etc. In the preface to J. C. Butcher’s book on differential equations, J. M. Sanz-Serna describes this structure as follows.
Runge-Kutta schemes … are highly nonlinear with a remarkable Matrioshka doll structure, where the vector field has to be evaluated at an expression that involves the vector field evaluated at an expression that involves the vector field …
Once all the “Matrioshka dolls” are lined up, i.e. all the intermediate results have been calculated, the final estimate is a linear combination of these values.
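To make the nesting concrete, here is a minimal Python sketch of how the stages of a generic explicit Runge-Kutta method are built from the part of the Butcher tableau above the horizontal line. The function name and argument layout are illustrative, not taken from any particular library.

```python
def rk_stages(f, t, y, h, A, c):
    """Compute the stage values k_1, ..., k_s of an explicit Runge-Kutta
    method for y' = f(t, y), where A (strictly lower triangular) and c
    hold the coefficients above the horizontal line of the Butcher tableau.

    Each stage evaluates f at a point built from the previous stages,
    the "Matrioshka doll" structure described above.
    """
    k = []
    for i in range(len(c)):
        yi = y + h * sum(A[i][j] * k[j] for j in range(i))
        k.append(f(t + c[i] * h, yi))
    return k
```

The final estimate is then y plus h times a linear combination of these k’s, with weights taken from the row below the line.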
Here’s the clever idea behind adaptive solvers: Create two Runge-Kutta methods of different orders that depend on the same intermediate results. Then both can be computed without new function evaluations, and the results compared. The difference between the results can be used as an estimate of the local error. Then you can adjust your step size accordingly and try again.
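Here is a sketch of the step size adjustment, assuming a local error estimate err and a tolerance tol are in hand. The exponent 1/(p+1) for a pth order method and the safety factor are conventional choices; the function name and defaults are just for illustration, and real implementations add further safeguards.

```python
def propose_step(h, err, tol, p=4, safety=0.9):
    """Propose a new step size from a local error estimate.

    If err exceeds tol, the step is rejected and retried with the smaller
    h returned here; otherwise the step is accepted and the (possibly
    larger) h is used for the next step.
    """
    if err == 0:
        return 2 * h          # error negligible; grow the step
    return safety * h * (tol / err) ** (1 / (p + 1))
```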
I’m going to present two adaptive Runge-Kutta schemes. I’ll go over Fehlberg’s method in this post and a variation in the next post that has some advantages over Fehlberg’s method.
Runge-Kutta-Fehlberg (RKF45)
Fehlberg’s method, commonly known as RKF45, starts with a six-stage Runge-Kutta method whose coefficients are given by the following tableau.
The meaning of the tableau is described here. (Imagine the effort it took Erwin Fehlberg to derive this in 1969. Presumably he had little more than a desktop calculator to help. Maybe not even that.)
What’s important about the tableau for this post is that the coefficients above the horizontal line are used to create six numbers, the k’s in the notation of the post referenced above. The k’s are multiplied by the coefficients below the horizontal line to produce the solution of the differential equation at the next step. This is the “4” of RKF45.
RKF45 then applies another method which reuses all the k’s in a different linear combination. This is summarized in the following variation of the Butcher tableau.
This is essentially two tableaux combined into one. The first is the same as above. The second is like the one above but with a different bottom row, which gives the coefficients for a Runge-Kutta method corresponding to the “5” part of RKF45.
The 4 stands for 4th order, i.e. the local error for a step of size h is O(h⁵) and the global error is O(h⁴). The 5 stands for 5th order, i.e. local error O(h⁶) and global error O(h⁵). RKF45 is two different methods, but they share so much computation that the second one comes almost for free; it does not require any new function evaluations, only a different linear combination of six numbers.
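Here is a minimal sketch in Python of one RKF45 step for a scalar equation, using the standard Fehlberg coefficients as they are usually tabulated (worth checking against a reference such as Hairer et al. before relying on them). It returns both solutions and their difference, the local error estimate; the function name is my own.

```python
def rkf45_step(f, t, y, h):
    """One step of the Fehlberg (RKF45) pair for y' = f(t, y), scalar y."""
    k1 = f(t, y)
    k2 = f(t + h/4,     y + h*k1/4)
    k3 = f(t + 3*h/8,   y + h*(3*k1 + 9*k2)/32)
    k4 = f(t + 12*h/13, y + h*(1932*k1 - 7200*k2 + 7296*k3)/2197)
    k5 = f(t + h,       y + h*(439*k1/216 - 8*k2 + 3680*k3/513 - 845*k4/4104))
    k6 = f(t + h/2,     y + h*(-8*k1/27 + 2*k2 - 3544*k3/2565
                               + 1859*k4/4104 - 11*k5/40))

    # 4th order solution (the "4" of RKF45)
    y4 = y + h*(25*k1/216 + 1408*k3/2565 + 2197*k4/4104 - k5/5)
    # 5th order solution (the "5"), reusing the same six stages
    y5 = y + h*(16*k1/135 + 6656*k3/12825 + 28561*k4/56430
                - 9*k5/50 + 2*k6/55)

    return y4, y5, abs(y5 - y4)
```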
However, RKF45 is only “free” if you’ve gone to the effort of using a six-stage method. The amount of computation is roughly proportional to the number of stages, so RKF45 with its error estimate costs about 50% more work than the most common 4th order RK method, which has four stages. So if you knew exactly what step size to use, basic RK would be more efficient. But how could you know the optimal step size a priori?
By guiding us to choose the right step size, the extra work in RKF45 more than pays for itself. It can save the computation wasted by taking steps that are too small, or prevent the inaccurate results that come from taking steps that are too large. Or maybe both: a differential equation may need small steps over one stretch of time and tolerate larger steps over another.
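Putting the pieces together, a bare-bones adaptive loop might look like the following. It uses the rkf45_step and propose_step sketches above, so it is illustrative rather than production code; real solvers add limits on how fast h can change, a minimum step size, and similar safeguards.

```python
def solve_rkf45(f, t0, y0, t_end, h=0.1, tol=1e-6):
    """Integrate y' = f(t, y) from t0 to t_end with adaptive step sizes."""
    t, y = t0, y0
    while t < t_end:
        h = min(h, t_end - t)               # don't step past the end
        y4, y5, err = rkf45_step(f, t, y, h)
        if err <= tol:                      # accept the step
            t, y = t + h, y4
        h = propose_step(h, err, tol)       # adjust step size either way
    return t, y
```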
For more information on Fehlberg’s method, see Solving Ordinary Differential Equations I by Hairer et al.