Compressing data w/ sample points & polynomial interpolation

Data compression is everywhere. We’re unaware of it when it is done well. We only become aware of it when it is pushed too far, such as when a photo looks grainy or fuzzy because it was compressed too much.

The basic idea of data compression is to not transmit the raw data but to transmit some of the data along with instructions for how to approximately reconstruct the rest [1].

Fifty years ago scientists were concerned with a different application of compression: reducing the size of mathematical tables. Books of tabulated functions are obsolete now, but the principles used in producing these tables are still very much relevant. We use compression and interpolation far more often now, though it’s almost always invisibly executed by software.

Compressing tables

In this post I want to expand on comments by Forman Acton from his book Numerical Methods That Work on compression.

Many persons are unaware of the considerable compression in a table that even the use of quadratic interpolation permits. A table of sin x covering the first quadrant, for example, requires 541 pages if it is to be linearly interpolable to eight decimal places. If quadratic interpolation is used, the same table takes only one page having entries at one-degree intervals with functions of the first and second differences being recorded together with the sine itself.

Acton goes on to mention the advantage of condensing shelf space by a factor of 500. We no longer care about saving shelf space, but we may care very much about saving memory in an embedded device.

Quadratic interpolation does allow more compression than linear interpolation, but not by a factor of 500. I admire Acton’s numerical methods book, but I’m afraid he got this one wrong.

Interpolation error bound

In order to test Acton’s claim we will need the following theorem on interpolation error [2].

Let f be a function so that f⁽ⁿ⁺¹⁾ is continuous on [a, b] and satisfies |f⁽ⁿ⁺¹⁾ (x)| ≤ M. Let p be the polynomial of degree ≤ n that interpolates f at n + 1 equally spaced nodes in [a, b], including the end points. Then on [a, b],

$|f(x) - p(x)| \leq \frac{1}{4(n+1)} M \left(\frac{b-a}{n}\right)^{n+1}$

Quadratic interpolation error

Acton claims that quadratic interpolation at intervals of one degree is adequate to produce eight decimal places of accuracy. Quadratic interpolation means n = 2.

We have our function tabulated at evenly spaced points a distance h = π/180 radians apart. Quadratic interpolation requires function values at three points, so b − a = 2h = π/90. The third derivative of sine is negative cosine, so M = 1.

This gives an error bound of 4.43 × 10⁻⁷, so this would give slightly better than six decimal place accuracy, not eight.

Linear interpolation error

Suppose we wanted to create a table of sine values so that linear interpolation would give results accurate to eight decimal places.
In the interpolation error formula we have M = 1 as before, and now n = 1. We would need to tabulate sine at enough points that h = b − a is small enough that the error is less than 5 × 10⁻⁹. It follows that h = 0.0002 radians. Covering a range of π/2 radians in increments of 0.0002 radians would require 7854 function values. Acton implicitly assumes 90 values to a page, so this would take about 87 pages.

Abramowitz and Stegun devotes 32 pages to tabulating sine and cosine at increments of 0.001 radian. This does not always guarantee eight decimal place accuracy using linear interpolation, but it does guarantee at least seven places (more on that here), which is better than a table at one degree increments would deliver using quadratic interpolation. So it would have been more accurate for Acton to say quadratic interpolation reduces the number of pages by a factor of 30 rather than 500.

Cubic interpolation error

If we have a table of sine values at one degree increments, how much accuracy could we get using cubic interpolation? In that case we’d apply the interpolation error theorem with n = 3 and b − a = 3(π/180) = π/60. Then the error bound is 5.8 × 10⁻⁹. This would usually give you eight decimal place accuracy, so perhaps Acton carried out the calculation for cubic interpolation rather than quadratic interpolation.

[1] This is what’s known as lossy compression; some information is lost in the compression process. Lossless compression also replaces the original data with a description that can be used to reproduce the data, but in this case the reconstruction process is perfect.

[2] Ward Cheney and David Kincaid. Numerical Methods and Computation. Third edition.

Compression and interpolation

Compressing tables

Interpolation error bound

Quadratic interpolation error

Linear interpolation error

Cubic interpolation error

Related posts

One thought on “Compression and interpolation”