On Making Databases Run Faster

Database technology is a mature field, and techniques for optimizing databases are well understood. However, surprises can still happen.

Certain performance optimizations you might expect to be automatic are not. I’m working with a legacy code developed some time ago, before modern notions of separation of concerns between business logic and data storage. The code runs slower than you’d expect, and some have wondered why.

Profiling revealed that the slowdown was not in the core computation but in reading and writing the backend database, which happens frequently as the code executes.

My first thought was to run with the database on a RAM disk, which gives higher bandwidth and lower latency than a spinning disk or SSD. This helped a little, but not much.

As a short-term fix I ended up writing an in-memory hash table as an interposer between the code and the database. It caches commonly accessed values and thus reduces database accesses.
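Here is a minimal sketch of the idea, assuming string keys and values. The names db_read_value and cached_read are hypothetical stand-ins, not the actual functions in the legacy code, and writes would also need to go through the interposer so that cached entries get updated or invalidated.

#include <string>
#include <unordered_map>

// Hypothetical stand-in for the expensive backend database read
std::string db_read_value(const std::string& key) {
    // ... real code would query the backend database here
    return "value-for-" + key;
}

// Read-through cache interposed between the application and the database
std::string cached_read(const std::string& key) {
    static std::unordered_map<std::string, std::string> cache;
    auto it = cache.find(key);
    if (it != cache.end())
        return it->second;                   // cache hit: no database access
    std::string value = db_read_value(key);  // cache miss: go to the database
    cache[key] = value;
    return value;
}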

I would’ve thought high-speed RAM caching of values would be default behavior for a database manager. A principle of interface design is to make the defaults as useful as possible. But in this case apparently not.

Thankfully my fix gave more than a 10X speedup in application launch time and a 6X speedup in the core operation of the code.

The project team is moving toward SQLite for database management in the near future. SQLite has perhaps a dozen or so settings available for optimizations like this. However, early experiments with SQLite on this code show that more fundamental structural modifications will also be needed to improve database access patterns.
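For example, SQLite exposes PRAGMA settings that control its page cache and journaling. The snippet below is only illustrative of the kind of tuning available; the particular values are guesses, not recommendations for this code.

#include <sqlite3.h>

// Illustrative SQLite tuning via PRAGMA statements
void tune_connection(sqlite3* db) {
    sqlite3_exec(db, "PRAGMA cache_size = -65536;",  nullptr, nullptr, nullptr); // ~64 MB page cache
    sqlite3_exec(db, "PRAGMA journal_mode = WAL;",   nullptr, nullptr, nullptr); // write-ahead logging
    sqlite3_exec(db, "PRAGMA synchronous = NORMAL;", nullptr, nullptr, nullptr); // fewer fsync calls
}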

As with code optimization generally, you might be inclined to think the system (compiler, database manager, etc.) will “do it all for you.” But often it doesn’t.

Duplicating a hand-drawn contour plot

Like the previous post, this post also seeks to approximately reproduce a hand-drawn plot. This time the goal is to reproduce Figure 7.3 from A&S, page 298.

This plot is a visualization of the function of a complex variable

w(z) = exp(−z²) erfc(−iz)

where erfc is the complementary error function.

A&S calls the graph above an “altitude chart.” This could be a little misleading since it’s the overlay of two plots. One plot is the absolute value of w(z), which could well be called altitude. But it also contains a plot of the phase. To put it another way, if we denote the values of the function in polar form r exp(iθ) the altitude chart is an overlay of a plot of r and a plot of θ.

We begin by defining

f[z_] := Exp[-z^2] Erfc[-I z]

The following code reproduces the lines of constant phase fairly well.

ContourPlot[Arg[f[x + I y]], {x, 0.1, 3.1}, {y, -2.5, 3}, 
    Contours -> 20, ContourShading -> None, AspectRatio -> 1.6]

The lines of constant absolute value take a little more effort to reproduce. If we let Mathematica pick where to put the contour lines, they will not be distributed the same way they were in A&S.

ContourPlot[Abs[f[x + I y]], {x, 0, 3.1}, {y, -2.6, 3}, 
    Contours -> 20, ContourShading -> None, AspectRatio -> 1.6]

We can duplicate the spacing in the original plot by giving Mathematica a list of contour values rather than a number of contours.

ContourPlot[Abs[f[x + I y]], {x, 0, 3.1}, {y, -2.6, 3}, 
    Contours -> {0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 10, 100}, 
    ContourShading -> None, AspectRatio -> 1.6]

(For reasons I don’t understand, Mathematica does not draw the contours corresponding to |w| = 10 and |w| = 100.)

When I overlay the phase and absolute value plots with the Show command I get a plot approximately reproducing the original.

Reproducing a hand-drawn plot

The plots in old (i.e. pre-computer) math books are impressive. These plots took a lot of effort to produce, and so they weren’t created lightly. Consequently they tend to be aesthetically and mathematically interesting. A few weeks ago I recreated a plot from A&S using Mathematica, and today I’d like to do the same for the plot below from a different source [1].

Here is my approximate reproduction.

I’ll give the mathematical details below for those who are interested.

The plot shows “normalized associated Legendre functions of the first kind.” There are two families of Legendre functions, denoted P and Q; we’re interested in the former. “Associated” means functions derived from the Legendre polynomials by taking derivatives. Normalized means the functions are multiplied by constants so that their squares integrate to 1 over [−1, 1].

Mathematica has a function LegendreP[n, m, x] that implements the associated Legendre functions Pnm(x). I didn’t see that Mathematica has a function for the normalized version of these functions, so I rolled my own.

f[n_, m_, x_] := (-1)^n LegendreP[n, m, x] 
        Sqrt[(2 n + 1) Factorial[n - m]/(2 Factorial[n + m])]

I added the alternating sign term up front after discovering that apparently the original plot used a different convention for defining Pnm than Mathematica uses.

I make my plot by stacking the plots created by

Plot[Table[f[n, n, x], {n, 1, 8}],  {x, 0, 1}]

and

Plot[Table[f[n + 4, n, x], {n, 0, 4}],  {x, 0, 1}]

The original plot shows P4(x). I used the fact that this equals P40(x) to simplify the code. I also left out the flat line plotting P0 because I thought that looked better.

Related post: Duplicating a Hankel function plot from A&S.

[1] Tables Of Functions With Formulae And Curves by Fritz Emde, published in 1945. Available on Archive.org.

What is partial pivoting?

Gaussian elimination is pretty simple if all goes well, and in introductory courses all does go well.

You have an n × n matrix A, an n × 1 column vector b, and you want to find an n × 1 column vector x such that Ax = b.

You subtract multiples of the first row of A and of b to zero out all the elements below a11. Then you subtract multiples of the second row of A and of b to zero out all the elements below a22. You keep doing this until all the elements below the diagonal are zero. Then you can solve for x using back substitution.
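In code, the whole process looks something like the following sketch (no error checking, and no pivoting, about which more below).

#include <vector>

// Naive Gaussian elimination: reduce A to upper triangular form, apply the
// same row operations to b, then back substitute. Assumes every diagonal
// element encountered along the way is nonzero.
std::vector<double> solve_naive(std::vector<std::vector<double>> A,
                                std::vector<double> b) {
    const int n = static_cast<int>(b.size());
    for (int k = 0; k < n - 1; ++k) {           // zero out column k below a_kk
        for (int i = k + 1; i < n; ++i) {
            const double m = A[i][k] / A[k][k]; // multiplier
            for (int j = k; j < n; ++j)
                A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    std::vector<double> x(n);
    for (int i = n - 1; i >= 0; --i) {          // back substitution
        double s = b[i];
        for (int j = i + 1; j < n; ++j)
            s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
    return x;
}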

The process above is called naive Gaussian elimination. This post is for people who are familiar with naive Gaussian elimination but who do not realize it is naive.

Creating zeros

Suppose the first two rows of our matrix A are as follows:

2 * * * …
3 * * * …

Then you subtract 3/2 times the first row from the second:

2 * * * …
0 * * * …

You would continue to subtract multiples of the first row from the rest of the rows.

What could go wrong?

What if the first two rows looked like this?

0 * * * …
3 * * * …

In this case naive Gaussian elimination fails on the first step. As far as I recall, I never saw an example like that when I took linear algebra: naive Gaussian elimination always worked. I don’t think I ever ran into this possibility until I took a course in numerical analysis.

The simplest way around the problem would be to swap the first two rows:

3 * * * …
0 * * * …

I imagine this strategy was commonly done in hand calculations long before anyone gave the process a name. I said that I don’t remember encountering this problem in a linear algebra class. Maybe it came up in a homework problem and I swapped rows without thinking about it. But if you’re going to write software to implement Gaussian elimination, you have to think about these things systematically.

You might have a program literally swap rows, or if you’re more performance-conscious you might conceptually swap the rows, keeping track of the swaps with an index but without moving data in memory.

Pivots and pivoting

The first element in the first row is called a pivot. More generally, the akk element at the kth step of Gaussian elimination is a pivot. When you run into a zero pivot, you have a logical problem: naive Gaussian elimination cannot proceed without some help, i.e. swapping rows. Less obvious but still true is that if you run into a small pivot, you have a numerical problem. When carrying out calculations on a computer with floating point arithmetic, small but non-zero pivots can lead to loss of accuracy, possibly catastrophic loss.

Swapping rows to avoid small pivots is called partial pivoting. Partial pivoting is usually [1] adequate in practice. Complete pivoting allows the possibility of swapping columns as well. You can go down a deep rabbit hole exploring the nuances of various pivoting strategies.
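Here is a sketch of how partial pivoting changes the naive code above. The rows are swapped only conceptually, through an index array, as described earlier; the data never moves in memory. This is a sketch, not production code.

#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

// Gaussian elimination with partial pivoting. The permutation p tracks row
// swaps; the rows of A are never moved.
std::vector<double> solve_partial_pivoting(std::vector<std::vector<double>> A,
                                           std::vector<double> b) {
    const int n = static_cast<int>(b.size());
    std::vector<int> p(n);
    std::iota(p.begin(), p.end(), 0);           // p[i] = i initially

    for (int k = 0; k < n - 1; ++k) {
        int piv = k;                            // row with largest |entry| in column k
        for (int i = k + 1; i < n; ++i)
            if (std::abs(A[p[i]][k]) > std::abs(A[p[piv]][k]))
                piv = i;
        std::swap(p[k], p[piv]);                // "swap rows" by swapping indices

        for (int i = k + 1; i < n; ++i) {
            const double m = A[p[i]][k] / A[p[k]][k];
            for (int j = k; j < n; ++j)
                A[p[i]][j] -= m * A[p[k]][j];
            b[p[i]] -= m * b[p[k]];
        }
    }
    std::vector<double> x(n);
    for (int i = n - 1; i >= 0; --i) {          // back substitution
        double s = b[p[i]];
        for (int j = i + 1; j < n; ++j)
            s -= A[p[i]][j] * x[j];
        x[i] = s / A[p[i]][i];
    }
    return x;
}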

Gaussian elimination is part of the high school curriculum, but there are a lot of practical issues to work out when applying this apparently simple algorithm. As numerical analysis expert Nick Trefethen put it, “The closer one looks, the more subtle and remarkable Gaussian elimination appears.”

[1] “Despite [artificially constructed examples] Gaussian elimination with partial pivoting is utterly stable in practice. … In fifty years of computing, no matrix problems that create explosive instability are known to have arisen under natural circumstances.” — Nick Trefethen and David Bau. Numerical Linear Algebra. 2022.

Max and min orbital speed

An earlier post needed to calculate how much the speed of a planet varies in orbit. The planet moves fastest at perihelion, the point in its orbit closest to the sun, and it moves slowest at aphelion, when it is furthest from the sun.

The ratio of the maximum to minimum speed turns out to be a simple expression in terms of the eccentricity e of the orbit:

(1 + e)/(1 − e).

You can derive this fairly quickly from the vis-viva equation, which in turn is derived from conservation of energy.
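Here’s a sketch of the derivation. The vis-viva equation says that a body in an orbit with semi-major axis a, at distance r from the sun, has speed v satisfying

v² = GM(2/r − 1/a).

Perihelion and aphelion are at distances a(1 − e) and a(1 + e) respectively, so

vmax² = (GM/a)(1 + e)/(1 − e) and vmin² = (GM/a)(1 − e)/(1 + e).

Dividing one by the other and taking square roots gives the ratio above.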

There are several things I find interesting about this. First, that the expression is so simple. Second, it can be simplified even more for small e:

(1 + e)/(1 − e) ≈ 1 + 2e.

This comes from expanding the ratio as a series:

(1 + e)/(1 − e) = 1 + 2e + 2e² + 2e³ …

This explains two things from the previous post. First, the variation in orbital speed, for both Earth and Mars, worked out to be roughly 2e. The eccentricity of Earth’s orbit is 0.0167 and its orbital speed varies by about 3%. Mars’ orbit has eccentricity 0.0934 and its orbital speed varies by about 21%, a little more than 2e ≈ 19%. Second, since the eccentricity of Mars’ orbit, while fairly small, is larger than that of Earth, the quadratic term matters more for Mars.

Finally, “I keep running into the function f(z) = (1 − z)/(1 + z),” as I first wrote four years ago, and wrote about again a few months ago. It comes up, for example, in computing impedance, in mental calculation tricks, and in efficient calculation of the perimeter of an ellipse. Now you can add to that list calculating the variation in orbital speed in a two-body problem.

Martian Leap Years

The previous post looked at one challenge with designing a calendar for Mars, namely how to vary the number of days per month so that each month corresponds to a change of 30 degrees with respect to the sun. This is a bigger problem on Mars than here on Earth.

That post assumed that a Martian year consisted of 669 sols (Martian days). But a Martian year is not an integer number of sols, just as an Earth year is not an integer number of days. In fact, a Martian year is 668.5991 sols.

One way to handle this would be for 3 out of every 5 years to have 669 sols, with the remaining years having 668 sols. So maybe if the year mod 5 equals 0, 1 or 2 the year would be long, 669 sols, and otherwise it would be short, 668 sols. (This alternation of 3 and 2 reminds me of Dave Brubeck’s song Take Five.)
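Averaged over a five-year cycle this gives

(3 × 669 + 2 × 668)/5 = 3343/5 = 668.6

sols per year.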

This scheme, which approximates 668.5991 by 668.6, is about as accurate as our Gregorian calendar, which approximates 365.2422 by 365.2425. For more accuracy, Martians would need something like our leap seconds.

A calendar for Mars

I recently started reading The Case for Mars by Robert Zubrin. This post will unpack one line from that book regarding creating a calendar for Mars:

Equipartitioned months don’t work for Mars, because the planet’s orbit is elliptical, which causes the seasons to be of unequal length.

This sentence doesn’t sit well at first for a couple reasons. First, Earth’s orbit is elliptical too, and the seasons here are of roughly equal length. Second, the orbit of Mars, like the orbit of Earth, is nearly circular.

There are three reasons why Zubrin’s statement is correct, despite the objections above. The first has to do with the nature of eccentricity, the second with the reference from which angles are measured, and the third with variable speed.

Eccentricity

The orbit of Mars is about five and a half times as eccentric as that of Earth. That does not mean that the orbit of Mars is noticeably different from a circle, but it does mean the sun is noticeably not at the center of that (almost) circle.

There’s a kind of paradox interpreting eccentricity e. An ellipse with e = 0 is a circle, and the two foci of the ellipse coincide with the center of the circle. As e increases, the ellipse aspect ratio increases and the foci move apart. But here’s the key: the aspect ratio doesn’t change nearly as fast as the distance between the two foci changes. I’ve written more about this here and here.
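To put numbers on this: an ellipse with semi-major axis a and eccentricity e has semi-minor axis b = a√(1 − e²), so the aspect ratio differs from 1 by roughly e²/2, a second-order effect, while each focus lies a distance ae from the center, a first-order effect. For Mars, with e = 0.0934, the orbit is only about 0.4% flatter than a circle, but the sun is offset from the center of the orbit by about 9% of the semi-major axis.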

So while the orbit of Mars is nearly a circle, the sun is substantially far from the center of the orbit. We can visualize this with a couple plots. First, here are the orbits of Earth and Mars, shifted so that both have their center at the origin.

Both are visually indistinguishable from circles.

Now here are the two orbits with their correct placement relative to the sun at the center.

Angle reference

Zubrin writes

In order to predict the seasons, a calendar must divide the planet’s orbit not into equal division of days, but into equal angles of travel around the sun. … a month is really 30 degrees of travel around the Sun.

If we were to divide the orbit of Mars into partitions of 30 degrees relative to the center of the orbit then each month would be about the same length. But Zubrin is dividing the orbit into partitions of 30 degrees relative to the sun.

In the language of orbital mechanics, Zubrin’s months correspond to 30 degrees of true anomaly, not 30 degrees of mean anomaly. I explain the difference between true anomaly and mean anomaly here. That post shows that for Earth, true anomaly and mean anomaly never differ by more than 2 degrees. But for Mars, the two anomalies can differ by up to almost 19 degrees.

Variable speed

A planet in an elliptical orbit around the sun travels fastest when it is nearest the sun and slowest when it is furthest from the sun. Because Mars’s orbit is more eccentric than Earth’s, the variation in orbital speed is greater. We can calculate the ratio of the fastest speed to the slowest speed using the vis-viva equation. It works out to be

(1 + e)/(1 − e).

For Earth, with eccentricity 0.0167 this ratio is about 1.03, i.e. orbital speed varies by about 3%.

For Mars, with eccentricity 0.0934 this ratio is about 1.21, i.e. orbital speed varies by about 21%.

Zubrin’s months

The time it takes for Mars to rotate on its axis is commonly called a sol, a Martian day. The Martian year is 669 sols. Zubrin’s proposed months range from 46 to 66 sols, each corresponding to 30 degrees difference in true anomaly.

Code Profiling Without a Profiler

Making your code run faster starts with understanding where in the code the runtime is actually spent. But suppose, for whatever reason, your profiling tools won’t work?

I recently used MS Visual Studio on a legacy C++ code. The code crashed shortly after startup when I attempted to profile it, though otherwise it ran fine for both release and debug build targets. The cause of the problem was not immediately apparent.

If all else fails, manual timers can help. The idea is to find a high-resolution system wall-clock timer function and use it to read the time before and after the parts of the code you want to time. You can then essentially apply “bisection search” to the code base to find the hot spots. See below for an example.

This can be useful in various situations. Codes in complex languages (or even mixed languages in the code base) can have unusual constructs that break debuggers or profilers. Also, exotic hardware like embedded systems, GPUs or FPGAs may lack full profiler support. Additionally, brand new hardware releases often lack mature tool support, at least initially.

Furthermore, profiling tools themselves, though helpful for getting a quick snapshot of the performance breakdown of each function in the code, have their own limitations. Profilers work either by instrumenting the executable code or by sampling. Instrumenting can cause timing inaccuracies by adding overhead, since the system timer is called on entry to and exit from every function. It also breaks function inlining, often reducing performance.

Sampling, on the other hand, can be inaccurate if the sample rate is too low, or can distort the runtime if the sample rate is too high. Manual timers circumvent these problems because they can be applied surgically to specific parts of the code (though some profilers also let you turn profiling on and off at different points in the code).

Resorting to manual timing of code sections is a messy business. But sometimes it’s the only thing that will work.

Visual Studio C++ Code Example

// mycode.h

#include "cstdio"
#include "cstdarg"

// Get time of day - elapsed seconds
static double gtod() {   
    LARGE_INTEGER ctr, freq;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&ctr);
    return static_cast<double>(ctr.QuadPart) / static_cast<double>(freq.QuadPart);
}   
    
// Convenience function for printing to file
static void FilePrintf(const char* format, ...) {   
    char buffer[1024];
    va_list args;
    va_start(args, format);
    vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    FILE* myoutfile = fopen("mytimingsfile.txt", "a");
    fprintf(myoutfile, "%s", buffer);
    fclose(myoutfile);
}   
    
// Storage for timer
extern double g_timer;

// mycode.cpp

#include "mycode.h"

// Initialization for timer
double g_timer = 0;

int main() {

    // ...
    g_timer = 0;

    for (int i=0; i<n; ++i) {
        // ...
        const double t1 = gtod();
        my_expensive_function();
        g_timer += gtod() - t1;
        // ...
    }

    FilePrintf("my_expensive_function runtime: %.6f seconds.\n", g_timer);
    g_timer = 0;

    // ...

    return 0;
}

Do perimeter and area determine a triangle?

Is the shape of a triangle determined by its perimeter and area? In other words, if two triangles have the same area and the same perimeter, are the triangles similar? [1]

It’s plausible. A triangle has three degrees of freedom: the lengths of the three sides. Specifying the area and perimeter removes two degrees of freedom. Allowing the triangles to be similar rather than congruent accounts for a third degree of freedom.

Here’s another plausibility argument. Heron’s formula computes the area of a triangle from the lengths of the sides.

A = \sqrt{s(s-a)(s-b)(s-c)}

Here s is the semi-perimeter, half of the sum of the lengths of the sides. So if the perimeter and area are known, we have a third order equation for the sides:

(a - s)(b - s)(c - s) = -\frac{A^2}{s}

If the right-hand side were 0, then we could solve for the lengths of the sides. But the right-hand side is not zero. Is it still possible that the sides are uniquely determined, up to rearranging how we label the sides?

It turns out the answer is no [2], and yet it is not simple to construct counterexamples. If all the sides of a triangle are rational numbers, it is possible to find a non-congruent triangle with the same perimeter and area, but the process of finding this triangle is a bit complicated.

One example is the triangles with sides (20, 21, 29) and (17, 25, 28). Both have perimeter 70 and area 210. But the former is a right triangle and the latter is not.
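Here is a quick numerical check of that example using Heron’s formula, as a standalone sketch.

#include <cmath>
#include <cstdio>

// Area of a triangle with sides a, b, c by Heron's formula
static double heron_area(double a, double b, double c) {
    const double s = (a + b + c) / 2;   // semi-perimeter
    return std::sqrt(s * (s - a) * (s - b) * (s - c));
}

int main() {
    std::printf("perimeter %g, area %g\n", 20.0 + 21 + 29, heron_area(20, 21, 29));
    std::printf("perimeter %g, area %g\n", 17.0 + 25 + 28, heron_area(17, 25, 28));
    // Both lines print: perimeter 70, area 210
    return 0;
}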

Where did our algebraic argument go wrong? How can a cubic equation have two sets of solutions? But we don’t have a cubic equation in one variable; we have an equation in three variables that is the product of three linear terms.

What third piece of information would specify a triangle uniquely? If you knew the perimeter, area, and the length of one side, then the triangle is determined. What if you specified the center of the triangle? There are many ways to define a center of a triangle; would some, along with perimeter and area, uniquely determine a triangle while others would not?

[1] Two triangles are similar if you can transform one into the other by some combination of scaling, rotation, reflection, and translation.

[2] Mordechai Ben-Ari. Mathematical Surprises. Springer, 2022. The author cites this blog post as his source.

Settlers versus Hipsters

When my children were little, I read the Little House on the Prairie books aloud to them and I naturally saw the books through the eyes of a child. Last night I started reading the books by myself for the first time and saw them very differently.

Laura Ingalls Wilder wrote the Little House books later in life, looking back at her childhood and early adult years. The events in the books took place in the last quarter of the nineteenth century.

At first glance the books tell the story of the Ingalls family living off the land, which to a large extent they did. The first chapter of the first book describes the family salting and smoking meat in order to have food for the winter when it would be hard to hunt game. Without adequate preparation they would starve, which they nearly do in one of the later books.

The initial chapter also describes the father greasing his traps. He didn’t smelt iron to make his traps; he had bought the traps somewhere and brought them with him. There is no sense in the books that the family was trying to avoid contemporary technology. They gladly used the technology available to them, such as it was.

In addition to hardware such as bear traps, the family also had consumables they could not produce themselves. Where did they get the salt to preserve their meat? They didn’t drive their SUV down to Costco, but neither did they mine salt. And as I wrote about years ago, the books mention coffee, something that doesn’t grow in the continental United States.

Obtaining supplies was difficult, not something they would do lightly or frequently, but there’s no sense that they saw buying supplies as a failing. They were trying to settle new land, but they weren’t trying to get away from contemporary amenities. They did without amenities out of necessity, not out of conviction.