a73x

high effort, low reward


Levels of Optimisation


This probably isn't strictly true, but it makes sense to me. We've got three levels of "optimisation" (assuming your actual design doesn't suck and the code genuinely needs optimising).

Benchmark Optimisation

To begin with, we have benchmark optimisation: you create a benchmark locally, dump a profile of it, and optimise whatever the profile points at. Then you run your tests to make sure you didn't break anything, because the most optimal solution is always "return nil". This is the first and easiest optimisation because it only requires a function, nothing else, and can be done in isolation. You don't need a working "application" here, just the function you're trying to benchmark. There are different types of benchmarks (micro, macro, etc.), but I'm leaving them out of scope for this conversation. Go read Efficient Go.
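Here's a rough sketch of what that loop looks like. joinWords is a made-up function standing in for whatever you're optimising, and the benchmark is driven with testing.Benchmark so the whole thing runs as a plain program; normally the benchmark would live in a _test.go file.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the result alive so the call isn't trivially discarded.
var sink string

// joinWords is a hypothetical function under test, not from any real codebase.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

// BenchmarkJoinWords would normally sit in a _test.go file and be run by go test.
func BenchmarkJoinWords(b *testing.B) {
	words := []string{"high", "effort", "low", "reward"}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = joinWords(words)
	}
}

func main() {
	// Correctness first: "return nil" would be fast, but wrong.
	if got := joinWords([]string{"a", "b"}); got != "a b" {
		panic("joinWords is broken: " + got)
	}

	// Run the benchmark without go test and print timing and allocation stats.
	res := testing.Benchmark(BenchmarkJoinWords)
	fmt.Println(res.String(), res.MemString())
}
```

With the benchmark in a _test.go file, go test -bench=. -cpuprofile=cpu.out dumps the profile, and go tool pprof cpu.out lets you poke at it.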

Profile-guided optimisation

This is a mild step up from benchmark optimisation only because you need live server load to pull a profile from, but it is probably the most hands-off step. You import the net/http/pprof package into your service, request /debug/pprof/profile?seconds=30 to get a CPU profile, save it as profile.pgo, and compile your binary with go build -pgo=profile.pgo. The compiler will make optimisations for you, and even if your profile is garbage, it shouldn't cause any regressions.
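A minimal sketch of the round trip, assuming nothing about your actual service: it registers the pprof handlers with a blank import, starts a listener on a random local port, and pulls a one-second CPU profile from itself. startDebugServer and fetchCPUProfile are names I made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
)

// startDebugServer (hypothetical helper) serves http.DefaultServeMux on a
// random local port and returns the base URL to reach it.
func startDebugServer() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	// Serve in the background; the error is ignored in this sketch.
	go http.Serve(ln, nil)
	return "http://" + ln.Addr().String(), nil
}

// fetchCPUProfile (hypothetical helper) asks the pprof endpoint for a CPU
// profile covering the given number of seconds and returns the raw bytes.
func fetchCPUProfile(baseURL string, seconds int) ([]byte, error) {
	url := fmt.Sprintf("%s/debug/pprof/profile?seconds=%d", baseURL, seconds)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	baseURL, err := startDebugServer()
	if err != nil {
		panic(err)
	}
	// One second keeps the demo short; a real profile wants real load and a longer window.
	data, err := fetchCPUProfile(baseURL, 1)
	if err != nil {
		panic(err)
	}
	fmt.Printf("fetched %d bytes of CPU profile from %s\n", len(data), baseURL)
}
```

In a real service you only need the blank import and an HTTP listener; the fetch happens from outside while the server is under load, and the bytes you get back are what you save as profile.pgo.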

You probably want to get a few profiles and merge them using go tool pprof -proto a.out b.out > merged. This makes the optimisations more relevant to your overall system instead of just a single 30-second slice. Also, if you have long-running calls that are chained together, a 30-second snapshot might not be enough, so try a longer sampling window.

Runtime optimisation

This is where you expose the metrics from the runtime/metrics package and monitor them continuously. There's a list of metrics that you might be interested in, a recommended set of metrics, and generally, you are looking to optimise your interactions with the Go runtime. There are a few stats here: goroutine counts, goroutines waiting to run, heap size, how often garbage collection runs, how long garbage collection takes, etc. All of it is useful when optimising: when the garbage collector is busy, that's CPU time your program ain't getting. It's also useful for finding leaks; it becomes pretty obvious you are leaking goroutines when you graph the count and just watch it go up and never down. It's also just lowkey fun to look at the exposed data and understand what your system is doing.
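To make that a bit more concrete, here's a small sketch that reads a handful of values straight from the runtime/metrics package. readUint64 is a helper I made up for illustration, and how you ship the numbers to your monitoring system is left out entirely.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readUint64 (hypothetical helper) reads a single runtime metric by name.
// It reports false if the metric is unknown to this Go version or is not a
// plain uint64 (some metrics are floats or histograms).
func readUint64(name string) (uint64, bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return sample[0].Value.Uint64(), true
}

func main() {
	names := []string{
		"/sched/goroutines:goroutines",       // live goroutine count
		"/memory/classes/heap/objects:bytes", // memory occupied by heap objects
		"/gc/cycles/total:gc-cycles",         // completed GC cycles
	}
	for _, name := range names {
		if v, ok := readUint64(name); ok {
			fmt.Printf("%s = %d\n", name, v)
		} else {
			fmt.Printf("%s is not available as a uint64 here\n", name)
		}
	}
}
```

Sample these on a timer, graph them, and the goroutine count that only ever goes up becomes hard to miss.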