I've redesigned the blog software. The previous version read blog entries from a directory. The new version reads them from a memory-mapped database file which is updated by a separate process.
The previous version tried to do everything in one C process. Although C is theoretically great for performance, it's not a convenient language for text processing, and the text processing doesn't need great performance. At startup it processed the directory full of blog entries to extract the main text and metadata; at request time it mostly just needed to concatenate some of those text strings into the socket buffer. Moving the text processing out of the C process removed about 150 lines of C code.
C code: 700 lines approx (-150 lines)
Python code: +300 lines approx
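To make the split concrete, here is a minimal sketch of how a separate process might publish a blob that a server then memory-maps. The file layout (an entry count followed by offset/length pairs), the function names, and the publish-by-rename step are all my assumptions for illustration, not the blog's actual format; the real reader is the C process, and Python is used on both sides here only to keep the example short.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: a 4-byte entry count, then one (offset, length) pair
# of 4-byte little-endian integers per entry, then the entry bodies.
HEADER = struct.Struct("<I")
INDEX_ENTRY = struct.Struct("<II")


def build_blob(entries):
    """Pack a list of byte strings into a single blob with an offset index."""
    index_size = HEADER.size + INDEX_ENTRY.size * len(entries)
    index = bytearray(HEADER.pack(len(entries)))
    body = bytearray()
    for text in entries:
        # Offsets are absolute, measured from the start of the file.
        index += INDEX_ENTRY.pack(index_size + len(body), len(text))
        body += text
    return bytes(index + body)


def publish(path, blob):
    """Write the blob to a temp file, then rename it over the live file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())
    # Atomic on POSIX when both paths are on the same filesystem.
    os.replace(tmp_path, path)


def read_entry(path, n):
    """Map the file read-only and copy out entry n."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            (count,) = HEADER.unpack_from(m, 0)
            if not 0 <= n < count:
                raise IndexError(n)
            pos = HEADER.size + INDEX_ENTRY.size * n
            offset, length = INDEX_ENTRY.unpack_from(m, pos)
            return m[offset:offset + length]
```

With a rename-based publish, a reader that has already mapped the old file keeps seeing the old contents until it maps the file again, so the server would need some way to notice updates.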
I wondered if having all data in memory was faster than reading it from a file each time, so I stress-tested both the old and new code with Locust, running it on the same server machine. In both cases it reported around 1030 requests/sec, including 1.4 errors/sec. The nginx log showed the errors to be failures to connect to the backend process because the socket accept queue was full; after increasing the queue size, the throughput was the same but without any errors.
1000 requests/sec is obviously plenty for a blog nobody cares about, and for Hetzner's cheapest VPS, but it should be possible to achieve a better high score, especially as the load testing tool used the majority of the server's CPU. The only error occurring on the server during the stress test was 502 Bad Gateway, resulting from nginx failing to connect to the backend process because the socket listen queue was full. After increasing the backlog from 10 to 30, the request rate was about the same but without any error responses. I suppose the bottleneck is still something about the SCGI interface, or it's being masked by Locust using all the CPU. Maybe next time I'll try writing an in-process nginx module.
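For reference, the queue size in question is the backlog argument passed to listen(). The sketch below shows the equivalent call in Python; the real backend is written in C, and the function name, address, and TCP socket type here are assumptions for illustration only.

```python
import socket

LISTEN_BACKLOG = 30  # the value the post settled on, up from 10


def make_listener(host="127.0.0.1", port=0, backlog=LISTEN_BACKLOG):
    """Create a TCP listening socket with an explicit accept-queue size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 asks the OS for any free port
    # On Linux the kernel caps this value at net.core.somaxconn.
    sock.listen(backlog)
    return sock
```

A longer queue only absorbs bursts of connections; it doesn't make the backend accept them any faster, which fits the unchanged request rate.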
Since all data is loaded from a memory-mapped blob, there's a real risk of buffer over-reads, that is, overruns that read past the end of a buffer without writing anything. No matter, since all data in the blob is public. In fact, no private information is found anywhere in the blog program's address space: not SSL keys, since SSL termination is handled by nginx, and not even the previous request's cookies, since the request buffer is zeroed between requests.
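The buffer-zeroing idea can be sketched as follows. This is an illustration in Python, not the blog's code: the buffer size, the names, and the stand-in "parsing" are all hypothetical, and in the C server the same effect would come from something like a memset over the request buffer.

```python
REQUEST_BUFFER_SIZE = 4096  # hypothetical size

# One buffer reused for every request, as a C server might do.
request_buffer = bytearray(REQUEST_BUFFER_SIZE)


def handle_request(raw):
    """Copy one request into the shared buffer, use it, then wipe it."""
    if len(raw) > REQUEST_BUFFER_SIZE:
        raise ValueError("request too large")
    request_buffer[:len(raw)] = raw
    try:
        # Stand-in for real parsing: return the first line of the request.
        return bytes(request_buffer[:len(raw)]).split(b"\n", 1)[0]
    finally:
        # Zero the whole buffer so nothing from this request survives,
        # even if parsing raised an exception.
        request_buffer[:] = bytes(REQUEST_BUFFER_SIZE)
```

Wiping in a finally block means a later over-read into this buffer can only ever find zeros or the current request's own bytes.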