How I made a Drupal site fast enough, one problem at a time

Tools

Here are the tools I used to diagnose speed problems: top, vmstat, apachebench (ab), mytop, phpinfo, apc.php, drush, and Chrome Developer Tools.

Thrashing

Symptom
The site slowed down or crashed frequently.
The ssh command-line session was slow too.
Diagnosis
top showed a lot of httpd processes
vmstat 5 showed lots of swapping in and out
Simulating ten simultaneous anonymous users with apachebench, e.g.
ab -c 10 http://my-site.com/
reliably caused the site to slow down and/or crash
Fix
Reduced MaxClients to 5 (the number of httpd processes that fit in this VM's RAM)
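The MaxClients figure above comes down to simple arithmetic: the RAM left over after everything else, divided by the size of one httpd process. Here is a minimal sketch of that calculation. The function name estimate_max_clients and the example numbers are made up for illustration; they are not measurements from this site.

```shell
# Hypothetical helper: estimate how many httpd processes fit in RAM.
#   $1 = total RAM in MB
#   $2 = RAM to reserve for MySQL, the OS, etc. in MB (a guess you supply)
#   $3 = resident size of one httpd process in MB (e.g. top's RES column)
estimate_max_clients() {
    total_mb=$1
    reserved_mb=$2
    per_proc_mb=$3
    # Integer division rounds down, which errs on the safe side.
    echo $(( (total_mb - reserved_mb) / per_proc_mb ))
}

# Example with made-up numbers: 512 MB VM, 192 MB reserved, 60 MB per httpd.
estimate_max_clients 512 192 60   # prints 5
```

Rounding down is deliberate: one httpd too many is what starts the swapping that vmstat showed.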

Bad interaction between calendar and boost

Symptom
It took two minutes to save new events in the Drupal GUI
Diagnosis
Turned off modules one at a time and measured the time to save an event after each one. The bad module was calendar_ical... but I couldn't leave that disabled.
Ran mytop and noticed one of the slow queries had the word Boost in it.
Fix
Turned off Boost, which evidently interacts poorly with calendar_ical when you have mini-calendars on each page.

Site sluggish for logged-in users (no PHP caching)

Symptom
It took four seconds to load home page when logged in to Drupal
With Drupal page cache off, ran apachebench to measure time to fetch the bare front page, e.g.
ab -n 10 http://my-site.com/
Median result was well over 1000 milliseconds, far too slow.
Diagnosis
Created a phpinfo page; loading that in a browser showed APC was not present
Fix
Installed APC
(On my shared host, this meant sending support an email to request it.
On my dev site, this meant sudo apt-get install php-apc.)
Created an apc.php page
Visited several pages of my site while logged in
Opened apc.php in a browser and checked the APC cache
On my shared host, noticed the APC cache was completely empty. This was because I was running an old PHP; the workaround was to tweak bootstrap.inc per PHP bug 59541.
On my dev site, noticed the APC cache was too full; fixed by raising size in apc.ini from default of 32MB to 64MB
Verified that visiting many pages and then looking at apc.php showed good hit rate
Re-measured fetch time; it was now 20% lower. Still not great, but better than it was.
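The before-and-after measurements in this section can be scripted rather than read off the screen. This is a rough sketch, assuming ab's usual report layout, where the percentile table has a row starting with "50%". The helper names ab_median_ms and percent_lower are my own, not part of ab.

```shell
# Hypothetical helper: pull the median (50th percentile) response time, in
# milliseconds, out of an ab report read from stdin.
ab_median_ms() {
    # ab prints a percentile table whose first row looks like "  50%   1056".
    awk '$1 == "50%" { print $2; exit }'
}

# Hypothetical helper: by what whole percent is $2 lower than $1?
percent_lower() {
    before=$1
    after=$2
    echo $(( (before - after) * 100 / before ))
}

# Usage (same URL as in the examples above):
#   before=$(ab -n 10 http://my-site.com/ | ab_median_ms)
#   ... apply the fix ...
#   after=$(ab -n 10 http://my-site.com/ | ab_median_ms)
#   percent_lower "$before" "$after"
```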

Site still sluggish for logged-in users (no MySQL caching)

Symptom
It took three seconds to load home page when logged in to Drupal
Logged into the site as a non-admin user, got the SESSxxxx cookie name and value from my browser, then used apachebench to measure the time to fetch the bare front page as a logged-in user, e.g.
ab -n 10 -c 1 -C SESS7b58937bb79477baad3c40daa99acd1b=aq87brml3780szkxrwck9tjb43 http://my-site.com/
Median result was well over 1000 milliseconds, far too slow.
Diagnosis
Got a MySQL prompt using 'drush sql cli' or 'drush sqlc'
Checked whether MySQL query caching was available with the command "show variables like 'have_query_cache';" (it was)
Checked whether it was actually turned on with the command "show variables like 'query%';". query_cache_type was not ON, so the cache was not active.
Fix
Turned on the MySQL query cache
(On my dev site, this meant changing my.cnf to set query_cache_type=1.
On my shared host, I was not allowed to change that globally, so I edited includes/database.mysql.inc to force it on.)
Confirmed that hit rate was nonzero with query "SHOW STATUS LIKE 'Qc%';"
Re-measured fetch time; it was now 20% lower. Still not great, but better than it was.
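"Nonzero" is a low bar, so it can help to turn the Qcache counters into a rough hit rate. This sketch assumes the status output arrives as one name and value per line, as the mysql client prints it in batch mode. It approximates the total number of SELECTs as hits + inserts + not_cached. The function name qcache_hit_rate is hypothetical.

```shell
# Hypothetical helper: compute an approximate query cache hit rate (whole
# percent) from "name value" lines such as SHOW STATUS LIKE 'Qc%' produces.
qcache_hit_rate() {
    awk '
        $1 == "Qcache_hits"       { hits = $2 }
        $1 == "Qcache_inserts"    { inserts = $2 }
        $1 == "Qcache_not_cached" { not_cached = $2 }
        END {
            # Approximation: every SELECT is a hit, an insert, or uncacheable.
            total = hits + inserts + not_cached
            if (total == 0) { print 0; exit }
            printf "%d\n", 100 * hits / total
        }
    '
}

# Usage (assumes drush sqlq prints the rows as plain name/value lines):
#   drush sqlq "SHOW STATUS LIKE 'Qc%'" | qcache_hit_rate
```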

Site still sluggish for logged-in users (no page caching)

Symptom
It took two seconds to load home page when logged in to Drupal
In Chrome, opened the site and logged in as a normal user, then started Chrome Developer Tools to see how long the page took to load. Time to load just the main page HTML was well over 1000 milliseconds, too slow.
Diagnosis
None needed. Clearly, no page cache was in use. This isn't surprising: it's hard to cache the pages served to logged-in users, and doing so doesn't make sense on every site.
Fix
Learned about authcache, realized it was a good match for this site, installed it (without memcache, even) with the commands
drush dl authcache-6.x-1.x-dev
drush enable authcache
Configured authcache
Enabled authcache debug, and verified that it showed a speedup in cache_render after the first load of a page.
I then extracted a list of frequently fetched URLs from the apache log and saved it into a file urls.txt, then wrote a script that warmed up the cache (see warm-cache.sh). I now run that script right after cron.
Verified that page loads are snappy for logged-in users, at least for the URLs warmed up by warm-cache.sh.
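For illustration, here is roughly what the URL extraction and cache warming could look like. This is a sketch under assumptions, not the actual warm-cache.sh. It assumes the Apache log is in common or combined format, that curl is installed, and that you pass in a logged-in session cookie. The function names top_urls and warm_urls, and the log path in the usage comment, are hypothetical.

```shell
# Hypothetical helper: print the N most-requested GET paths from an Apache
# access log (common/combined format) on stdin, most frequent first.
top_urls() {
    n=$1
    # Field 6 is the quoted method ("GET), field 7 is the request path.
    awk '$6 == "\"GET" { print $7 }' |
        sort | uniq -c | sort -rn | head -n "$n" | awk '{ print $2 }'
}

# Hypothetical helper: fetch each path listed in a file once, as a
# logged-in user.
#   $1 = base URL, $2 = session cookie as NAME=VALUE, $3 = file of paths
warm_urls() {
    base=$1
    cookie=$2
    file=$3
    while IFS= read -r url_path; do
        # Discard the body; the point is only to make the site render the page.
        curl -s -o /dev/null -b "$cookie" "$base$url_path"
    done < "$file"
}

# Usage (log path and cookie are placeholders):
#   top_urls 50 < /var/log/apache2/access.log > urls.txt
#   warm_urls http://my-site.com SESSxxxx=yyyy urls.txt
```

Only GET requests are counted, since replaying POSTs from a log would resubmit forms.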