Why is my Drupal site so slow?
There are a bazillion reasons why Drupal is slow, but the main one is
that you weren't paying attention. You have to
measure your site's performance every day during development
so you can tell what made it slow, and undo it.
See also
The page "How I made a Drupal site fast enough, one
problem at a time" is a more stripped-down writeup of this same info.
How I'm attacking the problem
Since I know next to nothing about Drupal, I started off small
by writing a script to clone drupal from git, install it, populate
a site with fake data, run a benchmark on it, back it up, restore it,
or remove everything.
You can see the draft script at busydrupal.sh.txt.
After two days writing and playing with that, I felt confident enough
to bring up the site in question on my local workstation.
Having a private version of the site gave me the freedom to try things.
While debugging "calendar event creation taking two minutes", here's what I tried:
- Turning off drupal logging
- Mounting / with noatime,nodiratime
- Mounting / with data=writeback
- Turning off nearly every module, noticing it was fast but broken, then
  turning them on one at a time until it slowed down... the first module
  that made a big difference was calendar_ical.
- Running mytop (easy to use remotely; it just needs a MySQL connection), which
  showed the Boost cache query taking a long time. Disabling the boost module
  made a big difference (evidently caching isn't always a good idea?).
Things I've run into and fixed
These need to be checked periodically, as the fixes don't always last.
- Slow content creation - creating a calendar event took two minutes (!).
After disabling modules calendar_ical, boost, dblog, and performance,
time to create a calendar event dropped to three seconds.
- Thrashing - httpd takes 80MB per session, and we had 30 sessions, all bots;
the site would occasionally just shudder to a halt from lack of memory.
Watching the active processes with 'top' or the paging activity with 'vmstat 5'
while doing ten concurrent fetches using ab showed what was going on.
Reducing
MaxClients to the absurdly low value of 5 was an ok workaround,
since there aren't that many valid concurrent users at the site in question.
This was absolutely vital, at least until we figure out why the site is such a pig.
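The arithmetic behind that workaround is worth spelling out: 5 children at ~80MB each is ~400MB worst case, versus roughly 20GB at the stock prefork default of 256. A sketch of the httpd.conf fragment (Apache 2.2-era prefork directive names):

```apache
# Cap concurrent Apache children so 5 x ~80MB stays in RAM.
<IfModule prefork.c>
    StartServers  2
    ServerLimit   5
    MaxClients    5
</IfModule>
```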
- Bots - The access logs showed very frequent accesses by bots,
so in robots.txt, we raised Crawl-delay to 60 seconds, and added
Disallow: lines for the Calendar area of our site.
(Unfortunately, some but not all big-name bots still hit us
once every five seconds, so robots.txt isn't a perfect instrument.)
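For reference, the robots.txt stanza involved is only a few lines (the /calendar path here is an assumption; use whatever prefix your calendar pages actually live under):

```
User-agent: *
Crawl-delay: 60
Disallow: /calendar
```

Note that Crawl-delay is a nonstandard extension, which is part of why some crawlers ignore it.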
- Spammers - We added 'deny from' rules in .htaccess
for badly behaved bots... and even block entire class C or even class B networks
when multiple miscreants share similar addresses.
(There
are probably better ways.) Here's how we looked for them; this should probably
be done in crontab.
- The command
tail -f www_logs/access_log
shows current activity. Any bot that hits you faster than robots.txt allows
is a candidate for blocking. (But, alas, Google misbehaves here, and they can get
away with it.)
- The line
grep register www_logs/access_log | awk '{print $2}' | sort | uniq -c | sort -nr | awk '$1 > 7 {print $2}' | sort -n
shows spambots which repeatedly try to register fake accounts.
- The line
grep 'POST /.q=user' www_logs/access_log | awk '{print $2}' | sort | uniq -c | sort -nr | awk '$1 > 7 {print $2}' | sort -n
shows attackers repeatedly trying to log in.
Oddly, after a few rounds of this, we didn't have many spam registrations.
Evidently the bad guys are concentrated in a smallish number of netblocks.
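The two pipelines above can be folded into a cron-able helper that emits ready-to-paste .htaccess lines. A sketch, assuming (like the pipelines) that the client IP is field 2 of the access log; the >7 threshold is arbitrary:

```shell
# Print a "deny from IP" line for every IP with more than 7 hits
# on the registration page in the given access log.
registration_deny_lines() {
  grep register "$1" \
    | awk '{print $2}' | sort | uniq -c \
    | awk '$1 > 7 {print "deny from " $2}'
}
```

Eyeball the output before appending it to .htaccess; blindly automating the append risks locking out legitimate users behind a shared proxy.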
- Overloaded MySQL server - The site was on a shared MySQL database server that was
overloaded, as our hosting provider noticed. They moved us to a less busy server,
which helped a bit.
- Doppelgangers - Site had many aliases (foo.org, foo.net, foo.com, foo.bat, foo.exe),
and some bots were indexing more than one of them. Reduced bot traffic
slightly by adding the lines
RewriteCond %{HTTP_HOST} !^www\.foo\.org [NC]
RewriteRule ^(.*)$ http://www.foo.org/$1 [L,R=301]
right after the line "RewriteEngine on". (Putting them after Boost's rewrite
rules was definitely not the right place :-)
- APC didn't actually cache anything... turns out my web hosting
provider installed a version of APC that suffers from
PHP bug 59541,
and I had to work around that by modifying drupal_load() in
includes/bootstrap.inc to use an absolute path, e.g.:
- include_once "./$filename";
+ include_once "/usr/home/uuccsm/public_html/$filename";
This shaved 300 ms off apachebench.
- MySQL query cache was set to DEMAND rather than ON, and
my hosting provider doesn't let me modify my.cnf... worked around that
by modifying db_connect() in includes/database.mysql.inc to force it on, e.g.
+ mysql_query('SET SESSION query_cache_type=1;', $connection);
+
return $connection;
This shaved 600 ms off apachebench.
Remaining problems
- Non-anonymous users still take 2300 ms to fetch a random page, even
without loading the images it refers to
- Loading a random page still requires 100 HTTP transfers
- All pages are customized with the user's login name, so when
you're logged in, no pages are cached, even if they are otherwise
identical for all users and anonymous users