A client called me several times complaining random slowness of his site which is running a fairly old Fedora 2 with PHP 4 and MySQL 3.
The problem he had is when he access a particular page, sometime, it goes super fast, sometime, it could take over 5 minutes or more to get the result back. There're only around less than 50 people using the site at the same time. Challenge is there's no fix pattern to replicate the issue. And I personally played with the page for over 20 minutes, and never was able to replicate the issue.
With lots of frustration, I started my journey to resolve the issue.
1. My friend google.com
Searched keywords like: slow php mysql linux
Found some hints on how to optimize mySQL and PHP. Nothing really help. However, I did find a debug tool to log the slow queries.
2. Log slow queries
Then I changed my.cnf and added:
set-variable = long_query_time=3
log-slow-queries=/var/log/mysql/slow-query
Please be advised, if you do not put
set-variable =
in front of long_query_time=3, it will not work, and mySQL will not start.
So I got a nice report of slow queries. From there, I found out some queries without sufficient index. After adding indices, I'm sure lots of queries got improved, however, the particular issue is still there. The big random slowness still happen.
4. mytop
Found a nice tool to monitor mySQL:
http://jeremy.zawodny.com/mysql/mytop/
Installation was easy. It requires PERL and some extra modlues. CPAN did a nice work to help me install them, although it took around half hours to do the upgrading, downloading and installation. Good things is I got lots of things to do while waiting.
In order to make it easy to run mytop, need to create a config file:
~/.mytop
user=<your user name here, I used admin>
pass=<your password here>
host=localhost
db=<your DB name here>
delay=5
port=3306
socket=
batchmode=0
header=1
color=1
1dle=1
Run mytop helped me verified my optimization of queries were working. Again, the issue is still there.
3. Check error log of PHP
Went to check error log of Apache/PHP. WOW! the error log file and access log reached over 600M each!!!
That might explain why it goes so slow. When multiple requests sent to server, to handle each request, server need to log something. To append text at end of a 600M file cannot be fast, not to mention having multiple threads/process racing for the change.
So backed up the log, restarted apache. Looks like things are getting better!
4. Restart Server
Some how found the server has over 2800 tasks. Most of them are related to rpm, awk, etc.
Something like below:
4 0 13076 12539 16 0 2456 264 pipe_w S ? 0:00 awk -v progname=/etc/cron.daily/rpm progname {????? print progname ":\n"????? progname="";???? }???? { print; }
4 0 13077 13075 16 0 6280 432 - S ? 0:00 /usr/lib/rpm/rpmq -q --all --qf %{name}-%{version}-%{release}.%{arch}.rpm\n
4 0 13078 13075 18 0 28856 336 pipe_w S ? 0:00 sort
5 0 19864 1 15 0 10708 1340 pipe_w S ? 0:00 /usr/libexec/webmin/shell/index.cgi
Not sure what they are. So I decided to reboot server. Task number dropped to less than 100.
After the journey, server is working OK so far. No report of slowness yet. But I will give it a couple of more days to confirm.