What to do with a down Magento site

1. Application level logs – First place to look.

If you are seeing the very-default-looking Magento page saying “There has been an error processing your request”, then look in here:

ls -lart <DOCROOT>/var/report/ | tail

The stack trace will be in the latest file (there might be a lot), and should highlight what broke.
Maybe the error was in a database library, or a Redis library…see next step if that’s the case.

General errors, often non-fatal, are in <DOCROOT>/var/log/exception.log
Other module-specific logs will be in the same log/ directory, for example SagePay.

NB: check /tmp/magento/var/ .
If the directories in the DocumentRoot are not writable (or weren’t in the past), Magento will use /tmp/magento/var and you’ll find the logs/reports/cache in there.

2. Backend services – Magento fails hard if something is inaccessible

First, find the local.xml. It should be under <DOCROOT>/app/etc/local.xml or possibly a subdirectory like <DOCROOT>/store/app/etc/local.xml

From that, take note of the database credentials, the <session_save>, and the <cache><backend>. If there’s no <cache> section, then you are using filesystem so it won’t be memcache or redis.

– Can you connect to that database from this server? authenticate? or is it at max-connections?
– To test memcache, “telnet host 11211” and type “STATS“.
– To test Redis, “telnet host 6379” and type “INFO”.
You could also use:

redis-cli -s /tmp/redis.sock -a PasswordIfThereIsOne info

 

If you can’t connect to those from the web server, check that the relevant services are started, pay close attention to the port numbers, and make sure any firewalls allow the connection.
If the memcache/redis info shows evictions > 0, then it’s probably filled up at some point and restarting that service might get you out of the water.

ls -la /etc/init.d/mem* /etc/init.d/redis*

3. Check the normal places – sometimes it’s nothing to do with Magento!

  • – PHP-FPM logs – good place for PHP fatal errors. usually in var/log/php[5]-fpm/

– Apache or nginx logs
– Is Apache just at MaxClients?
– PHP-FPM max_children?

ps aux | grep fpm | grep -v root | awk '{print $11, $12, $13}' | sort | uniq -c

– Is your error really just a timeout, because the server’s too busy?
– Did OOM-killer break something?

grep oom /var/log/messages /var/log/kern.log

– Has a developer been caught out by apc.stat=0 (or opcache.validate_timestamp=0) ?

 

Credits: https://willparsons.tech/