Thursday 2 July 2015

mysqld crashes with trap divide error

Last night I started getting alerts from applications that were unable to connect to a MySQL server.  I checked the logs and saw the following sequence repeating every minute:

Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5569]: Upgrading MySQL tables if necessary.
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5572]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5572]: Looking for 'mysql' as: /usr/bin/mysql
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5572]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5572]: This installation of MySQL is already upgraded to 5.5.24, use --force if you still need to run mysql_upgrade
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5583]: Checking for insecure root accounts.
Jul  2 08:52:56 mysqlcluster1 /etc/mysql/debian-start[5588]: Triggering myisam-recover for all MyISAM table
Jul  2 08:53:37 mysqlcluster1 kernel: [8198166.809336] mysqld[5562] trap divide error ip:7f1a23a9ae2e sp:7f1a23261400 error:0 in mysqld[7f1a233cf000+a8a000]
Jul  2 08:53:37 mysqlcluster1 kernel: [8198167.130770] init: mysql main process (5155) killed by FPE signal
Jul  2 08:53:37 mysqlcluster1 kernel: [8198167.130836] init: mysql main process ended, respawning
Jul  2 08:53:37 mysqlcluster1 kernel: [8198167.147935] type=1400 audit(1435823617.958:44): apparmor="STATUS" operation="profile_replace" name="/usr/sbin/mysqld" pid=6160 comm="apparmor_parser"

mysqld was clearly crashing: the kernel reported a divide error and init respawned the process after it was killed by an FPE signal.  Today is 2nd July, and one of our applications pulls nightly data into a table with over 2 billion rows that is partitioned by month; other processes then access and process that data later on.  I remembered some weirdness in the past when we had forgotten to add more partitions, so I logged into the application server and shut down every process that might be trying to select from or insert into that table.  The database server then stayed up and didn't crash.  Excellent, that had to be the issue.  Sure enough, now that mysqld was up and I could access the database, I found that the table only had partition definitions up to the end of June.  I added a partition for July and restarted all of the application processes.
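
Since the root cause was a forgotten monthly partition, the statement that adds it could be generated by a script rather than typed by hand.  What follows is only a minimal sketch in Python, not the exact statement I ran: the table name nightly_data, the RANGE expression on TO_DAYS() of a date column, and the pYYYYMM partition naming are all assumptions made for illustration.

```python
from datetime import date


def first_of_next_month(d: date) -> date:
    """Return the first day of the month after d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def add_partition_sql(table: str, month: date) -> str:
    """Build an ALTER TABLE statement adding a RANGE partition for `month`.

    Assumption (hypothetical, not taken from the real schema): the table is
    partitioned by RANGE (TO_DAYS(<date column>)) and partitions are named
    pYYYYMM, each ending at the first day of the following month.
    """
    upper = first_of_next_month(month)
    name = f"p{month.year:04d}{month.month:02d}"
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION {name} VALUES LESS THAN (TO_DAYS('{upper.isoformat()}')));"
    )


if __name__ == "__main__":
    # Prints the statement for the month containing 2 July 2015.
    print(add_partition_sql("nightly_data", date(2015, 7, 2)))
```

Run from cron a few days before month end, with the date of the coming month, a script like this would keep the partitions ahead of the data.  Note that ADD PARTITION only works when the table has no catch-all MAXVALUE partition; if it has one, that partition has to be split with REORGANIZE PARTITION instead.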

For reference, the server is running the following versions:
guy@mysqlcluster1:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.5 LTS
Release: 12.04
Codename: precise
guy@mysqlcluster1:~$ mysqld --version
mysqld  Ver 5.5.24-0ubuntu0.12.04.1 for debian-linux-gnu on x86_64 ((Ubuntu))