Comments on: How NOT to test that mysqld is alive https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive Blog by Shlomi Noach Wed, 13 Apr 2011 01:07:01 +0000

By: Ariel https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-36793 Wed, 13 Apr 2011 01:07:01 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-36793 Thanks, Shlomi. I’m just concerned that if MySQL is truly stuck, the query won’t exit or time out before the next time it’s supposed to run, causing a cascading effect.

Worse yet, I’m concerned about generating a false positive that would trigger a needless failover. I’m working on a monit-based solution that would hopefully avoid those situations.

Again, thanks for your suggestion.
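
One common way to reduce the false-positive risk raised above is to act only after several consecutive failed checks rather than on a single one. The following is a minimal sketch of that idea, not something proposed in this thread; the class name `FailureDebouncer` and the threshold of 3 are illustrative assumptions.

```python
class FailureDebouncer:
    """Signal a failover only after `threshold` consecutive failed checks.

    Hypothetical helper for illustration; the threshold is an assumption
    and should be tuned to the check interval and tolerated downtime.
    """

    def __init__(self, threshold=3):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, check_ok):
        """Record one check result; return True when failover should trigger."""
        if check_ok:
            # Any successful check resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

A single slow or failed check then costs nothing; the price is that a real outage is acted on only after `threshold` check intervals have passed.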

By: shlomi https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-36787 Wed, 13 Apr 2011 00:15:02 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-36787 @Ariel,

A possible heuristic would be to connect and issue a SHOW PROCESSLIST (this adds no significant overhead). From the number of running threads and the time they have been waiting, or, if the InnoDB Plugin is enabled, from the number of running/locked transactions, you may deduce that things are not going well.
You may choose to monitor the above for 5 seconds and detect that no thread/transaction has completed since the previous sample; this may indicate a “stuck mysql”.

This is just off the top of my head; I haven’t given it serious thought yet.
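
The sampling heuristic above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested monitoring rule: each sample is assumed to be a dict mapping a thread `Id` to its `Time` value (in seconds) from SHOW PROCESSLIST, restricted to threads that are not in `Sleep` and excluding the monitor's own connection. The function name `looks_stuck` is hypothetical, and fetching the samples is left to the caller.

```python
def looks_stuck(previous, current):
    """Return True if no active thread from `previous` has completed.

    `previous` and `current` are dicts of {thread_id: time_seconds},
    taken a few seconds apart (assumed layout, see lead-in).
    """
    if not previous:
        # Nothing was running in the first sample, so nothing can be stuck.
        return False
    for thread_id, prev_time in previous.items():
        cur_time = current.get(thread_id)
        if cur_time is None:
            # The thread finished or went idle between samples.
            return False
        if cur_time < prev_time:
            # Time went backwards: the thread moved on to new work.
            return False
    # Every previously active thread is still busy with the same work.
    return True
```

A long-running but healthy workload (a few slow reports, say) would also satisfy this test, so a result of `True` is a hint to look closer rather than proof that the server is stuck.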

By: Ariel https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-36785 Tue, 12 Apr 2011 23:33:36 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-36785 Suppose someone is interested in monitoring (the unlikely event of) whether MySQL is frozen, overloaded, unresponsive, etc., but it hasn’t crashed, i.e., the pid and sock files exist and the mysql process is listed by ps. In this scenario, whatever method determines the database to be frozen will trigger a DRBD or cluster failover script, which this scenario assumes is already in place. (That, btw, is the reason Nagios can’t help here.)

Now, how would you handle that?

It seems to me that logging in, querying, or doing anything in the database risks complicating things further. What if the query/check doesn’t complete in time? I define “in time” here to mean before the next check, since I assume this check will be running on a cron or monit job at a fairly short interval to justify the failover, but you can choose any definition.
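
The “in time” constraint described here can be enforced from outside the database by running the check as a child process with a hard deadline shorter than the check interval. A minimal sketch, assuming the check is an external command that exits 0 on success; the function name `run_check` and the `safety_margin` of half the interval are illustrative choices, not anything from this thread.

```python
import subprocess
import sys


def run_check(command, interval_seconds, safety_margin=0.5):
    """Run `command` with a deadline; return 'ok', 'failed', or 'timeout'.

    The deadline is a fraction of the check interval (an assumption), so a
    hung check is killed before the next one is due to start.
    """
    deadline = interval_seconds * safety_margin
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=deadline,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in command for demonstration; a real check would invoke a client.
    print(run_check([sys.executable, "-c", "pass"], interval_seconds=10))
```

Reporting `timeout` separately from `failed` lets the failover logic treat "no answer" differently from "wrong answer". One caveat: a child blocked in uninterruptible I/O may not die promptly when killed, so this bounds the common case rather than every case.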

My contention here is that there really is no way to meaningfully check on the internal status of MySQL from the host it’s running on when an automated failover event depends on the outcome. Leaving aside the question of why anyone would want to be in this scenario (believe me, I already asked the question, and the answer is self-evident), I’d like to know of any possible solutions. All the comments I’ve read here and elsewhere lead me to believe there aren’t any.

By: Vitlalie Cherpec https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-9064 Wed, 06 Jan 2010 11:43:37 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-9064 I’m using Nagios (http://www.nagios.org/) to monitor MySQL servers. Reinventing the wheel is too error-prone; just stick with stable and tested solutions. If Nagios itself is too complex, Nagios plugins can be used from scripts:

Usage help for check_mysql plugin:

[vitalie@shark ~]$ /usr/lib64/nagios/plugins/check_mysql --help

check_mysql v2034 (nagios-plugins 1.4.13)
Copyright (c) 1999-2007 Nagios Plugin Development Team

This program tests connections to a mysql server

Usage: check_mysql [-d database] [-H host] [-P port] [-s socket]
[-u user] [-p password] [-S]

[…]

By: ajd4096 https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4664 Thu, 08 Oct 2009 06:23:44 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4664 /proc is not deprecated (nor depreciated) on HP-UX, it never had it, AFAIK.

FreeBSD/OpenBSD/NetBSD have a procfs, but it is not mounted by default.

(The main reason for avoiding procfs is that reading from /proc in user-space suffers from TOCTOU, whereas a system call does not. This issue was raised yet again on the OpenBSD misc@ mailing list only a few weeks ago.)
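
For comparison, asking the kernel directly whether a pid exists takes a single system call and needs no procfs at all. The sketch below uses `kill` with signal 0 on a Unix-like system; the function name `process_exists` is hypothetical. The answer can still be stale by the time the caller acts on it, so this only removes the multi-step file access, not every race.

```python
import os


def process_exists(pid):
    """Return True if a process with this pid exists (Unix-like systems).

    Signal 0 performs the existence and permission checks without
    delivering any signal.
    """
    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```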

There is a procfs for OSX, it’s MacFUSE-based, not from Apple, and not even installed by default.

Obviously there is a sizeable percentage of unix systems where a script referring to /proc will not work.

Now, about the ubiquity of that “bash” shell….

By: Zach https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4646 Wed, 07 Oct 2009 13:33:21 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4646 @FredV

you are totally incorrect, /proc is not in every Unix… /proc has been depreciated in older SysV style OS’es… like ..wait for it… HPUX!

By: FredV https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4637 Wed, 07 Oct 2009 07:29:07 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4637 “/proc is not a standard Unix feature, nor is bash.”

that’s why every unix has it?

By: ajd4096 https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4633 Wed, 07 Oct 2009 03:46:32 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4633 @FredV
/proc is not a standard Unix feature, nor is bash.

There is more to unix than the linux distro du-jour.

By: FredV https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4615 Tue, 06 Oct 2009 18:57:14 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4615 Adam: using /proc (a standard Unix feature) and bash-compatible shell script is not portable and brittle? I disagree. As for extensible: who on earth would want to build something on top of this and expand on it? It’s a quick hack anyway; normally you shouldn’t have to do stuff like this.

Btw, using /proc is *not* as stupid as grepping ps output without even using awk to split it into columns. A process can change how it’s displayed in ps by changing argv[0], so the /proc solution I gave is actually the only one that will always work, and it is not brittle as you say.
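
The earlier /proc solution referred to here is not reproduced on this page. As a separate illustration of the argv[0] point, a Linux-only sketch can compare the `exe` symlink under /proc, which points at the binary actually running and is not affected by a process rewriting argv[0]. The function name `pid_runs_executable` and the `proc_root` parameter are assumptions made for this example.

```python
import os


def pid_runs_executable(pid, expected_name, proc_root="/proc"):
    """Return True if /proc/<pid>/exe points at a binary named `expected_name`.

    Linux-specific. Returns False when the process is gone or the link is
    unreadable (for example, a process owned by another user).
    """
    exe_link = os.path.join(proc_root, str(pid), "exe")
    try:
        target = os.readlink(exe_link)
    except OSError:
        # The process may have exited between lookup and read.
        return False
    # Linux appends " (deleted)" when the binary was replaced on disk.
    suffix = " (deleted)"
    if target.endswith(suffix):
        target = target[: -len(suffix)]
    return os.path.basename(target) == expected_name
```

For example, `pid_runs_executable(pid, "mysqld")` with a pid read from the pid file. As the replies point out, this only works where a Linux-style /proc is mounted.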

By: Adam Nelson https://shlomi-noach.github.io/blog/mysql/how-not-to-test-that-mysqld-is-alive/comment-page-2#comment-4607 Tue, 06 Oct 2009 15:48:42 +0000 https://shlomi-noach.github.io/blog/?p=1355#comment-4607 I’m a little disturbed by all the people still posting ps and proc options.

It’s fun that these tricks can be done, but any of that type of custom scripting is not extensible, rarely portable, and highly brittle under a panoply of eventualities.

I know there are systems that have been up for 10 years on this kind of stuff, but it’s not responsible for such band-aids to be allowed to be permanent (or for SysAdmins to think that those are real, proper solutions).
