{"id":2463,"date":"2010-06-04T11:17:27","date_gmt":"2010-06-04T09:17:27","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=2463"},"modified":"2010-06-04T11:17:58","modified_gmt":"2010-06-04T09:17:58","slug":"mycheckpoint-rev-132-custom-monitoring-custom-charts-process-list-dump","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/mycheckpoint-rev-132-custom-monitoring-custom-charts-process-list-dump","title":{"rendered":"mycheckpoint (rev. 132): custom monitoring, custom charts, process list dump"},"content":{"rendered":"<p>Revision <strong>132<\/strong> of <a href=\"..\/..\/forge\/mycheckpoint\">mycheckpoint<\/a> has been released. New and updated in this revision:<\/p>\n<ul>\n<li>Custom monitoring: monitoring &amp; charting for user defined queries<\/li>\n<li>HTML reports for custom monitoring<\/li>\n<li>Process list dump upon alert notifications<\/li>\n<\/ul>\n<h4>Custom monitoring &amp; charts<\/h4>\n<p>Custom monitoring allows the user to supply with a query, the results of which will be monitored.<\/p>\n<p>That is, <em>mycheckpoint<\/em> monitors the status variables, replication status, OS metrics. But it cannot by itself monitor one&#8217;s <em>application<\/em>. Which is why a user may supply with such query as:<\/p>\n<blockquote>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nSELECT COUNT(*) FROM shopping_cart WHERE is_pending=1\r\n<\/pre>\n<\/blockquote>\n<p>Such a query will tell an online store how many customers are in the midst of shopping. There is no argument that this number is worth monitoring for. Given the above query, <em>mycheckpoint<\/em> will execute it per sample, and store the query&#8217;s result along with all sampled data, to be then aggregated by complex views to answer for:<\/p>\n<ul>\n<li>What was the value per given sample?<\/li>\n<li>What is the value difference for each sample?<\/li>\n<li>What is the change per second, i.e. the rate?<\/li>\n<\/ul>\n<p>mycheckpoint goes one step forward, and explicity records another metric:<\/p>\n<ul>\n<li>How much time did it take to take that sample?<\/li>\n<\/ul>\n<p><!--more-->As another example, a query worth testing for rate:<\/p>\n<blockquote>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nSELECT MAX(shopping_cart_id) FROM shopping_cart\r\n<\/pre>\n<\/blockquote>\n<p>The above will provide with the last id. Assuming this is <strong>AUTO_INCREMENT<\/strong>, and assuming we&#8217;re on <strong>auto_increment_increment=1<\/strong>, two samples will allow us to get the number of created carts between those samples. Now, here&#8217;s a metric I&#8217;d like to read:<\/p>\n<ul>\n<li>How many carts are created per second, for each hour of the day?<\/li>\n<\/ul>\n<p>We get all these for free with mycheckpoint, which already does this analysis. All we need to provide is the query, and how we would like it to be visualized (visualization is optional, it is not the only way to diagnose monitored data) graphically:<\/p>\n<blockquote>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nINSERT INTO\r\n custom_query (custom_query_id, enabled, query_eval, description, chart_type, chart_order)\r\n VALUES (0, 1, 'SELECT COUNT(*) FROM store.shopping_cart WHERE is_pending=1', 'Number of pending carts', 'value', 0);\r\n\r\nINSERT INTO\r\n custom_query (custom_query_id, enabled, query_eval, description, chart_type, chart_order)\r\n VALUES (1, 1, 'SELECT MAX(shopping_cart_id) FROM store.shopping_cart', 'Created carts rate', 'value_psec', 0);\r\n<\/pre>\n<\/blockquote>\n<p>We can later query for these values, just like we do for normal monitored values:<\/p>\n<blockquote>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\nmysql&gt; SELECT id, ts, created_tmp_tables_psec, custom_0, custom_1_psec FROM sv_sample WHERE ts &gt;= NOW() - INTERVAL 1 HOUR;\r\n+-------+---------------------+-------------------------+----------+---------------+\r\n| id\u00a0\u00a0\u00a0 | ts\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 | created_tmp_tables_psec | custom_0 | custom_1_psec |\r\n+-------+---------------------+-------------------------+----------+---------------+\r\n| 50730 | 2010-05-21 19:05:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 16.64 |\u00a0\u00a0\u00a0\u00a0\u00a0 448 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 3.02 |\r\n| 50731 | 2010-05-21 19:10:02 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 20.97 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 89 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.73 |\r\n| 50732 | 2010-05-21 19:15:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 15.70 |\u00a0\u00a0\u00a0\u00a0\u00a0 367 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 3.56 |\r\n| 50733 | 2010-05-21 19:20:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 18.32 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 54 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.43 |\r\n| 50734 | 2010-05-21 19:25:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 16.42 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 91 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.96 |\r\n| 50735 | 2010-05-21 19:30:02 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 21.93 |\u00a0\u00a0\u00a0\u00a0\u00a0 233 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 2.11 |\r\n| 50736 | 2010-05-21 19:35:02 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 14.58 |\u00a0\u00a0\u00a0\u00a0\u00a0 176 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.91 |\r\n| 50737 | 2010-05-21 19:40:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 21.61 |\u00a0\u00a0\u00a0\u00a0\u00a0 168 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.93 |\r\n| 50738 | 2010-05-21 19:45:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 16.05 |\u00a0\u00a0\u00a0\u00a0\u00a0 241 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 2.44 |\r\n| 50739 | 2010-05-21 19:50:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 19.70 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 46 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 1.19 |\r\n| 50740 | 2010-05-21 19:55:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 15.85 |\u00a0\u00a0\u00a0\u00a0\u00a0 177 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 2.28 |\r\n| 50741 | 2010-05-21 20:00:01 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 19.04 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 8 |\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.82 |\r\n+-------+---------------------+-------------------------+----------+---------------+\r\n<\/pre>\n<\/blockquote>\n<p>Of course, it is also possible to harness <em>mycheckpoint<\/em>&#8216;s views power to generate charts:<\/p>\n<blockquote>\n<pre>mysql&gt; SELECT custom_1_psec FROM sv_report_chart_sample\\G\r\n<img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" title=\"custom_1_psec\" src=\"http:\/\/chart.apis.google.com\/chart?cht=lc&amp;chs=400x200&amp;chts=303030,12&amp;chtt=Latest+24+hours:+May+19,+20:10++-++May+20,+20:10&amp;chf=c,s,ffffff&amp;chdl=custom_1_psec&amp;chdlp=b&amp;chco=ff8c00&amp;chd=s:QfXQmZQhXTmWVkWRobPpWUtQPVROaOOUMJPOKdJHQJFJEDJJEGCAIEFJHFFEGGDQHGJGMJPPMNZNRWR_ZUWfR_nSjuUcaXa3OgxRl4UivWZ5UhtWX4VgnUTYktiVW9WanUVxVYlgXwVdicXpb&amp;chxt=x,y&amp;chxr=1,0,5.120000&amp;chxl=0:||+||00:00||+||04:00||+||08:00||+||12:00||+||16:00||+||20:00|&amp;chxs=0,505050,10,0,lt&amp;chg=4.17,25,1,2,3.47,0&amp;chxp=0,3.47,7.64,11.81,15.98,20.15,24.32,28.49,32.66,36.83,41.00,45.17,49.34,53.51,57.68,61.85,66.02,70.19,74.36,78.53,82.70,86.87,91.04,95.21,99.38\" alt=\"\" width=\"400\" height=\"200\" \/>\r\n<\/pre>\n<\/blockquote>\n<p>The rules are:<\/p>\n<ul>\n<li>There can (currently) only be 18 custom queries.<\/li>\n<li>The <strong>custom_query_id<\/strong> must range 0-17 (to be lifted soon).<\/li>\n<li>A custom query must return with <em>exactly<\/em> one row, with <em>exactly<\/em> one column, which is a kind of <em>integer<\/em>.<\/li>\n<\/ul>\n<p>Please read <a href=\"http:\/\/code.openark.org\/blog\/mysql\/things-to-monitor-on-mysql-the-users-perspective\">my earlier post<\/a> on custom monitoring to get more background.<\/p>\n<h4>Custom monitoring HTML reports<\/h4>\n<p>Custom monitoring comes with a HTML reports, featuring requested charts. See a <a href=\"http:\/\/code.openark.org\/blog\/wp-content\/uploads\/2010\/05\/mcp_custom_report-128.html\">sample custom report<\/a>.<\/p>\n<p>In this sample report, a few queries are monitored for value (pending rentals, pending downloads) and a few for rates (downloads per second, emails per second etc.).<\/p>\n<p>Custom HTML reports come in two flavors:<\/p>\n<ul>\n<li>Brief reports, featuring last 24 hours, as in the example above. These are handled by the <strong>sv_custom_html_brief<\/strong> view.<\/li>\n<li>Full reports, featuring last 24 hours, last 10 days, known history. These take longer to generate, and are handled by the <strong>sv_custom_html<\/strong> view.<\/li>\n<\/ul>\n<p>The sample report was generated by issuing:<\/p>\n<blockquote>\n<pre>SELECT html FROM sv_custom_html_brief;<\/pre>\n<\/blockquote>\n<p>I won&#8217;t go into details here as for how this view generates the HTML code. There is a myriad of view dependencies, with many interesting tricks on the way. But do remember it&#8217;s <em>just a view<\/em>. You don&#8217;t need an application (not even <em>mycheckpoint<\/em> itself) to generate the report. All it takes is a query.<\/p>\n<h4>Processlist dump<\/h4>\n<p>When an alert notification fires (an email is prepared to inform on some alert condition), a processlist dump summary is taken and included in email report. It may be useful to understand why the slave is lagging, or exactly why there are so many active threads.<\/p>\n<p>The dump summary presents the processlist much as you would see it on SHOW PROCESSLIST, but only lists the active threads, noting down how many sleeping processes there are (PS, thread &amp; process are the same in the terminology of MySQL connections). An example dump looks like this:<\/p>\n<blockquote>\n<pre>PROCESSLIST summary:\r\n\r\n     Id: 3\r\n   User: system user\r\n   Host:\r\n     db: NULL\r\nCommand: Connect\r\n   Time: 3168098\r\n  State: Waiting for master to send event\r\n   Info: NULL\r\n-------\r\n\r\n     Id: 4\r\n   User: system user\r\n   Host:\r\n     db: prod_db\r\nCommand: Connect\r\n   Time: 612\r\n  State: Updating\r\n   Info: UPDATE user SET is_offline = 1 WHERE id IN (50440010,50440011)\r\n-------\r\n\r\n     Id: 8916579\r\n   User: prod_user\r\n   Host: localhost\r\n     db: prod_db\r\nCommand: Query\r\n   Time: 1\r\n  State: Sending data\r\n   Info: INSERT IGNORE INTO archive.stat_archive (id, origin, path, ts, content\r\n-------\r\n\r\n     Id: 8916629\r\n   User: mycheckpoint\r\n   Host: localhost\r\n     db: NULL\r\nCommand: Query\r\n   Time: 0\r\n  State: NULL\r\n   Info: SHOW PROCESSLIST\r\n-------\r\nSleeping: 3 processes\r\n<\/pre>\n<\/blockquote>\n<h4>Future plans<\/h4>\n<p>Work is going on. These are the non-scheduled future tasks I see:<\/p>\n<ul>\n<li>Monitoring InnoDB Plugin &amp; XtraDB status.<\/li>\n<li>Interactive charts. See my <a href=\"http:\/\/code.openark.org\/blog\/mysql\/static-charts-vs-interactive-charts\">earlier post<\/a>.<\/li>\n<li>Monitoring for swap activity (Linux only).<\/li>\n<li>Enhanced custom queries handling, including auto-deploy upon change of custom queries.<\/li>\n<li>A proper <em>man<\/em> page.<\/li>\n<li>Anything else that interests me.<\/li>\n<\/ul>\n<h4>Try it out<\/h4>\n<p>Try out <em>mycheckpoint<\/em>. It\u2019s a different kind of monitoring solution. You will need basic SQL skills, and in return you&#8217;ll get a lot of power under your hands.<\/p>\n<ul>\n<li>Download mycheckpoint <a href=\"https:\/\/code.google.com\/p\/mycheckpoint\/\">here<\/a><\/li>\n<li>Visit the project\u2019s <a href=\"..\/..\/forge\/mycheckpoint\">homepage<\/a><\/li>\n<li>Browse the <a href=\"..\/..\/forge\/mycheckpoint\/documentation\">documentation<\/a><\/li>\n<li>Report <a href=\"https:\/\/code.google.com\/p\/mycheckpoint\/issues\/list\">bugs<\/a><\/li>\n<\/ul>\n<p><em>mycheckpoint<\/em> is released under the <a href=\"http:\/\/www.opensource.org\/licenses\/bsd-license.php\">New BSD License<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Revision 132 of mycheckpoint has been released. New and updated in this revision: Custom monitoring: monitoring &amp; charting for user defined queries HTML reports for custom monitoring Process list dump upon alert notifications Custom monitoring &amp; charts Custom monitoring allows the user to supply with a query, the results of which will be monitored. That [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[41,46,49,31,50],"class_list":["post-2463","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-graphs","tag-monitoring","tag-mycheckpoint","tag-python","tag-scripts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-DJ","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2463","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=2463"}],"version-history":[{"count":42,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2463\/revisions"}],"predecessor-version":[{"id":2554,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/2463\/revisions\/2554"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=2463"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=2463"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=2463"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}