{"id":6682,"date":"2013-12-19T10:58:17","date_gmt":"2013-12-19T08:58:17","guid":{"rendered":"http:\/\/code.openark.org\/blog\/?p=6682"},"modified":"2013-12-19T10:58:17","modified_gmt":"2013-12-19T08:58:17","slug":"bash-script-report-largest-innodb-files","status":"publish","type":"post","link":"https:\/\/code.openark.org\/blog\/mysql\/bash-script-report-largest-innodb-files","title":{"rendered":"Bash script: report largest InnoDB files"},"content":{"rendered":"<p>The following script will report the largest InnoDB tables under the data directory: schema, table &amp; length in bytes. The tables could be non-partitioned, in which case this is simply the size of the corresponding <strong>.ibd<\/strong> file, or they can be partitioned, in which case the reported size is the sum of all partition files. It is assumed tables reside in their own tablespace files, i.e. created with <strong>innodb_file_per_table=1<\/strong>.<\/p>\n<blockquote>\n<pre>(\r\n\u00a0\u00a0\u00a0 mysql_datadir=$(grep datadir \/etc\/my.cnf | cut -d \"=\" -f 2)\r\n\u00a0\u00a0\u00a0 cd $mysql_datadir\r\n\u00a0\u00a0\u00a0 for frm_file in $(find . -name \"*.frm\")\r\n\u00a0\u00a0\u00a0 do\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tbl_file=${frm_file\/\/.frm\/.ibd}\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_schema=$(echo $frm_file | cut -d \"\/\" -f 2)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 table_name=$(echo $frm_file | cut -d \"\/\" -f 3 | cut -d \".\" -f 1)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if [ -f $tbl_file ]\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 then\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # unpartitioned table\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 file_size=$(du -cb $tbl_file 2&gt; \/dev\/null | tail -n 1) \r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 else\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # attempt partitioned innodb table\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tbl_file_partitioned=${frm_file\/\/.frm\/#*.ibd}\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 file_size=$(du -cb $tbl_file_partitioned 2&gt; \/dev\/null | tail -n 1)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 fi\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 file_size=${file_size\/\/total\/}\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # Replace the below with whatever action you want to take,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 # for example, push the values into graphite.\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 echo $file_size $table_schema $table_name\r\n\u00a0\u00a0\u00a0 done\r\n) | sort -k 1 -nr | head -n 20<\/pre>\n<\/blockquote>\n<p>We use this to push table statistics to our graphite service; we keep an eye on table growth (we actually do not limit to top <strong>20<\/strong> but just monitor them all). File size does not report the real table data size (this can be smaller due to tablespace fragmentation). It does give the correct information if you&#8217;re concerned about disk space. For table data we also monitor <strong>SHOW TABLE STATUS<\/strong> \/ <strong>INFORMATION_SCHEMA.TABLES<\/strong>, themselves being inaccurate. Gotta go by something.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The following script will report the largest InnoDB tables under the data directory: schema, table &amp; length in bytes. The tables could be non-partitioned, in which case this is simply the size of the corresponding .ibd file, or they can be partitioned, in which case the reported size is the sum of all partition files. [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[5],"tags":[25,46,50],"class_list":["post-6682","post","type-post","status-publish","format-standard","hentry","category-mysql","tag-analysis","tag-monitoring","tag-scripts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p2bZZp-1JM","_links":{"self":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/6682","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/comments?post=6682"}],"version-history":[{"count":13,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/6682\/revisions"}],"predecessor-version":[{"id":6696,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/posts\/6682\/revisions\/6696"}],"wp:attachment":[{"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/media?parent=6682"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/categories?post=6682"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code.openark.org\/blog\/wp-json\/wp\/v2\/tags?post=6682"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}