For purposes of auditing anything that goes on our servers we’re looking to parse the binary logs of all servers (masters), as with “Anemomaster“. With Row Based Replication this is problematic since pt-query-digest does not support parsing RBR binary logs (true for 2.2.12, latest at this time).
I’ve written a simple script that translates RBR logs to SBR-like logs, with a little bit of cheating. My interest is that pt-query-digest is able to capture and count the queries, nothing else. By doing some minimal text manipulation on the binary log I’m able to now feed it to pt-query-digest which seems to be happy.
The script of course does not parse the binary log directly; furthermore, it requires the binary log to be extracted via:
mysqlbinlog --verbose --base64-output=DECODE-ROWS your-mysql-binlog-filemame.000001
The above adds the interpretation of the RBR entires in the form of (unconventional) statements, commented, and strips out the cryptic RBR text. All that is left is to do a little manipulation on entry headers and uncomment the interpreted queries.
The script can be found in my gist repositories. Current version is as follows:
#!/usr/bin/python # # Convert a Row-Based-Replication binary log to Statement-Based-Replication format, cheating a little. # This script exists since Percona Toolkit's pt-query-digest cannot digest RBR format. The script # generates enough for it to work with. # Expecting standard input # Expected input is the output of "mysqlbinlog --verbose --base64-output=DECODE-ROWS <binlog_file_name>" # For example: # $ mysqlbinlog --verbose --base64-output=DECODE-ROWS mysql-bin.000006 | python binlog-rbr-to-sbr.py | pt-query-digest --type=binlog --order-by Query_time:cnt --group-by fingerprint # import fileinput def convert_rbr_to_pseudo_sbr(): inside_rbr_statement = False for line in fileinput.input(): line = line.strip() if line.startswith("#") and "end_log_pos" in line: for rbr_token in ["Update_rows:", "Write_rows:", "Delete_rows:", "Rows_query:", "Table_map:",]: if rbr_token in line: line = "%s%s" % (line.split(rbr_token)[0], "Query\tthread_id=1\texec_time=0\terror_code=0") if line.startswith("### "): inside_rbr_statement = True # The "### " commented rows are the pseudo-statement interpreted by mysqlbinlog's "--verbose", # and which we will feed into pt-query-digest line = line[4:] else: if inside_rbr_statement: print("/*!*/;") inside_rbr_statement = False print(line) convert_rbr_to_pseudo_sbr()
No longer needed since the last release of pt-query-digest
which release was this?