Binary Log Growth Handling in MySQL

By | December 19, 2016

So today we look at an aspect of capacity management or planning and how it relates to MySQL and its Logging.  For this blog I’m primarily going to look at MySQL’s binary logs. They have a tendency to grow, and will continue to if not held in check.  Happily, they tend to be easy to manage. Let’s look.

server-logs

Server Logs in General

The chart above comes from the “MySQL Server Logs” documentation page.  It notes a variety of logs in the MySQL server.  The error log of course is where the server provides output for the servers startup, shutdown, and other warning or error events that occur.  The general query log  and the slow query log had more value in the past, but with monitoring capabilities these days, such as the MySQL Enterprise Monitor that expose so much information about what is going on in the server, maybe there is less use for them.  I usually leave both the slow and general turned off, which is their default.  The slow query log, like the general log can both contribute a lot of file writing overhead if the MySQL server is busy running requests.  That being said, if you’re not using a monitoring system, then using the slow query log along with the mysqldumpslow client utility will help to analyze those logs.

replication-chain

The Binary Logs, when are they Used?

The Binary Logs in MySQL can be rather important, but it depends on many things, one of the most relevant things is if you are even using them.  The binary log, or --log-bin option is OFF by default.  So using it at all means your configuration has made this happen.

Backup and Recovery

If you are using the Binary Logs, then they receive all DML and DDL activity that occurs in the server.  This is different than say the InnoDB transaction logs, which are storage engine dependent.  The InnoDB transaction logs handle the ACID component of InnoDB activities.  But all DML or DDL activities against InnoDB and non-InnoDB storage engines get picked up by the server’s binary logs.  For this reason, binaries logs are also a core feature for point-in-time recovery of the whole server.  Thus they can become part of your backup and recovery methodology.

If you are enabling Binary Logs for backup/recovery capabilities, you’ll need these two configurations and a server restart if they weren’t configured when the datafiles were initialized.  The server-id is a requirement for binary logging, but more so replication.  Either way it is needed now.

Binary logs for Replication

The above picture that shows how replication flows identifies the place for both the Binary Logs and the Relay Logs.  Relay logs are only used for replication, and are only created when a MySQL Server starts acting as a replication slave.  However Binary Logs are the core of these setups.  They provide the log file that remote slave servers rely on to get the data.  So your binary logs in these cases are needed so long as your slaves still require them.  For a more complete review of replication implementation see the documentation here.

Implementation Considerations

So we’ve looked at the 2 use cases above for binary logs…  (1) Point-in-time Recovery, (2) Replication.  But if our disk space is being consumed and we notice binary log files are piling up, what can we do?

Point-in-time recovery solutions don’t vary too far from replication usage.  But here are ways of thinking through what your needs are for each solution:

Backups

  • For point-in-time recovery we only need previous binary logs to cover us between backups that have been taken.
  • It could be your strategy to make backups of binary log files by copying them off the server, possibly using something like rsync as your incremental backup strategy.
  • Sometimes a backup tool will also capture the binary logs that are present at the time such as the MySQL Enterprise Backup tool.

Replication

  • The binary logs on a master (or relay-master) are required so long as a downstream MySQL Slave server still requires them.  Otherwise your downstream slaves may need to be rebuilt
  • As noted in the configurations above, if you want to use a slave as a possible candidate as a replication Master, then you’ll need the --log-slave-updates configuration noted above.  This will cause that Slave server to write all of its received transactions from the Master into its own Binary Logs.

How can we Best Manage Growth of these binary logs?

There are a variety of ways of addressing safely both scenarios above.  At a high level it would be as follows:

  • Use configuration settings to contain/manage growth
  • Use special dedicated file systems for these files
  • Use special scripting to manage the binary logs

Configurations & Settings

In the below configurations, you can see two core adjustments we can make in the configuration files.

  1. The Binary Logs --log-bin and Relay logs --relay-log can be defined such that the files reside on a dedicated directory, which in turn can be a mount point for a dedicated filesystem.  This has 2 benefits:
    1. This can provide storage that won’t compete for space from other OS or MySQL usage
    2. If this is a storage setup such as a LVM storage setup, then it can be extended and managed in dynamic way to account for increasing needs for unexpected growth
  2. The --binlog-row-image=minimal configuration will write less data to the binary log, particularly if you have tables that have many columns and long column lengths and are only doing single column updates.  In that case, all columns that aren’t updated and that aren’t involved in the primary key will be skipped.   It may also expedite replication as there is less data to copy.
  3. The configuration expire_logs_days will purge flushed binary logs that are beyond the number of days that are configured here.  That means they can automatically rotate off.  If they are still being used by replication slaves, then they will be retained.  However, don’t set this value so low that your backup schedule can’t capture them….that is if your backup plan requires or desires them.

Scripting a Management Process

There are many approaches to managing the growth of your binary logs by using scripting.  There is more than one right way, it purely depends on your own environment.

Script Execution Methods

For whatever script you write, you could execute it by any of the following, and other approaches may be valid for your environment:

  • Manually Running scripts:
    • Running manually on a given frequency of your choosing, maybe based on a threshold of disk space taken by the binary logs
  • Use a scheduling System:
    • A scheduler of any form, such as cron on linux, or possibly an Enterprise Scheduling system that your company or business already uses for tasks like this would really be ideal.  Any form of automating the process will help to ensure that it is always handled.
    • Taking any scheduled or automated method the extra mile by ensuring that some form of notification or page goes out to support team members in case the script fails for any given reason.
  • Add it to your Backup processing:
    • By tightly integrating your binary log management process into your backups, you ensure that your fully defined process is executing as a single job flow.
  • Re-Use Current setups/Available configurations:
    • If you are running any CentOS variant Linux operating system, you could take advantage of its logrotate process, and this can make for a very simple to implement solution.
      • MySQL installs a script for CentOS-compatible Linux operating systems for the sake of rotating the error log, but the mysqladmin flush-logs command used for that will also ‘clip’ the current binary log and cause a new one to be created.
      • This default process “helps” to manage the binary logs, but might not be complete in and of itself.  This aligned along with setting  --expire_logs_days=8 configuration setting may be enough to solve many situations.
        • This solution should be aligned with your backup and recovery design too.
      • You can also customize this file and add your own scripting into it.  Just remember that you will need to ensure your MySQL commands have configuration files that provide login access for the database commands to execute properly.
    • Below is the script that is found for logrotate in CentOS at /etc/logrotate.d/mysql .

Scripting Samples to Manage the Binary Logs

There are a variety of things to consider when scripting out solutions.

  • As above, we’ve looked at how it would be executed, either manually or through some scheduling method.
  • Scripting will be different on Windows and potentially different across various Linux platforms depending on which shell they use such as Bash, KornShell, etc.
  • Another thought, how do you want to produce logs on the execution of your script, so that you can review its successes or failures and capture details to prove things worked as expected.

However, for this I’ll focus on providing core commands for managing the binary logs.  First, below you’ll see the default file that we start with using MySQL 5.7 on CentOS7.

Not much there, so lets add some configs to get binary logging running, and then restart the instance

That’s great!  Nice and easy to get binary logging going, but now what?

Well, lets test the “FLUSH LOGS” and “PURGE BINARY LOGS” commands.

A Script to Manage your Error log and your Binary logs

A Quick Script to get you going

Want something with just some core basics for managing the growth of your binary logs.  Look no further if you want something simpler.  No, I mean it….the next part of this blog gets into a more complicated scripts that does a few more things, so stop here if you want.

You’ll see in the script, it does a few things.

  1. It shows you the Binary Logs that existed before the purge
  2. It then runs the “flush logs” command, that is the heart of all this nonsense
  3. Then shows you the Binary Logs that exist after that purge….which there should be a new one.
  4. After that, it will purge the Binary Logs that are older than 3 days!  Woo hoo!   If you want it at a different interval, just change the number value listed in that PURGE command.

So let’s say you want more….

You have the following situation and things you’d like to achieve:

  1. Your OS doesn’t have a log rotation feature for the MySQL error log or you want to handle it yourself
  2. You want to combine your error log and binary log handling at the same time
  3. You want to log the activities of your script
  4. PLUS….you want the script’s execution to account for its success or failure and trigger an alert that will issue a page to your 24/7 support staff accordingly!

The good news is points 1, 2 and 3 above are covered in the sample script below. Take a look at the highlighted rows for the portions in the script that manage the binary logging itself.

The script as a whole is a bit messy looking. The logging isn’t really all that slick and I’m sure there are a ton of ways it could be improved all around. But its a pretty darn good beginning and holds potential for a reference you might want to consider your own setup.

Below is what the contents of the logging file looks like from the results of that script execution. Once again, your logging can include whatever you feel is needed. I usually have more sophisticated ways of handling logging and alerting, but its usually driven by a company’s requirements and current infrastructure that it needs to support.  This is a simpler way to get going in order to provide an example to address core requirements of managing binary logs and the error log.

I hope you all enjoyed this post and look forward to feedback and comments.