Dynamic Replication Channel Specific Filtering for uses like InnoDB Cluster

By | June 21, 2018

There are numerous ways to filter your replication stream in MySQL.  So many, that the documentation team has created guided “walk-through” pages for considering the various complex DB Filtering and subsequent Table Filtering paths folks have used for the “global” filtering options.  But now there are “channel specific” options that can be implemented at startup with enhanced channel specific filtering configurations or dynamically using the new CHANGE  REPLICATION FILTER statements now available in MySQL 8.0.

Use Cases for Replication Filtering

Many MySQL use-cases out there involve replication.  Many of those replication setup involve the use of replication filtering patterns too.  They may be business driven decisions such as “certain types of data should never leave a given Data Centre wheras other data can” or “certain tables are so large and only have a local value, therefore replicating those large tables off-site along with other important data is not practical nor necessary“.

InnoDB Cluster’s Dual DC Use Case

Background

When Group Replication was first introduced, it’s core diagnostics were all self-contained in the performance_schema and still are.  All members in the group knew what was going on through the distribution enabled Group Communication System.  No external tools were part of the solution.  This also means that you need to construct your own health check function or process to determine which members are available for RW or RO or are just NOT available.

When InnoDB Cluster was introduced in MySQL 5.7, the mysql-shell constructed a meta-data schema named mysql_innodb_cluster_metadata .  So all that data would get replicated across all members, and our mysqlrouter component would then use that to track the status of the group and identify who were the valid RW/RO members.  So what’s the big deal?  (you ask)

The problem to Solve

There is a strong case for having 2 InnoDB Clustered setups

  • (1) One in a primary data center and another in a remote DR or Passive environment
    • Possibly only writing to your primary DC, and DR is only for failover/disaster readiness
    • With setups like that you may need to have intermittent non-disruptive testing that replication can feed from your DR back to your primary site to prove your DR is functionally ready
    • Having Circular Replication running becomes very important, even if its “not always” running.
  • (2) Having 2 data centre locations that are both active in receiving non-conflicting regional “write” traffic
    • Circular replication is definitely needed here
    • Either site can act as a DR target for another in case a regional outage occurs
      • Basically only the failed region’s application servers need to now serve the other region’s traffic as the databases are already prepared and operating
    • Note: considerations for upgrade/maintenance outages need to be considered for fully active dual-DC setup such as is proposed just above

All that being noted above for those 2 prospective scenarios, the  mysql_innodb_cluster_metadata schema can NOT be replicated into the other’s InnoDB Clustered setup!  It would corrupt the metadata contained in that system.

The Remedy in MySQL 5.7

The solution in MySQL 5.7 is to use intermediary pass-through slaves where their whole job is to filter out that metadata schema noted above. As only global replication filters were available, there was no other way to address it.

The Remedy in MySQL 8.0

In MySQL 8.0 though, replication channel based filtering becomes the magic bullet.

  • Group Replication (the heart of InnoDB Clustering) uses replication channels for applier activities and recovery of members joining…so they need to operate without disruption or filtering.
  • For any InnoDB Cluster that want to replicate to another InnoDB Clustered setup:
    • A Slave’s REPLICATION FILTER only needs to be setup on “write-capable” primary member in the recieving InnoDB Cluster.
    • Then and use this new filtering to remove the   mysql_innodb_cluster_metadata schema.  That way no conflicting metadata will be added from the foreign cluster’s schema!!!

Implementing the Solution

We still have the cluster that was built in my prior blog on MySQL 8.0 InnoDB Clusters so we’ll re-use those Servers name gr127, gr128, gr129 – where gr127 is the single write primary.  NEW to the setup – I have created a MySQL 8 Server instance named gr132 which will represent the slave member creating the “channel specific replication filter” setup, filtering out that metadata schema.

Creating the Replication Account

Although this is a probably just a simple task, it is a reminder about how we create replication user accounts and where they need to be created.  Below gr127  is the server we will slave from, so the account needs to be created there.  Plus once added there, it’ll propagate to all other members of the InnoDB Clustered setup.

Establish the Replication Channel & Filtering

There are a few steps here:

  1. Create the Replication channel using a CHANGE MASTER statement
  2. Create the associated Replication Filter using a CHANGE REPLICATION FILTER statement
  3. Start slaving on that channel using the standard auto-starting on startup START SLAVE statement.

Verifying the Replication Filtering is Set

It’s not worth anything if we can’t verify the configurations we intended to set, are actually set.

First verification:  Did it skip replicating the the mysql_innodb_cluster_metadata ?

YES – it is evident below in the SHOW DATABASES  output that the database was not replicated

Second verification: Can we confirm that the configuration of skipping that database is logged so we can see it?

YES – We simply actually just need to look above more closely in the SHOW SLAVE STATUS  … output
YES – As below, we can note that replication_applier_filters  column output for CONFIGURED_BY  .

Persisting Our Replication Channel Specific Filtering

Last but not least…

We cannot use SET PERSIST GLOBAL  variable implementation as these --replicate* configurations are actually OPTION file items only and are meant to only be used in configuration files or on the command line when starting the server.

That being said, we do have 1 solid way noted above for DYNAMICALLY enabling REPLICATION CHANNEL SPECIFIC filters PLUS another way to persist them over server restarts.  And in both cases we have a way to tell which means of implementing set it up…which is really cool!

Results when CHANNEL SPECIFIC FILTERING is Dynamic

Below in the sample output, we can see the value for CONFIGURED_BY which indicates it was enabled dynamically when configuring the replication setup using SQL.

 

Results when CHANNEL SPECIFIC FILTERING is from Server Startup

Below, I show the configuration I use in my /etc/my.cnf file.  And then I’ll show the results after a restart.

Closing a Replication Loop


For the sake of closing a literal loop.  When using an InnoDB Cluster member as a standard replication slave, it must ALWAYS be a “write-capable” primary member in the cluster.

In this case gr127 is still the write-primary in my setup and we’re now going to configure it to be a replication slave off of server gr132, which will bring the 2 servers into a circular replication setup.   To help that along, the configurations would be as follows below in the code section.

Please follow and like us:
error