
Online schema change with gh-ost - throttling and changing configuration at runtime

In previous posts, we gave an overview of gh-ost and showed you how to test your schema changes before executing them. One important feature of any schema change tool is its ability to throttle itself. Online schema change requires copying data from the old table to a new one and, no matter what else the tool does, this is an expensive process which may impact database performance.

Throttling in gh-ost

Throttling is crucial to ensure that normal operations continue to perform smoothly. As we discussed in a previous blog post, gh-ost can stop all of its activity, which makes it much less intrusive. Let’s see how this works and to what extent it is configurable.

First things first - what does gh-ost monitor? As we know, by default, gh-ost uses a master to execute writes, and a slave to track changes in binary logs. The master, obviously, will not give us any information about replication lag, but a slave will - that’s where gh-ost gets its data on slave lag. Of course, one single slave is not necessarily representative of the whole replication chain, so it is possible to define a list of slaves on which to check the replication lag via the --throttle-control-replicas variable. All you need to do is pass a comma-separated list of IPs and gh-ost will track lag on all of them. You can define the maximum acceptable lag using --max-lag-millis. Once that threshold has been passed, gh-ost stops its activity and allows the slaves to catch up with the master.
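As a sketch, an invocation wiring these two options together might look like the following. The host names, replica IPs, thresholds and the ALTER statement are purely illustrative, not taken from any real setup:

```shell
# Hypothetical hosts and thresholds - adjust to your own topology.
gh-ost \
  --host=master.example.com \
  --user=ghost --ask-pass \
  --database=sbtest1 --table=sbtest1 \
  --alter="ADD COLUMN notes VARCHAR(255)" \
  --throttle-control-replicas="10.0.0.11,10.0.0.12,10.0.0.13" \
  --max-lag-millis=1500 \
  --execute
```

With this configuration, gh-ost pauses row copying whenever any of the three listed replicas falls more than 1.5 seconds behind.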

The main problem is that, right now, gh-ost uses multiple methods of lag calculation, which makes things somewhat confusing. The documentation is also not clear enough about how things work internally. Let’s take a look at how gh-ost operates right now.

As we mentioned, there are multiple methods used to calculate lag. First of all, gh-ost generates an internal heartbeat in its _ghc table.

This heartbeat is used to calculate lag on the slave/replica on which gh-ost operates and from which it reads binary logs. Then there are the replicas mentioned in --throttle-control-replicas. Those, by default, have their lag tracked using SHOW SLAVE STATUS and Seconds_Behind_Master, which has a granularity of one second.
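You can see this one-second granularity for yourself by querying a replica directly; this assumes a locally reachable MySQL replica and appropriate credentials, neither of which are part of the original setup:

```shell
# Seconds_Behind_Master is an integer - anything below one second of lag
# simply shows up as 0, which is why gh-ost needs its own heartbeat
# for sub-second measurements.
mysql -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master
```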

The problem is that sometimes one second of lag is already too much for the application to handle, so one of the most important features of gh-ost is its ability to detect sub-second lag. On the replica where gh-ost operates, its heartbeat supports sub-second granularity via the --heartbeat-interval-millis variable. The remaining replicas, though, are not covered this way - instead, there is an option to take advantage of an external heartbeat solution such as pt-heartbeat and calculate slave lag using --replication-lag-query.

Unfortunately, when we put it all together, it didn’t work as expected - sub-second lag was not calculated correctly by gh-ost. We decided to contact Shlomi Noach, who leads the gh-ost project, to get more insight into how gh-ost handles sub-second lag detection. What you will read below is the result of this conversation, showing how it will be done in the future, the “right” way.

gh-ost, at this moment, inserts heartbeat data into its _*_ghc table. This makes any external heartbeat generator redundant and, as a result, --replication-lag-query is deprecated and will soon be removed. Once it is removed, gh-ost’s internal heartbeat will be used across the whole replication topology.

If you want to check for lag with sub-second granularity, you need to correctly configure --heartbeat-interval-millis and --max-lag-millis, ensuring that heartbeat-interval-millis is set to a lower value than max-lag-millis - that’s all. You can, for example, tell gh-ost to insert a heartbeat every 100 milliseconds (heartbeat-interval-millis) and then check whether lag is less than, say, 500 milliseconds (max-lag-millis). Of course, lag will be checked on all replicas defined in --throttle-control-replicas. You can see the updated documentation on the lag checking process here: https://github.com/github/gh-ost/blob/3bf64d8280b7cd639c95f748ccff02e90a7f4345/doc/subsecond-lag.md
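A minimal sketch of the relationship between the two settings, using the illustrative values from the example above; the check simply mirrors the requirement that the heartbeat interval be lower than the lag threshold:

```shell
# Illustrative values only - a 100ms heartbeat lets gh-ost measure lag
# in 100ms steps, and throttling kicks in above the 500ms threshold.
HEARTBEAT_INTERVAL_MILLIS=100
MAX_LAG_MILLIS=500

if [ "$HEARTBEAT_INTERVAL_MILLIS" -lt "$MAX_LAG_MILLIS" ]; then
  echo "ok: lag measured in ${HEARTBEAT_INTERVAL_MILLIS}ms steps, throttling above ${MAX_LAG_MILLIS}ms"
else
  echo "error: heartbeat-interval-millis must be lower than max-lag-millis" >&2
  exit 1
fi
```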

Please keep in mind that this is how gh-ost operates in version 1.0.34 or later.

We need to mention, for the sake of completeness, one more setting - nice-ratio. It is used to define how aggressively gh-ost should copy the data; it basically tells gh-ost how long it should pause after each row copy operation. If you set it to 0, no pause is added. If you set it to 0.5, the whole process of copying rows will take 150% of the original time. If you set it to 1, it will take twice as long (200%). It works, but it is pretty hard to tune the ratio so that the original workload is not affected. As long as you can use sub-second lag throttling, that is the better way to go.
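The arithmetic behind nice-ratio can be sketched as follows; the 100-second copy time is an invented figure, used only to illustrate the multiplier:

```shell
# nice-ratio adds a proportional pause after each row-copy operation,
# so total time is roughly: copy_time * (1 + nice_ratio).
copy_seconds=100   # hypothetical raw copy time

for ratio in 0 0.5 1; do
  awk -v t="$copy_seconds" -v r="$ratio" \
    'BEGIN { printf "nice-ratio=%s -> ~%d seconds (%d%% of original runtime)\n", r, t * (1 + r), (1 + r) * 100 }'
done
```

Running this prints ~100, ~150 and ~200 seconds for ratios of 0, 0.5 and 1 respectively, matching the percentages quoted above.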

Runtime configuration changes in gh-ost

Another very useful feature of gh-ost is its ability to handle runtime configuration changes. When it starts, gh-ost listens on a unix socket, which you can choose through --serve-socket-file. By default, the socket is created in the /tmp directory and its name is derived from the schema and table gh-ost works on. An example would be: /tmp/gh-ost.sbtest1.sbtest1.sock

Gh-ost can also listen on a TCP port, but for that you need to pass --serve-tcp-port.

Knowing this, we can manipulate some of the settings. The best way to learn what can be changed is to ask gh-ost itself. When we send the ‘help’ string to the socket, we get a list of available commands:

root@ip-172-30-4-235:~# echo help | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock
available commands:
status                     # Print a detailed status message
sup                        # Print a short status message
chunk-size=                # Set a new chunk-size
nice-ratio=                # Set a new nice-ratio, immediate sleep after each row-copy operation, float (examples: 0 is agrressive, 0.7 adds 70% runtime, 1.0 doubles runtime, 2.0 triples runtime, ...)
critical-load=             # Set a new set of max-load thresholds
max-lag-millis=            # Set a new replication lag threshold
replication-lag-query=     # Set a new query that determines replication lag (no quotes)
max-load=                  # Set a new set of max-load thresholds
throttle-query=            # Set a new throttle-query (no quotes)
throttle-control-replicas= # Set a new comma delimited list of throttle control replicas
throttle                   # Force throttling
no-throttle                # End forced throttling (other throttling may still apply)
unpostpone                 # Bail out a cut-over postpone; proceed to cut-over
panic                      # panic and quit without cleanup
help                       # This message

As you can see, there is a bunch of settings to change at runtime: we can change the chunk size, the max-load thresholds (which, when crossed, cause gh-ost to throttle) and the critical-load thresholds (which, when crossed, cause gh-ost to abort the migration). You can also adjust the throttling-related settings: nice-ratio, max-lag-millis, replication-lag-query and throttle-control-replicas. You can as well force throttling by sending the ‘throttle’ string to gh-ost, or immediately stop the migration by sending ‘panic’.
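Putting a few of these together, a runtime adjustment session against a live migration might look like this; it assumes a running gh-ost process listening on the socket from the earlier example:

```shell
# Assumes a live gh-ost migration serving this socket.
echo "max-lag-millis=500" | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock
echo "throttle"           | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock  # pause row copying
echo "status"             | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock  # confirm the new state
echo "no-throttle"        | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock  # resume
```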

Another command worth mentioning is unpostpone. Gh-ost allows you to postpone the cut-over process. As you know, gh-ost creates a temporary table using the new schema, and then fills it with data from the old table. Once all the data has been copied, it performs a cut-over and replaces the old table with the new one. You may want to be there to monitor things when gh-ost performs this step, in case something goes wrong. In that case, you can use --postpone-cut-over-flag-file to define a file which, as long as it exists, postpones the cut-over. Create that file and you can be sure that gh-ost won’t swap the tables until you let it by removing the file. Still, if you’d like to force the cut-over without having to find and remove the postpone file, you can send the ‘unpostpone’ string to gh-ost and it will immediately perform the cut-over.
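A sketch of that workflow, with a flag file path of our own choosing (gh-ost does not dictate one):

```shell
# The path is arbitrary; pass the same one to gh-ost via
# --postpone-cut-over-flag-file="$FLAG" when starting the migration.
FLAG=/tmp/ghost.postpone.cut-over.flag

touch "$FLAG"    # while this file exists, gh-ost will not cut over
[ -f "$FLAG" ] && echo "cut-over postponed"

rm "$FLAG"       # removing the file allows gh-ost to swap the tables
```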

We are coming to the end of this post. Throttling is a critical part of any online schema change process (or any database-heavy process, for that matter) and it is important to understand how to do it right. Yet, even with throttling, some additional load is unavoidable. That’s why, in our next blog post, we will try to assess the impact of running gh-ost on the system.