Skip to content

Instantly share code, notes, and snippets.

@danyell
Last active February 7, 2021 14:05
Show Gist options
  • Save danyell/36510855b4a09127314ac5dd0a48538c to your computer and use it in GitHub Desktop.
Save danyell/36510855b4a09127314ac5dd0a48538c to your computer and use it in GitHub Desktop.
Script to auto-repair MySQL replication slave synch-loss
#!/bin/bash
# Script to auto-repair MySQL replication slave synch-loss
# by: Daniel de Segovia Gross (daniel@hubrix.co)
# v1.0 07 Mar 2017
# Copyright 2017 Hubrix SAS. https://www.hubrix.co/
# License: MIT (see https://opensource.org/licenses/MIT)
# USE AT YOUR OWN RISK. Don't be a copy+paste victim!
# If you don't understand this script, you should NOT use it.
# It is intended to save you time, not replace your brain.
#
# This script just automates a HOWTO, see for instance:
# http://thelinuxstuff.blogspot.fr/2012/11/post-titlemysql-tips-how-to-fix.html
#
# NOTE: This script AUTOMATES A BAD PRACTICE.
# You really should switch to MySQL clustering vs. replication,
# or use proper transactions in your app SQL, to prevent synch
# issues from occurring in the first place.
#
# Usage: make this script executable (by root only), add to /etc/cron.d
# (I run it every 20 minutes)
#
# Script assumes:
# 1. Run as root
# 2. root has a legit /root/.my.cnf file configured
# to allow mysql login without providing password
#
MYSQLBIN=/usr/bin/mysql
SQLSLAVESKIP=1
# Beyond some number of skips DB integrity may be compromised,
# and our little first aid is probably not good enough. Bail
# and alert the medics.
SLAVESKIPMAX=60
# Flag variable for if we did anything at all
DIDAAA=0
SSR=$(echo ' SHOW SLAVE STATUS \G' | $MYSQLBIN | grep 'Slave_SQL_Running:' |awk '{print $2}')
while [ "$SSR" != "Yes" ]
do
if [ $SQLSLAVESKIP -gt $SLAVESKIPMAX ] ; then
logger -i -s -p cron.crit "CRITICAL MySQL Slave Sync error, cannot repair"
exit 1
fi
DIDAAA=1
MYSQLSLAVECMD="stop slave;\nSET GLOBAL SQL_SLAVE_SKIP_COUNTER = $SQLSLAVESKIP;\nstart slave;"
echo -e $MYSQLSLAVECMD | $MYSQLBIN
SQLSLAVESKIP=$(expr $SQLSLAVESKIP + 1)
SSR=$(echo ' SHOW SLAVE STATUS \G' | $MYSQLBIN | grep 'Slave_SQL_Running:' |awk '{print $2}')
done
if [ $DIDAAA -eq 1 ] ; then
# Get seconds behind Master for log record
SBM=$(echo ' SHOW SLAVE STATUS \G' | $MYSQLBIN | grep 'Seconds_Behind_Master:' |awk '{print $2}')
logger -i -s -p cron.warn "MySQL Slave Sync repaired, skipped $SQLSLAVESKIP queries, now $SBM secs behind Master"
fi
# Else nothing happened, STFU!
@kawaiipantsu
Copy link

kawaiipantsu commented Sep 2, 2020

Just a quick note on line 43 you run the following on your mysql server SET GLOBAL SQL_SLAVE_SKIP_COUNTER = $SQLSLAVESKIP; but you also increment the value of $SQLSLAVESKIP, so next iteration you will skip 2 sql queries from your relay log and next again 3 positions and so on. That is not good! You could lose legit data. You should only skip 1 always.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment