@ziesemer
Last active September 7, 2021 20:55
ssh-ControlMaster-test.sh
#!/bin/bash
# Mark Ziesemer, 2016-02-11, 2016-12-14.
# As described at https://rhn.redhat.com/errata/RHSA-2015-2088.html ,
# I'm afraid that the race condition with OpenSSH ControlMaster multiplexing is still not resolved
# in recent CentOS / Fedora releases.
# This refers to BZ#1240613 (which is apparently restricted), and is also described at
# https://access.redhat.com/solutions/1521923 (which is non-public, restricted to subscription access).
# Reported to https://bugzilla.redhat.com/show_bug.cgi?id=1308295 on 2016-02-13 .
# Use this script to stress-test OpenSSH ControlMaster multiplexing.
# Open 2 shell sessions.
# In the first, run "ssh-ControlMaster-test.sh setup" (needed one-time only), followed by "ssh-ControlMaster-test.sh master".
# In the second, run "ssh-ControlMaster-test.sh threads".
# If an error is not observed, repeat running the "threads" command in the 2nd window until the "master" in the 1st window
# terminates with an error, or until confident that the issue no longer exists.
# This doesn't require any networking outside of a local VM, and has been observed under both VMware ESX and Oracle VirtualBox
# - but is also being consistently observed in actual network environments.
# As of 2016-02-13, my current testing shows:
# - CentOS 7.2.1511, openssh-6.6.1p1-23.el7_2.x86_64 - broken.
# - CentOS 7.2.1511, downgraded to openssh-6.6.1p1-22.el7.x86_64 - still broken
# (despite this being the supposed fix release per RHSA-2015-2088).
# - Fedora 23, openssh-7.1p2-3.fc23.x86_64 - broken.
# - Ubuntu 15.10, OpenSSH_6.9p1 Ubuntu-2ubuntu0.1 - Works without issue.
# - CentOS 6.6, openssh-5.3p1-104.el6_6.1.i686 - Works without issue.
# - Fedora 20, openssh-6.4p1-8.fc20.x86_64 - Works without issue.
# Additional research and notes:
# - https://ahwhattheheck.wordpress.com/2015/07/02/debugging-sporadically-encountered-ssh-encountered-an-unknown-error-in-ansible-runs/
#   - Bypassed the issue by effectively ensuring that no ControlMaster would be concurrently accessed by multiple client sessions,
#     at the expense of increasing the number of ControlMasters used.
# - http://www.zenoss.org/forum/10136
#   - Posts indicated that the new "UsePrivilegeSeparation sandbox" could be a problem here - but I am able to consistently
#     reproduce with or without this enabled.
set -euo pipefail
trap '_exit' SIGINT
_controlPath="-o ControlPath=~/.ssh/sockets/%r@%h-%p"
_host='localhost'
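_thread(){
  for i in {1..100}; do
    # Run a short remote command over the shared ControlMaster connection.
    # The command is passed as a plain trailing argument (ssh's -C flag only enables compression).
    ssh ${_controlPath} "${_host}" \
      "echo here: thread=$1 iter=$i \$(date -Is); sleep 0.1" \
    || {
      echo "ssh client (thread=$1 iter=$i) failed with result: $?"
      _exit
    }
  done
}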
_exit(){
  echo 'Forcing exit...'
  kill $(jobs -p) 2>/dev/null
}
_setup(){
  set -vx
  ssh-keygen -f ~/.ssh/id_rsa -N ''
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys
  ssh-keyscan "${_host}" >> ~/.ssh/known_hosts
  mkdir -p -- ~/.ssh/sockets
  chmod 700 -- ~/.ssh ~/.ssh/sockets
}
_runMaster(){
  date -Is
  # Default to 0 so that a clean exit is reported with a real status rather than an empty string.
  local sshResult=0
  ssh -vvvv -o 'ControlMaster=yes' ${_controlPath} \
    -N "${_host}" || sshResult=$?
  echo "ssh ControlMaster exited with result: $sshResult"
  date -Is
}
_runThreads(){
  for i in {1..10}; do
    _thread "$i" &
  done
  wait
}
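# "${1:-}" avoids an "unbound variable" abort under "set -u" when no argument is given.
case "${1:-}" in
  'setup')
    _setup
    ;;
  'master')
    _runMaster
    ;;
  'threads')
    _runThreads
    ;;
  *)
    echo "Usage: $0 {setup|master|threads}" >&2
    exit 1
    ;;
esac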
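
The header comments suggest re-running the "threads" mode until the master in the first window dies. A minimal sketch of automating that loop follows. It is not part of the original script: `repeat_while_healthy`, `check_fn`, and `work_fn` are hypothetical names, and the caller supplies both the workload and the health check.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): run a workload repeatedly
# and stop as soon as a caller-supplied health check fails.
#   $1 = maximum number of rounds
#   $2 = name of a command/function that returns 0 while the master is healthy
#   $3 = name of a command/function that runs one round of the workload
repeat_while_healthy() {
  local max_rounds=$1 check_fn=$2 work_fn=$3
  local round
  for (( round = 1; round <= max_rounds; round++ )); do
    # A failing workload alone does not end the loop; the health check decides.
    "${work_fn}" || true
    if ! "${check_fn}"; then
      echo "health check failed after round ${round}"
      return 1
    fi
  done
  echo "still healthy after ${max_rounds} rounds"
}

# Possible usage with this gist (assumes the master is running in another shell):
#   master_alive() { ssh -O check -o ControlPath=~/.ssh/sockets/%r@%h-%p localhost 2>/dev/null; }
#   run_threads()  { ./ssh-ControlMaster-test.sh threads; }
#   repeat_while_healthy 20 master_alive run_threads
```

The health check is kept separate from the workload because the "threads" mode ends with a bare `wait`, so its exit status does not reliably indicate whether the master survived. `ssh -O check` asks the master process behind the control socket whether it is still running.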

ziesemer commented Apr 7, 2016

Still broken as of openssh-6.6.1p1-25.el7_2.x86_64 under CentOS 7.2.1511.


ziesemer commented Dec 14, 2016

Further testing required - but this now appears fixed as of openssh-6.6.1p1-31.el7.x86_64 under CentOS 7.3.1611! I even cranked the script up to 1,000 iterations x 50 threads, and was unable to cause a ControlMaster failure. 😄
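
For anyone wanting to repeat the scaled-up run (1,000 iterations x 50 threads is 50,000 ssh sessions in total), here is a rough sketch of making the two counts parameters instead of editing the hard-coded loops. `run_load` and its argument order are assumptions for illustration, not something in the script above.

```shell
#!/bin/bash
# Hypothetical sketch (not in the original script): run a command from several
# concurrent background workers, a fixed number of times per worker.
#   $1 = number of concurrent workers ("threads")
#   $2 = iterations per worker
#   remaining arguments = command to run; it receives the worker number and
#   iteration number as two extra trailing arguments.
run_load() {
  local threads=$1 iterations=$2
  shift 2
  local t i
  for (( t = 1; t <= threads; t++ )); do
    (
      for (( i = 1; i <= iterations; i++ )); do
        # Stop this worker at its first failure.
        "$@" "$t" "$i" || exit 1
      done
    ) &
  done
  # Block until every background worker has finished.
  wait
}

# Possible usage, with a wrapper around the ssh call from _thread:
#   one_call() { ssh -o ControlPath=~/.ssh/sockets/%r@%h-%p localhost "echo thread=$1 iter=$2"; }
#   run_load 50 1000 one_call
```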

Looks like the fix was actually in -26 (which was never released for 7.2):

* Fri Apr 01 2016 Jakub Jelen <jjelen@redhat.com> 6.6.1p1-26 + 0.9.3-9
...
- Fix race condition between audit messages from different processes (#1310684)
...
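
A quick way to check a box against that changelog entry is to compare the package release number with 26. This is only a sketch: the function names are hypothetical, the threshold comes from the guess above rather than from a confirmed statement by Red Hat, and it only makes sense for the 6.6.1p1 el7 builds discussed here.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative only).

# Print the leading numeric part of the RPM release from a package string in
# name-version-release.arch form, e.g. "openssh-6.6.1p1-31.el7.x86_64" -> 31.
rpm_release_number() {
  local nvr=$1
  local release=${nvr##*-}   # everything after the last "-", e.g. "31.el7.x86_64"
  echo "${release%%.*}"      # everything before the first ".", e.g. "31"
}

# Return 0 if the release number is at least 26, the build whose changelog
# mentions the audit-message race fix. ASSUMPTION: valid only for the
# openssh-6.6.1p1 el7 package line.
has_suspected_fix() {
  local n
  n=$(rpm_release_number "$1")
  [[ $n =~ ^[0-9]+$ ]] && (( n >= 26 ))
}

# Possible usage on a CentOS/RHEL 7 host:
#   has_suspected_fix "$(rpm -q openssh)" && echo "release >= 26"
```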
