It looks like key rotation works fine for all OSDs; it fails at the first mgr instance (mgr.y).
2022-04-02T15:34:17.008 INFO:teuthology.orchestra.run.smithi146.stderr:++ ceph auth get-key osd.7
2022-04-02T15:34:17.365 INFO:teuthology.orchestra.run.smithi146.stdout:rotating key for mgr.y
2022-04-02T15:34:17.366 INFO:teuthology.orchestra.run.smithi146.stderr:+ NK=AQDjbEhir5HtORAAHbC3qXbxxwbs1GwYyInm2g==
2022-04-02T15:34:17.366 INFO:teuthology.orchestra.run.smithi146.stderr:+ '[' AQD1akhi07GlGhAAibYZDWhJKuspZgRDt//5vg== == AQDjbEhir5HtORAAHbC3qXbxxwbs1GwYyInm2g== ']'
2022-04-02T15:34:17.366 INFO:teuthology.orchestra.run.smithi146.stderr:+ for f in osd.0 osd.1 osd.2 osd.3 osd.4 osd.5 osd.6 osd.7 mgr.y mgr.x
2022-04-02T15:34:17.367 INFO:teuthology.orchestra.run.smithi146.stderr:+ echo 'rotating key for mgr.y'
2022-04-02T15:34:17.367 INFO:teuthology.orchestra.run.smithi146.stderr:++ ceph auth get-key mgr.y
2022-04-02T15:34:17.715 INFO:teuthology.orchestra.run.smithi146.stderr:+ K=AQD4aUhiCSlkCxAANVQf8PPD9eg8Cri81MR4eA==
2022-04-02T15:34:17.715 INFO:teuthology.orchestra.run.smithi146.stderr:+ NK=AQD4aUhiCSlkCxAANVQf8PPD9eg8Cri81MR4eA==
2022-04-02T15:34:17.716 INFO:teuthology.orchestra.run.smithi146.stderr:+ ceph orch daemon rotate-key mgr.y
The test got stuck on mgr.y for many hours:
2022-04-02T21:32:49.555 INFO:teuthology.orchestra.run.smithi146.stderr:++ ceph auth get-key mgr.y
2022-04-02T21:32:49.910 INFO:teuthology.orchestra.run.smithi146.stderr:+ NK=AQD4aUhiCSlkCxAANVQf8PPD9eg8Cri81MR4eA==
2022-04-02T21:32:49.910 INFO:teuthology.orchestra.run.smithi146.stderr:+ '[' AQD4aUhiCSlkCxAANVQf8PPD9eg8Cri81MR4eA== == AQD4aUhiCSlkCxAANVQf8PPD9eg8Cri81MR4eA== ']'
2022-04-02T21:32:49.910 INFO:teuthology.orchestra.run.smithi146.stderr:+ sleep 5
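For context, the bash loop in the test rotates each daemon's key and then polls `ceph auth get-key` every 5 seconds until the reported key changes; with mgr.y the key never changes, so the loop spins forever. A minimal Python sketch of that wait (the `get_key` callable standing in for `ceph auth get-key` is hypothetical):

```python
import time

def wait_for_new_key(get_key, old_key, interval=5, timeout=300):
    """Poll until the daemon's reported key differs from old_key,
    mirroring the bash loop's `ceph auth get-key` + `sleep 5`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        new_key = get_key()
        if new_key != old_key:
            return new_key
        time.sleep(interval)
    raise TimeoutError("key was never rotated")

# usage with a stand-in key source
keys = iter(["OLD", "OLD", "NEW"])
assert wait_for_new_key(lambda: next(keys), "OLD", interval=0) == "NEW"
```

In the failing run the mgr.y key source keeps returning the old value, which is exactly the `K == NK` comparison visible in the trace above.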
The issue left a trace in the manager's log:
2022-04-02T15:34:18.153+0000 7f7666b9b700 0 [cephadm ERROR root] Non-zero return from ['ceph', '-k', '/var/lib/ceph/mgr/ceph-y/keyring', '-n', 'mgr.y', 'tell', 'mgr.y', 'rotate-key', '-i', '-']: 2022-04-02T15:34:18.146+0000 7fdf9f7fe700 1 -- 172.21.15.146:0/952779436 <== mon.1 v2:172.21.15.146:3301/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 194+0+0 (secure 0 0 0) 0x7fdf900037b0 con 0x7fdfa0105fc0
Just for the sake of verifying that the issue is about rotations at ceph-mgr instances only, the mgr entries were removed from the test:
--- a/qa/suites/orch/cephadm/with-work/tasks/rotate-keys.yaml
+++ b/qa/suites/orch/cephadm/with-work/tasks/rotate-keys.yaml
@@ -3,7 +3,7 @@ tasks:
mon.a:
- |
set -ex
- for f in osd.0 osd.1 osd.2 osd.3 osd.4 osd.5 osd.6 osd.7 mgr.y mgr.x
+ for f in osd.0 osd.1 osd.2 osd.3 osd.4 osd.5 osd.6 osd.7
do
echo "rotating key for $f"
K=$(ceph auth get-key $f)
The first visible trace of the failure is that mgr.y refuses the ceph tell CLI client, spawned by its own cephadm module, due to a lack of credentials to contact the manager.
rzarzynski@teuthology:/a/yuriw-2022-04-02_14:54:16-orch-wip-yuri11-testing-2022-04-01-0842-distro-default-smithi/6773812$ less ./remote/smithi146/log/71f5957e-b298-11ec-8c36-001a4aab830c/ceph-mgr.y.log.gz
...
2022-04-02T15:34:18.037+0000 7f7666b9b700 0 [cephadm INFO root] Rotating authentication key for mgr.y
2022-04-02T15:34:18.037+0000 7f7666b9b700 0 log_channel(cephadm) log [INF] : Rotating authentication key for mgr.y
...
2022-04-02T15:34:18.044+0000 7f7666b9b700 0 [cephadm DEBUG root] exec: ceph -k /var/lib/ceph/mgr/ceph-y/keyring -n mgr.y tell mgr.y rotate-key -i -
2022-04-02T15:34:18.074+0000 7f76d0b11700 1 -- 172.21.15.146:0/958567540 --> [v2:172.21.15.146:3301/0,v1:172.21.15.146:6790/0] -- log(3 entries from seq 332 at 2022-04-02T15:34:17.995313+0000) v1 -- 0x55646b2b3180 con 0x5564617a9400
...
2022-04-02T15:34:18.150+0000 7f76d631c700 10 mgr.server ms_handle_authentication ms_handle_authentication new session 0x55646b2eaa20 con 0x55646b97b000 entity mgr.y addr
2022-04-02T15:34:18.151+0000 7f76d631c700 1 --2- v2:172.21.15.146:6800/907340359 >> 172.21.15.146:0/4073159835 conn(0x55646b97b000 0x55646b2f6c00 secure :-1 s=READY pgs=4 cs=0 l=1 rev1=1 crypto rx=0x55646b02890
0 tx=0x55646a7bbbc0 comp rx=0 tx=0).ready entity=client.15396 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2022-04-02T15:34:18.151+0000 7f769b60a700 1 -- v2:172.21.15.146:6800/907340359 <== client.15396 172.21.15.146:0/4073159835 1 ==== command(tid 0: {"prefix": "get_command_descriptions"}) v1 ==== 62+0+0 (secure 0
0 0) 0x55646af709a0 con 0x55646b97b000
2022-04-02T15:34:18.151+0000 7f769b60a700 10 mgr.server _handle_command decoded-size=1 prefix=get_command_descriptions
2022-04-02T15:34:18.151+0000 7f769b60a700 1 mgr.server log_access_denied access denied
2022-04-02T15:34:18.151+0000 7f769b60a700 0 log_channel(audit) log [INF] : from='client.15396 -' entity='mgr.y' cmd=[{"prefix": "get_command_descriptions"}]: access denied
2022-04-02T15:34:18.151+0000 7f769b60a700 -1 mgr.server reply reply (13) Permission denied access denied: does your client key have mgr caps? See http://docs.ceph.com/en/latest/mgr/administrator/#client-authentication
...
2022-04-02T15:34:18.153+0000 7f7666b9b700 0 [cephadm ERROR root] Non-zero return from ['ceph', '-k', '/var/lib/ceph/mgr/ceph-y/keyring', '-n', 'mgr.y', 'tell', 'mgr.y', 'rotate-key', '-i', '-']: 2022-04-02T15:34:18.146+0000 7fdf9f7fe700 1 -- 172.21.15.146:0/952779436 <== mon.1 v2:172.21.15.146:3301/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 194+0+0 (secure 0 0 0) 0x7fdf900037b0 con 0x7fdfa0105fc0
This particular client (ceph tell rotate-key) is called from CephadmOrchestrator._rotate_daemon_key():
def _rotate_daemon_key(self, daemon_spec: CephadmDaemonDeploySpec) -> str:
    self.log.info(f'Rotating authentication key for {daemon_spec.name()}')
    rc, out, err = self.mon_command({
        'prefix': 'auth get-or-create-pending',
        'entity': daemon_spec.entity_name(),
        'format': 'json',
    })
    j = json.loads(out)
    pending_key = j[0]['pending_key']

    # deploy a new keyring file
    if daemon_spec.daemon_type != 'osd':
        daemon_spec = self.cephadm_services[daemon_type_to_service(
            daemon_spec.daemon_type)].prepare_create(daemon_spec)
    CephadmServe(self)._create_daemon(daemon_spec, reconfig=True)

    # try to be clever, or fall back to restarting the daemon
    rc = -1
    if daemon_spec.daemon_type == 'osd':
        rc, out, err = self.tool_exec(
            args=['ceph', 'tell', daemon_spec.name(), 'rotate-stored-key', '-i', '-'],
            stdin=pending_key.encode()
        )
        if not rc:
            rc, out, err = self.tool_exec(
                args=['ceph', 'tell', daemon_spec.name(), 'rotate-key', '-i', '-'],
                stdin=pending_key.encode()
            )
    elif daemon_spec.daemon_type == 'mds':
        rc, out, err = self.tool_exec(
            args=['ceph', 'tell', daemon_spec.name(), 'rotate-key', '-i', '-'],
            stdin=pending_key.encode()
        )
    elif (
        daemon_spec.daemon_type == 'mgr'
        and daemon_spec.daemon_id == self.get_mgr_id()
    ):
        rc, out, err = self.tool_exec(
            args=['ceph', 'tell', daemon_spec.name(), 'rotate-key', '-i', '-'],
            stdin=pending_key.encode()
        )
    if rc:
        self._daemon_action(daemon_spec, 'restart')

    return f'Rotated key for {daemon_spec.name()}'
There are two interesting paths here:
1. the ceph tell path guarded by daemon_spec.daemon_id == self.get_mgr_id(), which fails because of the caps issue;
2. the daemon restart, which finally leads to the mgr fail mon command. It's a fallback for #1.
mon.a received the mgr fail from mgr.y and requested its standby:
rzarzynski@teuthology:/a/yuriw-2022-04-02_14:54:16-orch-wip-yuri11-testing-2022-04-01-0842-distro-default-smithi/6773812$ less ./remote/smithi146/log/71f5957e-b298-11ec-8c36-001a4aab830c/ceph-mon.a.log.gz
...
22-04-02T15:34:19.049+0000 7f70841e2700 2 mon.a@0(leader) e3 send_reply 0x55bddced7e00 0x55bddc92fa00 mon_command_ack([{"prefix": "mgr fail", "who": "y"}]=0 v21) v1
2022-04-02T15:34:19.049+0000 7f70841e2700 15 mon.a@0(leader) e3 send_reply routing reply to 172.21.15.146:0/958567540 via v2:172.21.15.146:3301/0 for request mon_command({"prefix": "mgr fail", "who": "y"} v 0) v1
2022-04-02T15:34:19.049+0000 7f70849e3700 1 -- [v2:172.21.15.146:3300/0,v1:172.21.15.146:6789/0] reap_dead start
2022-04-02T15:34:19.049+0000 7f70841e2700 1 -- [v2:172.21.15.146:3300/0,v1:172.21.15.146:6789/0] --> [v2:172.21.15.146:3301/0,v1:172.21.15.146:6790/0] -- route(mon_command_ack([{"prefix": "mgr fail", "who": "y"}]=0 v21) v1 tid 1258) v3 -- 0x55bddc6914a0 con 0x55bddc6fe400
2022-04-02T15:34:19.049+0000 7f70841e2700 0 log_channel(cluster) log [DBG] : mgrmap e21: x(active, starting, since 0.891945s), standbys: y
The mgr.y instance has actually restarted:
2022-04-02T15:34:19.092+0000 7f4a904141c0 0 ceph version 17.0.0-11449-g76cd7b38 (76cd7b384b2b46da30d83ba18f236e3ddc3bdb8a) quincy (dev), process ceph-mgr, pid 7
2022-04-02T15:34:19.092+0000 7f4a904141c0 0 pidfile_write: ignore empty --pid-file
2022-04-02T15:34:19.093+0000 7f4a904141c0 1 Processor -- start
2022-04-02T15:34:19.093+0000 7f4a904141c0 1 -- start start
2022-04-02T15:34:19.093+0000 7f4a904141c0 4 mgr init Registered monc callback
2022-04-02T15:34:19.093+0000 7f4a904141c0 1 --2- >> [v2:172.21.15.171:3300/0,v1:17
However, since then there has been no log entry from CephadmOrchestrator._rotate_daemon_key() while the test still continues.
The current hypothesis is:
- ceph-mgr has a concept of Standby mode which the mgr fail command triggers (so calling self._daemon_action(daemon_spec, 'restart') doesn't just restart the instance!),
- to handle operations a module must be compatible with such a mode,
- neither the cephadm nor the orchestrator module apparently is:
2022-04-02T15:34:20.130+0000 7f4a904141c0 4 mgr[py] Standby mode not provided by module 'orchestrator'
...
2022-04-02T15:34:25.252+0000 7f4a904141c0 4 mgr[py] Standby mode not provided by module 'cephadm'
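The behavior these messages point at can be modeled roughly as follows. This is a purely illustrative sketch of the hypothesis, not the real module loader (which inspects each module class for a standby counterpart); the dict-based module descriptions are invented for the example:

```python
# Hypothetical model: after "mgr fail" the daemon comes back in standby
# mode, and a module that does not provide a standby implementation is
# simply not run there -- so an in-flight
# CephadmOrchestrator._rotate_daemon_key() call never resumes.

def modules_active_in_standby(modules):
    """Keep only the modules that declared a standby mode."""
    return [m["name"] for m in modules if m["provides_standby_mode"]]

running = modules_active_in_standby([
    {"name": "cephadm", "provides_standby_mode": False},
    {"name": "orchestrator", "provides_standby_mode": False},
])
# neither module survives the transition to standby: running == []
```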
When analyzing the problem it has been found that:
- the missing pending_key rotation happens randomly, and
- it is associated with a lack of the KeyRing reload that should follow the mgr fail command.
The mgr fail command isn't a replacement for a daemon restart in every case. A successful KeyRing reload sequence happens as a result of MgrStandby::respawn() called from MgrStandby::handle_mgr_map():
2022-05-16T20:25:42.541+0200 7f1e847fb700 -1 mgr handle_mgr_map I was active but no longer am
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn e: '/home/rzarz/dev/ceph-2/build/bin/ceph-mgr'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn 0: '/home/rzarz/dev/ceph-2/build/bin/ceph-mgr'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn 1: '-i'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn 2: 'x'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn 3: '-c'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn 4: '/home/rzarz/dev/ceph-2/build/ceph.conf'
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn respawning with exe /home/rzarz/dev/ceph-2/build/bin/ceph-mgr
2022-05-16T20:25:42.541+0200 7f1e847fb700 1 mgr respawn exe_path /proc/self/exe
...
2022-05-16T20:25:42.609+0200 7f9022a84000 2 auth: KeyRing::load: loaded key file /home/rzarz/dev/ceph-2/build/dev/mgr.x/keyring this=0x7fff97afa940
2022-05-16T20:25:42.609+0200 7f9022a84000 2 auth: KeyRing::load: loaded key file /home/rzarz/dev/ceph-2/build/dev/mgr.x/keyring
2022-05-16T20:25:42.609+0200 7f9022a84000 2 auth: KeyRing this=0x7fff97afa940secret=AQDhloJifhkjCBAA5wPoYBrjskFYPK1IvXR4OQ==
However, this path is conditional and, in some situations, the respawn can be skipped:
void MgrStandby::handle_mgr_map(ref_t<MMgrMap> mmap)
{
  // ..
  bool need_respawn = py_module_registry.handle_mgr_map(map);
  if (need_respawn) {
    dout(1) << "respawning because set of enabled modules changed!" << dendl;
    respawn();
  }

  if (active_in_map) {
    if (!active_mgr) {
      dout(1) << "Activating!" << dendl;
      active_mgr.reset(new Mgr(&monc, map, &py_module_registry,
                               client_messenger.get(), &objecter,
                               &client, clog, audit_clog));
      active_mgr->background_init(new LambdaContext(
            [this](int r){
              // Advertise our active-ness ASAP instead of waiting for
              // next tick.
              std::lock_guard l(lock);
              send_beacon();
            }));
      dout(1) << "I am now activating" << dendl;
    } else {
      dout(10) << "I was already active" << dendl;
      bool need_respawn = active_mgr->got_mgr_map(map);
      if (need_respawn) {
        respawn();
      }
    }
    // ...
  } else if (active_mgr != nullptr) {
    derr << "I was active but no longer am" << dendl;
    respawn();
  } else {
    // ...
  }
}
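The branches above can be condensed into a small decision function. This is a simplified model of the excerpt, not the real code (it ignores the module-set and got_mgr_map() respawn triggers):

```python
def mgr_map_reaction(active_in_map: bool, have_active_mgr: bool) -> str:
    """Which path of MgrStandby::handle_mgr_map a new MgrMap takes
    (simplified model of the excerpt above)."""
    if active_in_map:
        if not have_active_mgr:
            return "activate"            # "Activating!"
        return "already-active"          # no respawn, keyring NOT reloaded
    if have_active_mgr:
        return "respawn"                 # "I was active but no longer am"
    return "stay-standby"

# the stalled case: the daemon stays active across the map change
assert mgr_map_reaction(True, True) == "already-active"
# the healthy failover case, which reloads the KeyRing via respawn()
assert mgr_map_reaction(False, True) == "respawn"
```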
This has indeed been observed, correlated with the key rotation stall:
2022-05-17T00:10:11.197+0200 7fa3d1e0a700 10 mgr handle_mgr_map I was already active
There is a race condition between the mgr fail handling and the mgr beacon handling:
diff --git a/src/mon/MgrMonitor.cc b/src/mon/MgrMonitor.cc
index 8ada44e2628..9000b2e0687 100644
--- a/src/mon/MgrMonitor.cc
+++ b/src/mon/MgrMonitor.cc
@@ -1203,7 +1203,9 @@ bool MgrMonitor::prepare_command(MonOpRequestRef op)
}
if (changed && pending_map.active_gid == 0) {
+ dout(5) << "========== changed and active_state == 0" << dendl;
promote_standby();
+ dout(5) << "========== after promote_standby: " << pending_map.active_gid << dendl;
}
} else if (prefix == "mgr module enable") {
string module;
2022-05-17T00:11:19.602+0200 7f6bd5769700 0 mon.a@0(leader) e1 handle_command mon_command({"prefix": "mgr fail", "who": "x"} v 0) v1
...
2022-05-17T00:11:19.614+0200 7f6bd5769700 5 mon.a@0(leader).mgr e25 ========== changed and active_state == 0
2022-05-17T00:11:19.614+0200 7f6bd5769700 5 mon.a@0(leader).mgr e25 ========== after promote_standby: 0
2022-05-17T00:11:19.614+0200 7f6bd5769700 4 mon.a@0(leader).mgr e25 prepare_command done, r=0
...
2022-05-17T00:11:19.630+0200 7f6bd5769700 4 mon.a@0(leader).mgr e25 selecting new active 4210 x (was 0 )
bool MgrMonitor::prepare_beacon(MonOpRequestRef op)
{
  // ...
  if (pending_map.active_gid == m->get_gid()) {
    // ...
  } else if (pending_map.active_gid == 0) {
    // There is no currently active daemon, select this one.
    if (pending_map.standbys.count(m->get_gid())) {
      drop_standby(m->get_gid(), false);
    }
    dout(4) << "selecting new active " << m->get_gid()
            << " " << m->get_name()
            << " (was " << pending_map.active_gid << " "
            << pending_map.active_name << ")" << dendl;
    pending_map.active_gid = m->get_gid();
    pending_map.active_name = m->get_name();
    pending_map.active_change = ceph_clock_now();
    // ...
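Putting the two handlers side by side shows the race. The sketch below is a deliberately reduced model of the excerpts (plain Python objects standing in for pending_map; only the relevant branches are kept): if mgr fail is processed while the standby set is still empty, promote_standby() is a no-op, and the very next beacon from the failed daemon re-selects it as active. From the daemon's perspective it "was already active", so respawn() (and with it the KeyRing reload) never happens:

```python
class PendingMap:
    def __init__(self, active_gid, active_name, standbys=None):
        self.active_gid = active_gid
        self.active_name = active_name
        self.standbys = dict(standbys or {})   # gid -> name

def mgr_fail(pm, who):
    """Reduced model of the 'mgr fail' branch of prepare_command()."""
    if pm.active_name == who:
        pm.active_gid, pm.active_name = 0, ''
    # promote_standby(): a no-op when the standby set is empty
    if pm.active_gid == 0 and pm.standbys:
        gid = next(iter(pm.standbys))
        pm.active_gid, pm.active_name = gid, pm.standbys.pop(gid)

def beacon(pm, gid, name):
    """Reduced model of prepare_beacon(): no active -> select the sender."""
    if pm.active_gid == 0:
        pm.standbys.pop(gid, None)
        pm.active_gid, pm.active_name = gid, name

pm = PendingMap(active_gid=4210, active_name='x')  # no standbys registered yet
mgr_fail(pm, 'x')              # "========== after promote_standby: 0"
assert pm.active_gid == 0
beacon(pm, 4210, 'x')          # "selecting new active 4210 x (was 0 )"
assert pm.active_name == 'x'   # same daemon active again -- no respawn
```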
Epoch 25 of the MgrMap, when handled at mgr.x, doesn't trigger the respawn() path:
2022-05-17T00:10:11.197+0200 7fa3d1e0a700 10 mgr ms_dispatch2 active mgrmap(e 25) v1
2022-05-17T00:10:11.197+0200 7fa3d1e0a700 4 mgr handle_mgr_map received map epoch 25
2022-05-17T00:10:11.197+0200 7fa3d1e0a700 4 mgr handle_mgr_map active in map: 1 active is 4210
2022-05-17T00:10:11.197+0200 7fa3d6613700 10 --2- 127.0.0.1:0/743576734 >> [v2:127.0.0.1:40929/0,v1:127.0.0.1:40930/0] conn(0x5592635ef400 0x5592635f6580 secure :-1 s=THROTTLE_DONE pgs=130 cs=0 l=1 rev1=1 crypto rx=0x55926362e810 tx=0x559263563b60 comp rx=0 tx=0).handle_read_frame_dispatch tag=17
2022-05-17T00:10:11.197+0200 7fa3d6613700 5 --2- 127.0.0.1:0/743576734 >> [v2:127.0.0.1:40929/0,v1:127.0.0.1:40930/0] conn(0x5592635ef400 0x5592635f6580 secure :-1 s=THROTTLE_DONE pgs=130 cs=0 l=1 rev1=1 crypto rx=0x55926362e810 tx=0x559263563b60 comp rx=0 tx=0).handle_message got 43089 + 0 + 0 byte message. envelope type=1796 src mon.0 off 0
2022-05-17T00:10:11.197+0200 7fa3d1e0a700 10 mgr handle_mgr_map I was already active
Local replicator