Skip to content

Instantly share code, notes, and snippets.

Created September 30, 2016 19:22
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save anonymous/a9e533a2646b70429b00287fd6245b6b to your computer and use it in GitHub Desktop.
Save anonymous/a9e533a2646b70429b00287fd6245b6b to your computer and use it in GitHub Desktop.
Salt trouble StolenToast
I can run a state from the master on a minion. The minion log clearly indicates it received the command to run, but then it can't return results back to the master.
I can also run a state from a minion. The master seems to receive part of the message, and renders the pillar in anticipation. But then the minion seems to hang on these two messages:
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
And the master, after a short "accepted authentication request from <nminion>" shows no related output. Eventually the job times out on the minion. Maybe the fileserver port is blocked, though I have no idea how that could have happened:
salt.exceptions.SaltReqTimeoutError: Message timed out
If the minion is already running a state and I try to run a new one then the master will receive the appropriate "state.sls already running" message.
I've seen the message about "maybe the master is not responding/is busy". I tried increasing the worker_threads but there was no difference. Also this configuration has been working for months now on an even larger environment without becoming bogged-down, since it's hosted on a pretty powerful server. It even gives me that message when I reduce my whole salt env to one minion, and there's no way it's overloaded trying to run a single pkg state on one minion.
Some things do return: grains.item will always work, as does test.ping.
Here's what happens if I try to run status.cpuinfo:
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'qcdi1401.jlab.org_master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] LazyLoaded local_cache.get_load
[DEBUG ] get_iter_returns for jid 20160930150148497574 sent to set(['farm10004.jlab.org']) will timeout at 15:01:53.502429
[DEBUG ] Checking whether jid 20160930150148497574 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'qcdi1401.jlab.org_master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] LazyLoaded no_return.output
Why is that 127.0.0.1 there?
[DEBUG ] Sending event - data = {'fun_args': ['20160930143811973252'], 'jid': '20160930143836997315', 'return': {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'pid': 19212, 'ret': '', 'user': 'sudo_seitz', 'arg
': ['software.mesa'], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'cmd': '_return', '_stamp': '2016-09-30T18:38:37.130360', 'fun': 'saltutil.find_job', 'id': 'farm10004.jlab.org'}
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:47.014065', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143847013734', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:47.014500', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job', 'minions
': ['farm10004.jlab.org']}
[INFO ] User sudo_seitz Published command saltutil.find_job with jid 20160930143847013734
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143847013734', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
...
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:42:12.267398', 'pretag': None, 'cmd': '_minion_event', 'tag': '_salt_error', 'data': {'message': 'The minion function caused an exception', 'args': ['The minion function caused a
n exception'], '_stamp': '2016-09-30T18:42:12.254144'}, 'id': 'farm10004.jlab.org'}
...
[INFO ] Authentication request from farm10004.jlab.org
[INFO ] Authentication accepted from farm10004.jlab.org
[DEBUG ] Sending event - data = {'id': 'farm10004.jlab.org', '_stamp': '2016-09-30T18:38:05.564489', 'result': True, 'pub': '-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAzd4x1Ak47bidlxAR1BhA\nMLkQdc+agRNNUh2d0
KkvS/vUV07JFRsqMA9za0M/prPfZ7PNhPq6pcR3jSAXoWeQ\nwjFP93Khw/z7U5FDWrKGRfkhM+QIbVbgBBlduqx//8BevV5MRqfhEvxitsAdBNHd\nXWWer0tZ9A+7wbj0wHebnmeVy0qkbtvVVBWF6uaWn8LV7v/g4osCYtoVRODh6I3w\nMg3vEIjGThwE48yygilj65NkLAEPXZn/5T1d9vk2u4aMl/pAJuDjeDyzI
muF2mua\nk3wp0gVAfyN74RNFwQZYMznvwupwgeKfsOJ3S893Ut+wLbQlh8QYXdMIsRnHuZj+\nbQIDAQAB\n-----END PUBLIC KEY-----', 'act': 'accept'}
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] compile template: /srv/salt/pillar/top.sls
[DEBUG ] Jinja search path: ['/srv/salt/pillar']
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/top.sls' using 'jinja' renderer: 0.00368285179138
[DEBUG ] Rendered data from file: /srv/salt/pillar/top.sls:
base:
'*':
- ssh
- ntp
- nagios
[DEBUG ] LazyLoaded config.get
[DEBUG ] Results of YAML rendering:
OrderedDict([('base', OrderedDict([('*', ['ssh', 'ntp', 'nagios'])]))])
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/top.sls' using 'yaml' renderer: 0.0463349819183
[DEBUG ] compound_match: farm10004.jlab.org ? *
[DEBUG ] compound_match farm10004.jlab.org ? "*" => "True"
[DEBUG ] compile template: /srv/salt/pillar/ssh.sls
[DEBUG ] Jinja search path: ['/srv/salt/pillar']
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/ssh.sls' using 'jinja' renderer: 0.00108504295349
[DEBUG ] Rendered data from file: /srv/salt/pillar/ssh.sls:
ssh_config: [165/1490]
ForwardAgent: yes
ForwardX11: yes
ForwardX11Trusted: yes
RSAAuthentication: yes
RhostsRSAAuthentication: yes
PasswordAuthentication: yes
HostbasedAuthentication: yes
GSSAPIAuthentication: yes
#GSSAPIDelegateCredentials: no
#GSSAPIKeyExchange: no
#GSSAPITrustDNS: no
BatchMode: no
CheckHostIP: yes
AddressFamily: 'any'
ConnectTimeout: 0
StrictHostKeyChecking: 'ask'
#IdentityFile: '~/.ssh/id_dsa'
Port: 22
Protocol: 2
#Cipher: 3des
#Ciphers: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc
#MACs: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-ripemd160
#EscapeChar: ~
Tunnel: no
TunnelDevice: 'any:any'
PermitLocalCommand: no
VisualHostKey: no
EnableSSHKeysign: yes
[DEBUG ] Results of YAML rendering:
OrderedDict([('ssh_config', OrderedDict([('ForwardAgent', True), ('ForwardX11', True), ('ForwardX11Trusted', True), ('RSAAuthentication', True), ('RhostsRSAAuthentication', True), ('PasswordAuthentication', True), ('HostbasedAuthenticatio
n', True), ('GSSAPIAuthentication', True), ('BatchMode', False), ('CheckHostIP', True), ('AddressFamily', 'any'), ('ConnectTimeout', 0), ('StrictHostKeyChecking', 'ask'), ('Port', 22), ('Protocol', 2), ('Tunnel', False), ('TunnelDevice',
'any:any'), ('PermitLocalCommand', False), ('VisualHostKey', False), ('EnableSSHKeysign', True)]))])
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/ssh.sls' using 'yaml' renderer: 0.00510287284851
[DEBUG ] compile template: /srv/salt/pillar/ntp.sls
[DEBUG ] Jinja search path: ['/srv/salt/pillar']
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/ntp.sls' using 'jinja' renderer: 0.00097393989563
[DEBUG ] Rendered data from file: /srv/salt/pillar/ntp.sls:
ntp:
ntp_conf: salt://ntp/ntp-client.conf
stepticker_conf: salt://ntp/stepticker.conf
lookup:
client: ntp
server: ntpd
service: ntpd
ntpdate: ntpdate
ntp_conf: /etc/ntp.conf
stepticker_conf: /etc/stepticker.conf
ntpd_conf: /etc/ntpd.conf
servers: ["qcdadm2.jlab.org","sca1401.jlab.org"]
tickers: ["qcdadm2.jlab.org","sca1401.jlab.org"]
[DEBUG ] Results of YAML rendering:
OrderedDict([('ntp', OrderedDict([('ntp_conf', 'salt://ntp/ntp-client.conf'), ('stepticker_conf', 'salt://ntp/stepticker.conf'), ('lookup', OrderedDict([('client', 'ntp'), ('server', 'ntpd'), ('service', 'ntpd'), ('ntpdate', 'ntpdate'), (
'ntp_conf', '/etc/ntp.conf'), ('stepticker_conf', '/etc/stepticker.conf'), ('ntpd_conf', '/etc/ntpd.conf')])), ('servers', ['qcdadm2.jlab.org', 'sca1401.jlab.org']), ('tickers', ['qcdadm2.jlab.org', 'sca1401.jlab.org'])]))])
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/ntp.sls' using 'yaml' renderer: 0.00352597236633
[DEBUG ] compile template: /srv/salt/pillar/nagios.sls
[DEBUG ] Jinja search path: ['/srv/salt/pillar']
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/nagios.sls' using 'jinja' renderer: 0.000830888748169
[DEBUG ] Rendered data from file: /srv/salt/pillar/nagios.sls:
nagios:
master: 129.57.100.22
[DEBUG ] Results of YAML rendering:
OrderedDict([('nagios', OrderedDict([('master', '129.57.100.22')]))])
[PROFILE ] Time (in seconds) to render '/srv/salt/pillar/nagios.sls' using 'yaml' renderer: 0.000947952270508
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:06.728452', 'pretag': None, 'cmd': '_minion_event', 'tag': 'minion_start', 'data': 'Minion farm10004.jlab.org started at Fri Sep 30 14:38:06 2016', 'id': 'farm10004.jlab.org'}
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:06.735992', 'pretag': None, 'cmd': '_minion_event', 'tag': 'salt/minion/farm10004.jlab.org/start', 'data': 'Minion farm10004.jlab.org started at Fri Sep 30 14:38:06 2016', 'id
': 'farm10004.jlab.org'}
[DEBUG ] LazyLoaded local_cache.prep_jid
[INFO ] Got return from farm10004.jlab.org for job 20160930143808286226
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143808286226', 'return': True, 'tgt': 'farm10004.jlab.org', 'schedule': '__mine_interval', 'cmd': '_return', 'pid': 19203, '_stamp': '2016-09-30T18:38:08.287483', 'arg
': [], 'fun': 'mine.update', 'id': 'farm10004.jlab.org'}
[DEBUG ] LazyLoaded local_cache.prep_jid
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:11.973532', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:11.973831', 'user': 'sudo_seitz', 'arg': ['software.mesa'], 'fun': 'state.sls', 'minions': ['farm10004.
jlab.org']}
[INFO ] User sudo_seitz Published command state.sls with jid 20160930143811973252
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['software.mesa'], 'fun': 'state.sls'}
[DEBUG ] LazyLoaded roots.envs
[DEBUG ] LazyLoaded local_cache.prep_jid
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:16.982424', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143816982144', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:16.982735', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job', 'minions
': ['farm10004.jlab.org']}
[INFO ] User sudo_seitz Published command saltutil.find_job with jid 20160930143816982144
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143816982144', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[DEBUG ] LazyLoaded local_cache.prep_jid
[INFO ] Got return from farm10004.jlab.org for job 20160930143816982144
[DEBUG ] Sending event - data = {'fun_args': ['20160930143811973252'], 'jid': '20160930143816982144', 'return': {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'pid': 19212, 'ret': '', 'user': 'sudo_seitz', 'arg
': ['software.mesa'], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'cmd': '_return', '_stamp': '2016-09-30T18:38:17.117202', 'fun': 'saltutil.find_job', 'id': 'farm10004.jlab.org'}
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:26.988799', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143826988511', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:26.989239', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job', 'minions
': ['farm10004.jlab.org']}
[INFO ] User sudo_seitz Published command saltutil.find_job with jid 20160930143826988511
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143826988511', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Got return from farm10004.jlab.org for job 20160930143826988511
[DEBUG ] Sending event - data = {'fun_args': ['20160930143811973252'], 'jid': '20160930143826988511', 'return': {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'pid': 19212, 'ret': '', 'user': 'sudo_seitz', 'arg
': ['software.mesa'], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'cmd': '_return', '_stamp': '2016-09-30T18:38:27.121532', 'fun': 'saltutil.find_job', 'id': 'farm10004.jlab.org'}
[DEBUG ] LazyLoaded local_cache.prep_jid
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:36.997605', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143836997315', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:36.997929', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job', 'minions
': ['farm10004.jlab.org']}
[INFO ] User sudo_seitz Published command saltutil.find_job with jid 20160930143836997315
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143836997315', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Got return from farm10004.jlab.org for job 20160930143836997315
[DEBUG ] Sending event - data = {'fun_args': ['20160930143811973252'], 'jid': '20160930143836997315', 'return': {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'pid': 19212, 'ret': '', 'user': 'sudo_seitz', 'arg
': ['software.mesa'], 'fun': 'state.sls'}, 'retcode': 0, 'success': True, 'cmd': '_return', '_stamp': '2016-09-30T18:38:37.130360', 'fun': 'saltutil.find_job', 'id': 'farm10004.jlab.org'}
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:47.014065', 'minions': ['farm10004.jlab.org']}
[DEBUG ] Sending event - data = {'tgt_type': 'glob', 'jid': '20160930143847013734', 'tgt': 'farm10004*', '_stamp': '2016-09-30T18:38:47.014500', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job', 'minions
': ['farm10004.jlab.org']}
[INFO ] User sudo_seitz Published command saltutil.find_job with jid 20160930143847013734
[DEBUG ] Published command details {'tgt_type': 'glob', 'jid': '20160930143847013734', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
...
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:42:12.267398', 'pretag': None, 'cmd': '_minion_event', 'tag': '_salt_error', 'data': {'message': 'The minion function caused an exception', 'args': ['The minion function caused a
n exception'], '_stamp': '2016-09-30T18:42:12.254144'}, 'id': 'farm10004.jlab.org'}
[ERROR ] Received minion error from [farm10004.jlab.org]: The minion function caused an exception
[DEBUG ] schedule.clean_proc_dir: checking job {'tgt_type': 'glob', 'jid': '20160930143601757136', 'tgt': 'farm10004*', 'pid': 19016, 'ret': '', 'user': 'sudo_seitz', 'arg': ['software.mesa'], 'fun': 'state.sls'} for process existence
[DEBUG ] LazyLoaded mine.update
[INFO ] Added mine.update to scheduler
[INFO ] Updating job settings for scheduled job: __mine_interval
[DEBUG ] SaltEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pub.ipc
[DEBUG ] SaltEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pull.ipc
[DEBUG ] Sending event - data = {'_stamp': '2016-09-30T18:38:06.688936', 'complete': True, 'schedule': {'__mine_interval': {'function': 'mine.update', 'jid_include': True, 'minutes': 60, 'maxrunning': 2}}}
[DEBUG ] Persisting schedule
[DEBUG ] Handling event '/salt/minion/minion_schedule_add_complete\n\n\x83\xa6_stamp\xba2016-09-30T18:38:06.688936\xa8complete\xc3\xa8schedule\x81\xaf__mine_interval\x84\xa8function\xabmine.update\xabjid_include\xc3\xa7minutes<\xaamaxru
nning\x02'
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[INFO ] Minion is ready to receive requests!
[INFO ] Running scheduled job: __mine_interval
[DEBUG ] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
[DEBUG ] schedule: This job was scheduled with a max number of 2
[DEBUG ] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20160930143807738858', 'pid': 19203, 'id': 'farm10004.jlab.org', 'schedule': '__mine_interval'}
[DEBUG ] MinionEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pub.ipc
[DEBUG ] MinionEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pull.ipc
[DEBUG ] Sending event - data = {'clear': False, 'cmd': '_mine', 'data': {}, 'id': 'farm10004.jlab.org', '_stamp': '2016-09-30T18:38:07.748542'}
[DEBUG ] Handling event '_minion_mine\n\n\x85\xa5clear\xc2\xa3cmd\xa5_mine\xa4data\x80\xa2id\xb2farm10004.jlab.org\xa6_stamp\xba2016-09-30T18:38:07.748542'
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] schedule.handle_func: Removing /var/cache/salt/minion/proc/20160930143807738858
[INFO ] User sudo_seitz Executing command state.sls with jid 20160930143811973252
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20160930143811973252', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['software.mesa'], 'fun': 'state.sls'}
[INFO ] Starting a new job with PID 19212
[DEBUG ] LazyLoaded state.sls
[DEBUG ] LazyLoaded saltutil.is_running
[DEBUG ] LazyLoaded grains.get
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[INFO ] User sudo_seitz Executing command saltutil.find_job with jid 20160930143816982144
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20160930143816982144', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 19217
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] Minion return retry timer set to 7 seconds (randomized)
[INFO ] Returning information for job: 20160930143816982144
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[INFO ] User sudo_seitz Executing command saltutil.find_job with jid 20160930143826988511
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20160930143826988511', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 19223
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] Minion return retry timer set to 9 seconds (randomized)
[INFO ] Returning information for job: 20160930143826988511
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[INFO ] User sudo_seitz Executing command saltutil.find_job with jid 20160930143836997315
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20160930143836997315', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 19231
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] Minion return retry timer set to 7 seconds (randomized)
[INFO ] Returning information for job: 20160930143836997315
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[INFO ] User sudo_seitz Executing command saltutil.find_job with jid 20160930143847013734
[DEBUG ] Command details {'tgt_type': 'glob', 'jid': '20160930143847013734', 'tgt': 'farm10004*', 'ret': '', 'user': 'sudo_seitz', 'arg': ['20160930143811973252'], 'fun': 'saltutil.find_job'}
[INFO ] Starting a new job with PID 19238
[DEBUG ] LazyLoaded saltutil.find_job
[DEBUG ] Minion return retry timer set to 6 seconds (randomized)
[INFO ] Returning information for job: 20160930143847013734
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] SaltReqTimeoutError, retrying. (1/3)
[DEBUG ] SaltReqTimeoutError, retrying. (2/3)
[DEBUG ] SaltReqTimeoutError, retrying. (3/3)
[WARNING ] The minion failed to return the job information for job 20160930143847013734. This is often due to the master being shut down or overloaded. If the master is running consider increasing the worker_threads value.
[DEBUG ] SaltReqTimeoutError, retrying. (1/3)
[DEBUG ] SaltReqTimeoutError, retrying. (2/3)
[DEBUG ] SaltReqTimeoutError, retrying. (3/3)
[WARNING ] The minion function caused an exception
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/salt/minion.py", line 1039, in _thread_return
return_data = func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/salt/modules/state.py", line 699, in sls
st_ = salt.state.HighState(opts, pillar, kwargs.get('__pub_jid'), proxy=__proxy__)
File "/usr/lib/python2.6/site-packages/salt/state.py", line 3169, in __init__
BaseHighState.__init__(self, opts)
File "/usr/lib/python2.6/site-packages/salt/state.py", line 2264, in __init__
self.avail = self.__gather_avail()
File "/usr/lib/python2.6/site-packages/salt/state.py", line 2273, in __gather_avail
for saltenv in self._get_envs():
File "/usr/lib/python2.6/site-packages/salt/state.py", line 2329, in _get_envs
client_envs = self.client.envs()
File "/usr/lib/python2.6/site-packages/salt/fileclient.py", line 1249, in envs
return self.channel.send(load)
File "/usr/lib/python2.6/site-packages/salt/utils/async.py", line 73, in wrap
ret = self._block_future(ret)
File "/usr/lib/python2.6/site-packages/salt/utils/async.py", line 83, in _block_future
return future.result()
File "/usr/lib64/python2.6/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.6/site-packages/salt/transport/zeromq.py", line 231, in send
ret = yield self._crypted_transfer(load, tries=tries, timeout=timeout)
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.6/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.6/site-packages/salt/transport/zeromq.py", line 199, in _crypted_transfer
ret = yield _do_transfer()
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.6/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.6/site-packages/salt/transport/zeromq.py", line 185, in _do_transfer
tries=tries,
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 870, in run
File "/usr/lib64/python2.6/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.6/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
SaltReqTimeoutError: Message timed out
[DEBUG ] SaltEvent PUB socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pub.ipc
[DEBUG ] SaltEvent PULL socket URI: ipc:///var/run/salt/minion/minion_event_ad027328e7_pull.ipc
[DEBUG ] Sending event - data = {'message': u'The minion function caused an exception', 'args': ('The minion function caused an exception',), '_stamp': '2016-09-30T18:42:12.254144'}
[DEBUG ] Handling event "_salt_error\n\n\x83\xa7message\xda\x00'The minion function caused an exception\xa4args\x91\xda\x00'The minion function caused an exception\xa6_stamp\xba2016-09-30T18:42:12.254144"
[DEBUG ] Forwarding salt error event tag=_salt_error
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] Minion return retry timer set to 5 seconds (randomized)
[INFO ] Returning information for job: 20160930143811973252
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506', 'aes')
[DEBUG ] Initializing new SAuth for ('/etc/salt/pki/minion', 'farm10004.jlab.org', 'tcp://129.57.100.19:4506')
[DEBUG ] SaltReqTimeoutError, retrying. (1/3)
[DEBUG ] SaltReqTimeoutError, retrying. (2/3)
[DEBUG ] SaltReqTimeoutError, retrying. (3/3)
[WARNING ] The minion failed to return the job information for job 20160930143811973252. This is often due to the master being shut down or overloaded. If the master is running consider increasing the worker_threads value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment