Full output from running the tolerable Ansible playbook
In this example, we reboot the target machine, wait for it to come back up, reboot it again, and finally create the file the playbook is waiting for.
Lines starting with >>>>> are indicators of what was happening to the machine, not Ansible output.
PLAY [Try to survive and detect a reboot] *******************************************************************
TASK [include_tasks] *******************************************************************
included: run_check_test.yml for 192.168.0.31
>>>>> BEGIN MACHINE REBOOT <<<<<
TASK [Check for the file] *******************************************************************
FAILED - RETRYING: Check for the file (2 retries left).
fatal: [192.168.0.31]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.0.31 port 22: Connection refused", "skip_reason": "Host 192.168.0.31 is unreachable", "unreachable": true}
TASK [Sleep if the host was unreachable] *******************************************************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [192.168.0.31 -> localhost]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
included: run_check_test.yml for 192.168.0.31
TASK [Check for the file] *******************************************************************
fatal: [192.168.0.31]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.0.31 port 22: Operation timed out", "skip_reason": "Host 192.168.0.31 is unreachable", "unreachable": true}
TASK [Sleep if the host was unreachable] *******************************************************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [192.168.0.31 -> localhost]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
included: run_check_test.yml for 192.168.0.31
TASK [Check for the file] *******************************************************************
fatal: [192.168.0.31]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.0.31 port 22: Operation timed out", "skip_reason": "Host 192.168.0.31 is unreachable", "unreachable": true}
TASK [Sleep if the host was unreachable] *******************************************************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [192.168.0.31 -> localhost]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
included: run_check_test.yml for 192.168.0.31
>>>>> MACHINE HAS BOOTED <<<<<
TASK [Check for the file] *******************************************************************
FAILED - RETRYING: Check for the file (2 retries left).
FAILED - RETRYING: Check for the file (1 retries left).
>>>>> BEGIN MACHINE REBOOT <<<<<
fatal: [192.168.0.31]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.0.31 port 22: Connection refused", "skip_reason": "Host 192.168.0.31 is unreachable", "unreachable": true}
TASK [Sleep if the host was unreachable] *******************************************************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [192.168.0.31 -> localhost]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
included: /Users/jowestco/GIT/github/john-westcott-iv/ansible-survive-disconnect/run_check_test.yml for 192.168.0.31
TASK [Check for the file] *******************************************************************
fatal: [192.168.0.31]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.0.31 port 22: Operation timed out", "skip_reason": "Host 192.168.0.31 is unreachable", "unreachable": true}
TASK [Sleep if the host was unreachable] *******************************************************************
Pausing for 3 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [192.168.0.31 -> localhost]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
included: run_check_test.yml for 192.168.0.31
TASK [Check for the file] *******************************************************************
FAILED - RETRYING: Check for the file (2 retries left).
>>>>> CREATE FILE BEING MONITORED FOR <<<<<
ok: [192.168.0.31]
TASK [Sleep if the host was unreachable] *******************************************************************
skipping: [192.168.0.31]
TASK [set_fact] *******************************************************************
ok: [192.168.0.31]
TASK [include_tasks] *******************************************************************
skipping: [192.168.0.31]
TASK [Fail if we didn't get the file] *******************************************************************
skipping: [192.168.0.31]
TASK [Post task] *******************************************************************
ok: [192.168.0.31] => {
"msg": "This is the post task"
}
PLAY RECAP *******************************************************************
192.168.0.31 : ok=20 changed=0 unreachable=5 failed=0 skipped=8 rescued=0 ignored=0
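
The task names in the output above suggest a task file that checks for the file while tolerating an unreachable host, then pauses before the next attempt. A minimal sketch of those two tasks follows. The path `/tmp/target_file`, the registered variable `file_check`, and the `retries`/`delay` values are assumptions for illustration; they are not taken from the original playbook.

```yaml
# run_check_test.yml (hypothetical sketch, not the author's exact file)

- name: Check for the file
  stat:
    path: /tmp/target_file          # assumed path
  register: file_check
  # Poll while the host is reachable but the file is still absent.
  until: file_check.stat is defined and file_check.stat.exists
  retries: 2                        # assumed value
  delay: 5                          # assumed value, in seconds
  ignore_errors: true               # keep going if the retries run out
  ignore_unreachable: true          # keep the host in the play if SSH is down

- name: Sleep if the host was unreachable
  pause:
    seconds: 3
  delegate_to: localhost
  # Only wait when the check above could not reach the host.
  when: file_check.unreachable | default(false)
```

The `ignore_unreachable` keyword is what lets the play continue past an `UNREACHABLE!` result instead of dropping the host for the rest of the run.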

tman77 commented Jun 18, 2020

@john-westcott-iv from your recent ansible post: https://www.ansible.com/blog/tolerable-ansible

What is controlling the decrement of the safety_counter? You set it to 5 in run_check_test.yml, but I'm not seeing what decrements it.

Wouldn't you need something like this in main.yml:
- set_fact:
    safety_counter: "5"

and then something like this in run_check_test.yml:
- set_fact:
    safety_counter: "{{ safety_counter | int - 1 }}"


Lorref commented Jun 19, 2020

I was thinking the same thing; I expected the safety_counter to be decremented somewhere.


john-westcott-iv commented Jun 19, 2020

@tman77 @Lorref Thanks for pointing that out. That step got mangled somewhere in the publishing process. It should look like this:

# decrement a safety counter so we don't end up in an infinite loop
- set_fact:
    saftey_counter: "{{ (saftey_counter | default(5) | int) - 1 }}"
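
To show where the decrement could fit, here is a hedged sketch of the counter gating the recursive include. Only the `set_fact` task comes from the comment above, and the file name `run_check_test.yml` and the task name come from the playbook output. The registered variable `file_check`, the `when` conditions (including the `> 0` comparison), and the failure message are assumptions.

```yaml
# Tail of run_check_test.yml (hypothetical sketch)

# decrement a safety counter so we don't end up in an infinite loop
- set_fact:
    saftey_counter: "{{ (saftey_counter | default(5) | int) - 1 }}"

# Re-run this same task file while the file is still missing
# and the counter has not run out.
- include_tasks: run_check_test.yml
  when:
    - not (file_check.stat is defined and file_check.stat.exists)
    - saftey_counter | int > 0      # assumed comparison

# In the main playbook, after the chain of includes returns
# (hypothetical placement):
- name: Fail if we didn't get the file
  fail:
    msg: "The file never appeared"  # assumed message
  when: not (file_check.stat is defined and file_check.stat.exists)
```

Because each pass through the file lowers the counter by one, the second condition eventually becomes false and the recursion stops even if the file never shows up.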