@miminar
Created January 25, 2018 12:35
OpenShift Ansible troubleshooting

Too restrictive security group

Version: OCP 3.7, openshift-ansible-3.7.22-1-9-g56970a0
Environment: Amazon AWS cluster
Scenario: 1 master&node + 2 nodes; glusterfs

Error

TASK [openshift_storage_glusterfs : Verify heketi service] **************************************************************************************************************************************************************
fatal: [ec2-54-237-234-66.compute-1.amazonaws.com]: FAILED! => {"changed": false, "cmd": ["oc", "rsh", "--namespace=default", "deploy-heketi-storage-1-vkfwm", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "4Jl50v50d9BtfTbxXI1LWeikIH9dfWGe559rZlq2Gz8=", "cluster", "list"], "delta": "0:02:07.485073", "end": "2018-01-25 05:26:51.447002", "msg": "non-zero return code", "rc": 1, "start": "2018-01-25 05:24:43.961929", "stderr": "Error from server: error dialing backend: dial tcp 172.18.12.254:10250: getsockopt: connection timed out", "stderr_lines": ["Error from server: error dialing backend: dial tcp 172.18.12.254:10250: getsockopt: connection timed out"], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/home/ec2-user/sapvora/openshift-ansible/playbooks/byo/config.retry

Investigation

The master node could not reach the kubelets (port 10250) at all:

# executed from the master node
$ curl -k https://172.18.12.254:10250/healthz
curl: (7) Failed connect to 172.18.12.254:10250; Connection timed out
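
The same check can be repeated for every node in one go. The following is a minimal sketch, not part of the original troubleshooting session; the function names port_open and check_kubelets are made up for illustration, and it assumes bash and the coreutils timeout command are available on the master.

```shell
# Probe a TCP port with a short timeout using bash's built-in /dev/tcp.
# Returns 0 if the connection opens, non-zero otherwise.
port_open() {
    local host=$1 port=$2 secs=${3:-3}
    timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report kubelet reachability (TCP port 10250) for every address given.
check_kubelets() {
    local node
    for node in "$@"; do
        if port_open "$node" 10250; then
            echo "$node: kubelet port reachable"
        else
            echo "$node: kubelet port NOT reachable"
        fi
    done
}
```

Run from the master, for example as check_kubelets 172.18.12.254, followed by the addresses of the remaining nodes. Unlike the curl call, this only tests whether the TCP connection can be opened; it does not query /healthz.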

Problem

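All the EC2 instances were in the same VPC network. The problem was the security group (public-http), which allowed only the following inbound traffic and therefore blocked the kubelet port 10250, even between instances inside the VPC: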

Type             Protocol  Port Range  Source     Description
SSH              TCP       22          0.0.0.0/0
HTTP             TCP       80          0.0.0.0/0
HTTPS            TCP       443         0.0.0.0/0
Custom TCP Rule  TCP       8000        0.0.0.0/0
Custom TCP Rule  TCP       8443        0.0.0.0/0
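
To make the gap explicit, the allowed ports can be compared against the ports the cluster needs. This is an illustrative sketch only: missing_ports is a hypothetical helper, the allowed list is copied by hand from the inbound rules listed above, and 10250 is the only required port taken from the error message (a real cluster needs more).

```shell
# Inbound TCP ports allowed by the public-http group, copied from the rules above.
ALLOWED_PORTS="22 80 443 8000 8443"

# Print each port given as an argument that is absent from ALLOWED_PORTS.
missing_ports() {
    local port allowed found
    for port in "$@"; do
        found=no
        for allowed in $ALLOWED_PORTS; do
            if [ "$port" = "$allowed" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "$port"
        fi
    done
}

missing_ports 8443 10250   # prints 10250
```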

Resolution

After adding the default security group, with all inbound traffic allowed, to the instances, the nodes could talk to each other again.
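
A narrower alternative, which is not what was done here, would be to open only the kubelet port between members of the node security group. The sketch below only prints the AWS CLI call instead of running it; the group ID is a placeholder, allow_intra_group_port is a hypothetical helper, and other cluster ports (SDN, etcd, GlusterFS) would need similar rules.

```shell
# Placeholder ID; substitute the real security group ID of the cluster nodes.
NODE_SG="sg-0123456789abcdef0"

# Print (rather than run) the AWS CLI call that would allow TCP traffic on the
# given port between instances that belong to the same security group.
allow_intra_group_port() {
    local group=$1 port=$2
    echo aws ec2 authorize-security-group-ingress \
        --group-id "$group" \
        --protocol tcp \
        --port "$port" \
        --source-group "$group"
}

allow_intra_group_port "$NODE_SG" 10250
```

Dropping the echo would execute the command, which requires configured AWS credentials with permission to modify the group.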
