@qdzlug
Created March 7, 2017 19:49
Current state

SWSUP-665 describes the problem. In short:

  • OS-5950 was fixed so that we attempt to start metadata once we see the VM go 'running'
  • if qemu comes up too slowly, we may see 'running' before the socket is actually usable, in which case we get ECONNREFUSED
  • when we hit ECONNREFUSED on the initial connection, we rely on the periodic (every minute) retry
  • by the time we retry, Ubuntu zones with broken cloud-init (all of them, see IMAGE-1014) will have gotten stuck and will never properly recover

In order to work around this, we'll need to either:

  • have metadata retry soon after ECONNREFUSED, rather than waiting for the periodic timer
  • have some other mechanism that notifies the metadata agent that the qemu socket is ready to be used
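
The first option could look roughly like the sketch below. It is illustrative only and is not the metadata agent's actual code: the function names, the retry limits, the backoff values, and the assumption that the qemu socket is a UNIX stream socket at a known path are all hypothetical. The idea is to retry quickly with a short, capped backoff only when the error is ECONNREFUSED, and to let any other error propagate.

```python
import errno
import socket
import time


def _connect_unix(path):
    """Open a UNIX stream socket to `path` (hypothetical stand-in for the qemu socket)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    return sock


def connect_with_retry(path, max_attempts=10, initial_delay=0.1, max_delay=2.0,
                       connect=_connect_unix, sleep=time.sleep):
    """Connect to `path`, retrying with capped exponential backoff on ECONNREFUSED.

    All defaults are illustrative assumptions, not values taken from the agent.
    Errors other than ECONNREFUSED are raised immediately; ECONNREFUSED is
    raised only once `max_attempts` attempts have failed.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect(path)
        except OSError as err:
            # Only "socket not ready yet" is worth a fast retry.
            if err.errno != errno.ECONNREFUSED or attempt == max_attempts:
                raise
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

With these assumed defaults the fast retries cover only the first several seconds after the VM reports 'running'. If they are all refused, the error is raised and the existing periodic retry would still act as the fallback.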