@re5et
Created February 5, 2015 19:30
The reasons we set START_CTX in our unicorn config
atom
so @keefe, the deal with the unicorn restart problem is this:
1) When restarting, unicorn takes the path of the unicorn_rails binary from $0
2) capistrano release rotation changes that location
3) unicorn expects the binary it is using to stay in the same spot ($0)
so after: https://github.com/Bluescape/thoughtstream-chef/pull/278 and https://github.com/substantial/cookbook-rails_nginx_unicorn/pull/12
we are setting it to the shared bin stub at /var/www/thoughtstream-user_management/shared/vendor/bundle/ruby/2.1.0/bin/unicorn_rails
but you can only do that with the extremely hacky looking (but recommended by the author) Unicorn::HttpServer::START_CTX[0] = "/var/www/thoughtstream-user_management/shared/vendor/bundle/ruby/2.1.0/bin/unicorn_rails"
without forcing that, we were getting a unicorn rails bin path of something like: "/var/www/thoughtstream-user_management/releases/20150204232732/vendor/bundle/ruby/2.1.0/bin/unicorn_rails"
which stays in the running master, and is handed on to the next
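(for reference, the override lands in the unicorn config roughly like this; a sketch built from the line quoted above, the exact file in those PRs may differ:)
# config/unicorn.rb (sketch)
# pin the binary unicorn re-execs on restart to the shared binstub, so the
# path survives capistrano release rotation instead of pointing into a
# numbered release that eventually gets cleaned up
Unicorn::HttpServer::START_CTX[0] =
  "/var/www/thoughtstream-user_management/shared/vendor/bundle/ruby/2.1.0/bin/unicorn_rails"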
keefe
oh geez
atom
so when capistrano eventually cleans up that release, it has nothing to restart
it throws
the old stuff keeps working
keefe
I thought we’d had another issue with these restarts and some path location that we’d fixed as well… so unicorn needs the old binary to be in the same spot to restart and change over to the new one?
atom
well that is no problem since we switched to a shared bundle
the deploy updates the shared bundle before the restart
so it always gets the latest
it does mean that a rollback needs a bundle install
it is what we used to do, but then we switched to the new deployment method, and probably re-caused the issue
you can see the duration of building new bundles here instead of using shared: https://teamcity.bluescape.com/viewType.html?buildTypeId=bt130&tab=buildTypeStatistics
actually look at quarter or year for duration
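(the shared bundle itself is just deploy-time symlinking; a rough capistrano-style sketch of the idea, the real wiring is in the cookbook PR linked above:)
# config/deploy.rb (sketch, assuming capistrano 3-style linked_dirs)
# symlink each release's vendor/bundle to shared/vendor/bundle, so every
# deploy's bundle install updates one shared bundle before the restart
set :linked_dirs, fetch(:linked_dirs, []).push("vendor/bundle")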
keefe
oh ok, I understand: the bin for unicorn itself was in a per-build bundle, and when the 5th deploy happened that bundle was gone, so now we share them
atom
it is once per change to the Gemfile.lock version of unicorn
which has only happened once, which was for these changes
we actually shared them before, but unicorn was storing the expanded path of what it was executing, which included the release number
keefe
so even in the shared bundle when we update to the new version, it doesn’t have the old binary anymore so it fails to restart...
atom
it doesn't replace the old one
they both stay, but the one that ends up at vendor/bundle/ruby/2.1.0/bin/unicorn_rails is a wrapper for the latest bundled version
this is where I found our problems: https://github.com/defunkt/unicorn/blob/master/lib/unicorn/http_server.rb#L53
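(paraphrasing what that code does from memory, check the link for the exact version:)
# lib/unicorn/http_server.rb, roughly (paraphrased, not verbatim):
START_CTX = {
  :argv => ARGV.map { |arg| arg.dup },
  0 => $0.dup,   # path of the binary we were started with, reused on re-exec
}
# the working directory prefers ENV['PWD'] so a capistrano "current" symlink
# is preserved for cwd, but the $0 path gets no such symlink-aware treatment
START_CTX[:cwd] = begin
  a = File.stat(pwd = ENV['PWD'])
  b = File.stat(Dir.pwd)
  a.ino == b.ino && a.dev == b.dev ? pwd : Dir.pwd
rescue
  Dir.pwd
end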
keefe
sorry, catching up on this ruby stuff… I’m confused about why, after using a shared bundle, the running binary was including the timestamped path, unless that only manifested once we switched to a shared bundle and the bug showed up when the non-shared bundle deploy which was running was cleaned up?
so this change https://github.com/Bluescape/thoughtstream-chef/pull/278/files will let us upgrade ruby without downtime right? otherwise that path stays constant in the unicorn master
so we’d switched from a non-shared bundle install to a shared bundle install, so would a hard restart have resolved that bit?
atom
a hard restart (stop / start) should always work fine
the zero downtime reload / restart is where the problem lives
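(for context, the zero-downtime path is unicorn's USR2 re-exec dance; the pid file path below is a stand-in, not our actual restart script:)
# rough shape of a zero-downtime unicorn restart:
master_pid = File.read("/var/www/thoughtstream-user_management/shared/pids/unicorn.pid").to_i
Process.kill(:USR2, master_pid)      # master re-execs itself using START_CTX
# ... new master boots and spawns workers; the old master's pid file is
# renamed to unicorn.pid.oldbin ...
old_master_pid = File.read("/var/www/thoughtstream-user_management/shared/pids/unicorn.pid.oldbin").to_i
Process.kill(:QUIT, old_master_pid)  # old master shuts down gracefully
# the USR2 re-exec is the step that blows up once START_CTX[0] has been
# deleted by capistrano's release cleanup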
keefe
I meant, would one hard restart after switching to shared bundles have permitted future zero downtime restarts to work?
atom
no
keefe
why not?
atom
because of bundler
when you do "bundle exec unicorn"
keefe
cause the shared bundle location stays the same right?
atom
it resolves the full path to where it thinks you should be getting the binary from
which will be relative to its pwd
inside of a release
which is numbered
so when unicorn first starts, it will store the location that bundle tells it
and it keeps that one
because of the $0
keefe
so it doesn’t resolve the symlink at /var/www/thoughtstream-user_management/releases/20150128175206/vendor/bundle -> /var/www/thoughtstream-user_management/shared/vendor/bundle
it just stores a hard /var/www/thoughtstream-user_management/releases/20150128175206/vendor/bundle/ruby/2.1.0/etc
atom
deploy@acceptance:~$ cd /var/www/thoughtstream-user_management/current && bundle exec which unicorn
/var/www/thoughtstream-user_management/releases/20150205030455/vendor/bundle/ruby/2.1.0/bin/unicorn
this is a compound problem created by unicorn assumptions, capistrano release rotation, and bundler path expansion
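(a toy illustration of the difference, run from the current release directory; this is plain ruby path handling, not the actual bundler code:)
# plain expansion keeps the numbered release that $0 ends up with:
File.expand_path("vendor/bundle/ruby/2.1.0/bin/unicorn_rails", Dir.pwd)
# => "/var/www/thoughtstream-user_management/releases/20150205030455/vendor/bundle/ruby/2.1.0/bin/unicorn_rails"
# only explicit symlink resolution lands in the shared location:
File.realpath("vendor/bundle/ruby/2.1.0/bin/unicorn_rails")
# => "/var/www/thoughtstream-user_management/shared/vendor/bundle/ruby/2.1.0/bin/unicorn_rails"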
keefe
got it so even though it’s shared and that resolves to the shared bundle location, it gets confused due to the path expansion - it doesn’t know that /var/www/thoughtstream-user_management/releases/20150205030455/vendor/bundle/ruby/2.1.0/bin/unicorn is the same as /var/www/thoughtstream-user_management/shared/vendor/bundle/ruby/2.1.0/bin/unicorn
so we’re telling it that explicitly
what a mess...
thanks for explanation
atom
yeah, we just make sure it knows exactly where the thing always is
keefe
it’s counter intuitive that we use a shared bundle and bundle exec doesn’t resolve to the real shared location on disk
atom
well the "shared" part of the bundle is not a bundler concept, that is just fancy symlinking during deploy
keefe
ah got it
atom
quite a mess.