tknerr/disqussion

## disqussion

--------------------------------
Torben Knerr     • 19 days ago
Nice post! Finally some clarification on the different interpretations of what an application, library or wrapper cookbook is :-)

I have one question about the environment cookbooks though: what if you have two different applications (e.g. my_face and your_face) requiring a different set of dependencies running in the same environment?

IMHO locking dependency versions and exposing the public recipes / attributes via metadata.rb is a great thing, but it should be doable on a finer-grained basis than environments, namely on a per application (e.g. my_face) basis. For this exact reason I'm not using chef environments to lock the dependency versions, but rather something like an "uber application cookbook" which combines the properties of an application cookbook and the locking and documentation of an environment cookbook.

What do you think? And what would be a good name for it?


--------------------------------
Jamie Winsor     • 19 days ago
Thanks for the kind words Torben.

What you're describing is just an Environment Cookbook that is composed of two other Application or Environment Cookbooks. I didn't describe that exact edge case here because I can't identify anything that would compel me to put both applications in a single environment. It's just easier to place them in two different environments instead of making this meta Environment Cookbook.


--------------------------------
Torben Knerr     • 17 days ago
Hi Jamie,

I'm still a bit confused about the usage of Chef environments here. If you have one environment per application just for the purpose of locking the transitive cookbook dependency versions of your application, that is a big smell IMHO.

Instead you should be able to have multiple applications (or rather the cookbooks deploying them) in the same environment, e.g.:

prod
- my_face 1.2
- your_face 2.0

test
- my_face 1.3
- your_face 2.1

In the example above you have the (chef) environments "prod" and "test" with different versions of the "my_face" and "your_face" cookbooks.

These can be represented 1:1 as Chef environments. The essential thing is that you only specify the version of the "top-level cookbook" (and that's probably a much better name than "uber application cookbook" ;-)). You don't, and really should not, have to care about the transitive dependencies here, as they are all locked in the top-level cookbook's metadata.rb and Berksfile.lock respectively.
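A minimal sketch of what that 1:1 representation could look like, using the versions from the example above. `name` and `cookbook` are part of Chef's environment Ruby DSL; the `render_environment` helper and the `environments` hash are hypothetical and only generate the file body as text.

```ruby
# Hypothetical data: each environment pins only the top-level cookbooks.
environments = {
  "prod" => { "my_face" => "1.2", "your_face" => "2.0" },
  "test" => { "my_face" => "1.3", "your_face" => "2.1" }
}

# Render the body of a Chef environment file (e.g. environments/prod.rb).
# Only the emitted `name` and `cookbook` lines are Chef DSL; this helper
# is an illustrative assumption, not part of Chef or Berkshelf.
def render_environment(name, pins)
  lines = [%(name "#{name}")]
  pins.sort.each do |cookbook, version|
    lines << %(cookbook "#{cookbook}", "= #{version}")
  end
  lines.join("\n")
end

puts render_environment("prod", environments.fetch("prod"))
```

Nothing about the transitive dependencies appears in the rendered file; under this approach they would have to be locked elsewhere.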

I know there is a history of "abusing" Chef environments for locking _all_ the cookbook versions across the whole dependency graph. This is bad because 1) it's an implementation detail of the top-level cookbooks and 2) it forces / restricts you to always have a consistent set of dependencies (i.e. application / library / whatever cookbook versions) across _all_ your top-level cookbooks in that environment.

The latter might be a noble goal to strive for, but it's really not manageable as soon as you have a few top-level cookbooks within an environment. It artificially (and unnecessarily) couples their release cycles, which will eventually make you use one environment per application / top-level cookbook as you suggested, e.g. "prod-my_face", "prod-your_face", etc., but then you no longer have a single place for describing your "prod" environment...
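To make the coupling concrete, here is a rough sketch that checks whether two locked dependency sets could share one environment. The cookbook names and versions are made up for illustration, and `find_conflicts` is a hypothetical helper, not something Chef or Berkshelf provides.

```ruby
# Group the locked versions by cookbook and keep only the cookbooks that
# are pinned to more than one distinct version across applications.
def find_conflicts(locks_by_app)
  versions = Hash.new { |hash, cookbook| hash[cookbook] = {} }
  locks_by_app.each do |app, locks|
    locks.each { |cookbook, version| versions[cookbook][app] = version }
  end
  versions.select { |_cookbook, by_app| by_app.values.uniq.size > 1 }
end

# Hypothetical lock sets for the two top-level cookbooks.
locks_by_app = {
  "my_face"   => { "my_face" => "1.2.0", "nginx" => "1.8.0", "apt" => "2.0.0" },
  "your_face" => { "your_face" => "2.0.0", "nginx" => "2.0.2", "apt" => "2.0.0" }
}

find_conflicts(locks_by_app).each do |cookbook, by_app|
  wanted = by_app.map { |app, version| "#{app} wants #{version}" }.join(", ")
  puts "#{cookbook}: #{wanted}"
end
```

Any cookbook reported here would block both applications from living in the same fully locked environment until their dependencies are brought in line.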

Rather than having a "berks apply <environment>", which in fact promotes this "abuse" of environments, I would like to see something like "berks apply_metadata", which applies the Berksfile-locked dependencies to the metadata.rb of the top-level cookbook. That way we just lock the version of the top-level cookbooks in our Chef environments and don't have to clutter them with locking the whole dependency graph anymore.
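Purely as an illustration of the idea, a sketch of what such a step might produce. No `berks apply_metadata` command exists; the method name and the input hash below are assumptions, and the lockfile is not parsed here. Only the emitted `depends "name", "= x.y.z"` lines are real metadata.rb DSL.

```ruby
# Turn a locked dependency set into metadata.rb `depends` lines,
# skipping the top-level cookbook itself. Hypothetical helper: it only
# returns strings and does not touch any file on disk.
def metadata_depends_lines(locked, top_level)
  locked
    .reject { |cookbook, _version| cookbook == top_level }
    .sort
    .map { |cookbook, version| %(depends "#{cookbook}", "= #{version}") }
end

# Made-up locked versions, standing in for the contents of a Berksfile.lock.
locked = { "my_face" => "1.3.0", "nginx" => "1.8.0", "apt" => "2.0.0" }

puts metadata_depends_lines(locked, "my_face")
```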

What do you think?

Cheers, Torben


--------------------------------
Jamie Winsor     • 17 days ago
I disagree that this is a "code smell". It's unfortunate that Chef Environments are called "Environments" because Operations folks have already reified the idea of what an "Environment" is. Think of a Chef Environment as a "Container" or a "Policy Group" instead of something that you would have called "Production" or "Staging". Its sole purpose is to house a group of nodes which share a collection of cookbooks in common. The purpose of this policy group can be reflected in the name (e.g. "myface-production").

As I mentioned, I originally didn't write about the idea of having an Environment Cookbook including other Environment Cookbooks because I just found no reason to attempt to place two applications within the same Chef Environment. In a perfect world we would have an additional primitive, perhaps called a "bundle", which would contain our Lockfile plus all of our cookbook dependencies and then this would be applied to our Chef Environment. However, since we're currently limited to the primitives that Chef Server provides us with, we'll have to settle on just calling it an Environment Cookbook.


--------------------------------
Torben Knerr     • 17 days ago
Hi Jamie,

Thanks for the clarification, that sheds quite a different light on it.
I'm still in doubt that this reification of environments is a good thing, but maybe it just feels unnatural to me.

What I'm wondering then is which Chef primitives operations folks would use for representing "prod" and "staging"?

What do you use?

Cheers, Torben


--------------------------------
Jamie Winsor     • 17 days ago
You use the same primitives, just with slightly different names than the ones you might be using currently. Instead of having a Chef Environment named "production", name it "myface-production". You may have a collection of these Chef Environments that represent everything that is in your production environment.


--------------------------------
Torben Knerr     • 15 days ago
Yeah, I thought so too at first, but then you end up having to duplicate all the things that are specific to an environment but common to all nodes in that environment (e.g. the hostname of the "production" mail relay).

Or can a node be in two different environments at once, like "prod-myface" + "prod-common"??

Using a "top-level cookbook" lets you use environments for describing environments, as it uses metadata.rb for locking all the transitive dependencies.

IMHO that's the cleaner approach and I don't really see the benefit in the current environment-based approach. In fact I was discussing this with other people as well, but the best answer I could get was that "this is historically grown".

:-/


--------------------------------
Jamie Winsor     • 15 days ago
Hey Torben,

I understand your concern and there are three solutions, two of which might not have been super clear.

All of your cookbooks should have sensible defaults targeted at your most customer-facing environment. Your mail settings would probably be configured in your "Base Cookbook" which should also be a dependency of your Environment Cookbook. With this approach you would still need to ensure that you configure your mail settings properly in all development, testing, or staging environments.

Global configuration data can, and should, be stored within a data bag. This is something that you may want to provide to nodes across your entire organization, across any number (N) of environments. Your "Base Cookbook" would do the job of reading this in and configuring the node. You would also set the Base Cookbook as a dependency of your Environment Cookbook.

The last option is one that I've already described. Create an Environment Cookbook that includes your other Environment Cookbooks. This is what you are calling "a top level cookbook", but it's simply just another Environment Cookbook. It doesn't even need to have recipes, it just needs to be a version-able artifact with a lockfile.

There are ongoing discussions among Opscode members, myself, and Fletcher regarding improving or creating new Chef primitives to match these cookbook patterns. A bit of chat regarding the Environment Cookbook pattern resulted in the idea of a "Chef Bundle" which contains exactly the same things, it just has a name and a file extension to make it a "real thing". It's too early to tell what the final outcome of these discussions and dev will be, but we all agree that improving primitives to support these patterns is a good thing.


--------------------------------
Torben Knerr     • 13 days ago
Hey Jamie,

Good to hear that discussion is going on. But I think we are not on the same page yet:

The mail settings were just an example of an attribute which is specific to an environment like "prod", "test" or "staging". It's something that I want to be configurable per environment, not something that is static and can be hardcoded in the base cookbook via node.set. And yes, it might be interpreted by the base cookbook, but probably by any other application cookbook in the dependency graph as well.

Data bags are clear to me as well, and global data is the keyword. A perfect example is the users data bag from the users cookbook, which manages a global list of users / ssh keys in a data bag, and then uses an attribute to pick from this list which users should be added to specific nodes.

I also see the common pattern of putting "environment keys" in data bags, which might make sense in some cases (e.g. where encryption is needed) but not in others (e.g. if you want to use Ruby primitives, which you can't in data bags because they are just JSON).

When talking about "top-level" cookbooks, I mean top level per application to be deployed on one (or possibly across a cluster of) nodes. I believe this aligns pretty much with what you describe as an environment cookbook, since the responsibilities are the same (right?):
1) lock the cookbook versions of the whole dependency graph as per Berksfile.lock
2) expose the publicly configurable attributes via documentation

The only difference is that I propose to use a different mechanism for locking the dependency versions, namely using metadata.rb instead of environments.

So far I cannot see any disadvantage of locking the dependency versions in metadata.rb. The advantage is that it frees up environments again for defining environment-specific attributes, which you would otherwise not have a Chef primitive for (anymore).

Are we finally talking about the same responsibilities but just a different representation? :-)

Example: consider the scenario of promoting the latest version of the "my_face" and dependent cookbooks from "staging" to "prod":

When using metadata.rb for representing the Berksfile.lock'ed dependencies, there is only a single bit you have to change in the "prod" environment: the version of the my_face cookbook.

When using a my_face-specific environment file, you will have to change the versions of all the updated dependencies' cookbooks as well.

Plus: you can represent your production environment in a single place (environments/prod.rb) rather than having it split across multiple places (environments/prod-*.rb).

The latter comes with the additional caveat that you cannot share the common attributes within an environment, which ultimately violates the DRY principle.
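A small sketch of the promotion scenario, counting how many pins in the "prod" environment would need editing under each approach. All versions are made up, and `changed_pins` is a hypothetical helper that compares plain hashes rather than real environment files.

```ruby
# Return the pins in `target` that differ from (or are missing in) `current`.
def changed_pins(current, target)
  target.reject { |cookbook, version| current[cookbook] == version }
end

# Approach A: the environment pins only the top-level cookbook.
prod_a    = { "my_face" => "1.2.0" }
staging_a = { "my_face" => "1.3.0" }

# Approach B: the environment pins the whole dependency graph.
prod_b    = { "my_face" => "1.2.0", "nginx" => "1.8.0", "apt" => "2.0.0" }
staging_b = { "my_face" => "1.3.0", "nginx" => "2.0.2", "apt" => "2.1.0" }

puts changed_pins(prod_a, staging_a).size # => 1
puts changed_pins(prod_b, staging_b).size # => 3
```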


--------------------------------
Torben Knerr     • 13 days ago
EDIT: oops - reply by email messed up the ordering of replies. See below for the full answer.


--------------------------------
Jamie Winsor     • 13 days ago
Just using the metadata for dependencies would be a cumbersome task and I wouldn't recommend it. You would also need to list all of your dependencies, their dependencies, and so on in your metadata.

Some combination of an Environment Cookbook and data bags will do the trick if you're concerned about representing your "production" configuration settings in multiple places.


--------------------------------
Torben Knerr     • 12 days ago
It's exactly as cumbersome as listing all of your dependencies and their transitive dependencies in your environment. Either way you have to do it, just in a different place.
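The list in question is the same in both cases: the transitive closure of the top-level cookbook's dependencies. A minimal sketch of computing it, assuming a hand-written dependency graph (the cookbook names are illustrative, and in practice the dependency resolver works this out for you):

```ruby
require "set"

# Walk the dependency graph depth-first from `root` and return every
# cookbook reachable from it, excluding `root` itself.
def transitive_dependencies(graph, root)
  seen = Set[root]
  stack = graph.fetch(root, []).dup
  until stack.empty?
    cookbook = stack.pop
    next unless seen.add?(cookbook) # add? returns nil if already visited
    stack.concat(graph.fetch(cookbook, []))
  end
  seen.delete(root)
  seen.to_a.sort
end

# Hypothetical graph: cookbook => direct dependencies.
graph = {
  "my_face" => ["nginx", "mysql"],
  "nginx"   => ["apt", "build-essential"],
  "mysql"   => ["build-essential"]
}

p transitive_dependencies(graph, "my_face")
```

Whether this list ends up in an environment file or in metadata.rb, it has the same entries.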

The only difference is that Berkshelf explicitly supports and promotes the environment approach by providing `berks apply <environment>` but not `berks apply_metadata`, which is totally understandable because you are not using the metadata approach.

On the other hand -- would you be open to accepting a PR on this?

I don't have much time lately (sorry for the long delays between posts btw) but this might eventually change a bit in the next months, and if so the time should be well invested :-)


--------------------------------
Jamie Winsor     • 12 days ago
No, a PR that manipulates the metadata of a cookbook would not be accepted. Berkshelf shouldn't modify your cookbooks in any way (aside from initial generation).

I am confident that the environment + Lockfile approach is the best way to go about this in the current version of Chef. We'll need to look to the future as new Chef primitives are added for a cleaner approach. For now I think some combination of an Environment Cookbook and Data Bags is the right approach to drying up your production configurations.
