@datagrok
Last active November 3, 2023 17:37
"Vendoring" is a vile anti-pattern

What is "vendoring"?

From a comment on StackOverflow:

Vendoring is the moving of all 3rd party items such as plugins, gems and even rails into the /vendor directory. This is one method for ensuring that all files are deployed to the production server the same as the dev environment.

The activity described above, on its own, is fine. It merely describes the deployment location for various resources in an application.

However, many programmers have begun to commit the result of the above procedure to their own source code repositories.

That is to say, they copy all of the files at some version from one version control repository and paste them into a different version control repository.

You should have flinched reading the previous description. This practice is more vile than extending existing code by duplicating whole functions instead of using inheritance and abstraction, and for the same reasons.

Why is "vendoring" bad?

Extracting code from a version control repository to be archived in a compressed file or stored in a particular directory for deployment is not bad.

Extracting code from a version control repository to be re-committed to a different version control repository is evil.

When you copy code between repositories:

  1. all history, branch, and tag information is lost
  2. pulling upstream updates through the version control system is no longer possible; every update is another manual copy
  3. it invites modification, divergence, and unintentional forks
  4. it wastes space
  5. it is excruciatingly tedious to discover later which upstream version the copy was taken from, unless the person doing it went out of their way to document that information

What you should do instead

Use git submodules

Git's submodule mechanism stores a URL and a commit hash in your repository, and that record is itself under version control.
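To make the "URL plus commit hash" idea concrete, here is a minimal Python sketch that reads both halves of that record. It assumes the usual layout: the URL lives in the `.gitmodules` file, and the pinned commit is a "gitlink" tree entry (mode 160000) as printed by `git ls-tree`. The sample text, the `vendor/libfoo` path, the example.com URL, and the hash are all made up for illustration, and the helper names are mine, not part of git.

```python
import configparser

# Hypothetical sample data: the path, URL and hash below are invented.
SAMPLE_GITMODULES = (
    '[submodule "vendor/libfoo"]\n'
    "\tpath = vendor/libfoo\n"
    "\turl = https://example.com/libfoo.git\n"
)
SAMPLE_LS_TREE = (
    "100644 blob 89abcdef0123456789abcdef0123456789abcdef\tREADME.md\n"
    "160000 commit 0123456789abcdef0123456789abcdef01234567\tvendor/libfoo\n"
)


def parse_gitmodules(text):
    """Map each submodule path to its URL, given the text of a .gitmodules file."""
    parser = configparser.ConfigParser(interpolation=None)
    # .gitmodules is usually tab-indented; strip the indentation so that
    # configparser never treats a key as a continuation of the previous value.
    parser.read_string("\n".join(line.strip() for line in text.splitlines()))
    return {
        parser[section]["path"]: parser[section]["url"]
        for section in parser.sections()
        if section.startswith("submodule ")
    }


def parse_gitlinks(ls_tree_output):
    """Map each submodule path to its pinned commit, given `git ls-tree` output."""
    pins = {}
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        # Each line is "<mode> <type> <object>\t<path>".
        meta, path = line.split("\t", 1)
        mode, obj_type, sha = meta.split()
        # Mode 160000 marks a gitlink: a pointer to a commit in another repository.
        if mode == "160000" and obj_type == "commit":
            pins[path] = sha
    return pins


def describe_submodules(gitmodules_text, ls_tree_output):
    """Join the two halves of the record into (path, url, commit) tuples."""
    urls = parse_gitmodules(gitmodules_text)
    pins = parse_gitlinks(ls_tree_output)
    return [(path, urls[path], pins.get(path)) for path in sorted(urls)]
```

Calling `describe_submodules(SAMPLE_GITMODULES, SAMPLE_LS_TREE)` gives one tuple per submodule. The point is that nothing from the third-party project is copied into your history; only these two small facts are.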

(TODO: explain more, examples of work-alikes from different VCSs)

Use an approximation of git submodules

If you can't use git submodules, use a script that deploys your third-party resources at the appropriate time, from a canonical source.
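As a rough illustration of such a script, here is a Python sketch that keeps a small manifest of pinned dependencies under version control and fetches each one from its canonical repository at deploy time. Everything specific in it is an assumption for the example: the `PINNED` manifest format, the `vendor` directory name, the example.com URL, and the commit hash are hypothetical. The git invocations themselves (`clone --no-checkout`, `-C`, `fetch`, `checkout --detach`) are ordinary git commands.

```python
import os
import subprocess

# Hypothetical manifest: commit this file, not the third-party code itself.
PINNED = [
    {
        "name": "libfoo",
        "url": "https://example.com/libfoo.git",
        "commit": "0123456789abcdef0123456789abcdef01234567",
    },
]


def plan_commands(dep, vendor_dir="vendor", already_cloned=False):
    """Return the git commands (as argv lists) that put `dep` at its pinned commit."""
    dest = f"{vendor_dir}/{dep['name']}"
    if already_cloned:
        # Reuse the existing clone and just pick up new upstream objects.
        first = ["git", "-C", dest, "fetch", "origin"]
    else:
        first = ["git", "clone", "--no-checkout", dep["url"], dest]
    # Check out the exact pinned commit rather than whatever a branch points to.
    return [first, ["git", "-C", dest, "checkout", "--detach", dep["commit"]]]


def deploy(deps, vendor_dir="vendor", run=subprocess.run):
    """Fetch every dependency into vendor_dir, stopping at the first failing command."""
    for dep in deps:
        dest = f"{vendor_dir}/{dep['name']}"
        for argv in plan_commands(dep, vendor_dir, already_cloned=os.path.exists(dest)):
            run(argv, check=True)
```

Running `deploy(PINNED)` as a build or deploy step populates `vendor/`, which you would then list in your ignore file so the fetched code never lands in your own repository. The `run` parameter exists only so the command sequence can be inspected without touching the network.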

(TODO: describe more, provide examples)

None of those suggestions work for me

If you have a situation where it seems like "vendoring" is really the best way to deploy your code, contact me, call me an idiot, and describe why. I'm sure there's a better way, and where there's not, it's a bug. I hope to eventually document all such situations to prevent people from falling into this bad habit.

Guilty parties

@kingdonb

I want to thank you for writing this. I wrote a ham-handed proposal for vendoring in our current framework to help enforce a "build-release-run" separation, like you get on 12-factor platforms, and I knew it was a bad idea to cobble this onto our existing legacy bespoke deployment infrastructure, but I couldn't quite bring myself to spend any time articulating why (so I almost didn't even share the idea).

But then I plugged into the Google and came here, to see all of the reasons I subconsciously already knew that my idea was bad, but neatly articulated, and so, I commend you: you saved me the time of debunking my own straw-man. Vendoring your gems is not bad. Vendoring your gems in the project repo is definitely bad. That vendor/cache should be in gitignore. What I was proposing was meant to be an example of "how we could improve things slightly, without code changes to our deployment systems, simply by asking developers to change their habits a little bit – but WAIT! Don't actually do this, it's a totally half-baked idea and there are many ways it can go wrong, the quick wins and illusion of forward progress are a tempting oasis, and it isn't real."

We're hoping to adopt a Platform of some kind off-the-shelf and I was intending that people would see, while yes, we could address each problem with our current system, one by one, many have already done this and we should learn from their experience before we repeat those mistakes. We could boil the ocean, or we could pick something off-the-shelf like we planned to do in our project charter.

@crd

crd commented Aug 26, 2019

I can't blame @capnslipp too much for the combative attitude and casting aspersions on me, since I set the mood with that antagonistic title and strongly-worded assertions. I regret the tone I used. This whole thing probably could have been: "don't do vendoring by-hand; use a dependency-tracking tool instead, and don't touch the vendor folder."

Just here to say that I appreciated how well @datagrok handled the criticism -- nice to see.

@LambertGreen

Thanks for not deleting this. I am investigating options in this space, and it is all new to me, and so it was great to find this post and the discussions.

@sparr

sparr commented May 31, 2023

History can be preserved in a few different ways, most of which are given as answers here https://stackoverflow.com/questions/1365541/how-to-move-some-files-from-one-git-repo-to-another-not-a-clone-preserving-hi

@JamesTheAwesomeDude

I'd love to use git submodule for this, except none of the major git server software actually mirrors the pinned commit. If the linked remote goes down or erases the commit I depended on for whatever reason, that'll break clone --recurse on my own repo, which is very cringe.
