I'm doing some research on how companies use GitHub Enterprise (or public GitHub) internally. If you can help out by answering a few questions, I'd greatly appreciate it.
- What is the primary setup? Is there an organization and each official repo is owned by that organization?
- Does every engineer have a fork of each repo they're working on?
- Are engineers allowed to push directly to the official repo? Or must all commits go through a pull request?
- Do engineers work on feature branches on the main repo or on their own forks?
- Do you require engineers to squash commits and rebase before merging?
- Overall, what is the workflow for getting a new commit into the main repository?
- What sort of hooks do you make use of?
- Are there any ops issues you encountered? (Scaling, unforeseen downtime, etc.)
- Anything else worth noting?
Thanks very much for your feedback. I plan on compiling all the information into a blog post so we can all benefit from understanding these workflows.
So for the longest time Yammer was using public github with an org for private repositories. We recently set up GHE internally and we're in the process of migrating many repos there. There are probably several reasons, but a big one is that it became difficult to manage some things that needed to be secret. We needed to be more careful about what things ended up on servers that we don't own.
We have repositories all over the place. All production repos are under the yammer org, either on github.com or GHE. We also have internal projects, personal projects on people's private accounts, and a "third party" account where we maintain forks of public open source stuff. I don't think it would buy us much to have a super strict system. It all boils down to a url and the right access.
Engineers can fork if they want. But historically we've been fine having a single repo and having project teams use feature branches. This has been fine even as we've scaled up to our current size. The only real downside is a proliferation of old feature branches. But that's sometimes good for historical context. For other reasons, engineers at yammer are exploring using a pull request model right now. More on that below.
Yes, all engineers are empowered and encouraged to push to master. We trust engineers and we value engineering velocity. It doesn't mean people's code doesn't get checked. Communication within project teams is very high. But it's an informal process rather than a thing that limits velocity.
There is a little more of a barrier if you are working in a codebase outside of your usual area (though still informal). For instance, as a front-end engineer, I may find myself working on one of our back-end JVM services. I'd know better than to ship something to master there because I'm not as familiar with things. So I'd make sure my feature branch got reviewed by an engineer from our core services team. Once that person gave the thumbs up, I could merge that branch into their master.
One big topic right now is introducing a better code review workflow at yammer. As I said, right now this happens in an ad hoc way, and some people want to streamline how things are done to make it faster and more convenient. We have people exploring the pull request workflow. But we still have an aversion to putting in limiting gates. So we would be resistant to a process where you couldn't ship unless someone accepted your pull request. Instead a PR is just a convenient way of packaging up a related set of commits so you can have someone else review them. Some people might instead use squashed commits or just point reviewers at a feature branch on github. The actual process is less important than being able to use the nice github interface for commenting on and discussing code.
The biggest annoyance with this is that yammer folks like all discussions to be in yammer. But now we have interesting conversations that happen in github because the context is better. I'm hoping some kind of integration makes that story better soon. Yammer should be in everything ;)
I've officially bowed out of the "what's the best git workflow" debate. But I'm fine talking through pros and cons of various approaches. I've tried several at this point. You should pick the one that suits your team the best or supports your engineering goals the best. Or just let people sort it out on their own.
If you're on support, you probably own the changes and so you just merge/squash into master. If you're on a project team, then you coordinate with the team for when those changes should ship. You may have a long running feature branch. You may have a separate project branch where engineers on the project team merge and integrate their various changes. Once the project changes are tested and QAed locally as much as needed, then merge/squash to master. Master is treated as "stable and shippable". Once you push to master, you should expect that those changes could be in production without you knowing it.
The important thing here is that engineers shepherd their changes through the system by making sure all interested parties stay informed. Product managers and QA know when things are shipping in the next release. Other teams get a heads up when major changes are pushed into a codebase. We try to substitute good communication for strict process here. It's worked pretty well so far.
Not sure. The front-end team doesn't use any right now. But we're considering pre-commit hooks for linting.
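For anyone curious what a linting pre-commit hook could look like, here is a minimal sketch. It is not something we run: the linter command (eslint), the file extensions, and the function names are all assumptions made for illustration. It asks Git for the staged files, keeps the ones worth linting, and hands them to the linter.

```python
#!/usr/bin/env python3
"""Sketch of a Git pre-commit hook that lints staged files."""
import subprocess

# Hypothetical choices: which extensions to check and which linter to call.
LINT_EXTENSIONS = (".js",)
LINT_COMMAND = ["eslint"]


def staged_files():
    """Return paths that are added, copied, or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def select_lintable(paths, extensions=LINT_EXTENSIONS):
    """Keep only the paths whose extension the linter should check."""
    return [p for p in paths if p.endswith(tuple(extensions))]


def main():
    """Lint the staged files and return the exit status for the hook."""
    targets = select_lintable(staged_files())
    if not targets:
        return 0  # nothing to lint, so allow the commit
    # Git aborts the commit when a pre-commit hook exits non-zero.
    return subprocess.run(LINT_COMMAND + targets).returncode
```

To try it, you would save this as .git/hooks/pre-commit, make it executable, and end the file with `raise SystemExit(main())`. Anyone in a hurry can still skip the hook with `git commit --no-verify`, which fits with not wanting hard gates.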
With github? I'm sure there are stories here that I'm not familiar with. I can try to get info from others at Yammer. But I know it pays to decouple deployments from the network. Pull down local branches before deploy time. Make tarballs that get distributed by scripts. Relying on the network (even your own internal one) for a smooth deploy will probably bite you at some point. It's a tradeoff between convenience and reliability.
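To make the tarball idea concrete, here is a rough sketch, assuming the branch has already been pulled down to a local directory ahead of deploy time. The function name and the checksum step are my own illustration, not a description of Yammer's actual deploy scripts.

```python
import hashlib
import tarfile
from pathlib import Path


def build_release_tarball(source_dir, output_path):
    """Pack source_dir into a gzip tarball and return the tarball's SHA-256.

    output_path should live outside source_dir so the archive
    does not try to include itself.
    """
    source = Path(source_dir)
    with tarfile.open(output_path, "w:gz") as archive:
        for path in sorted(source.rglob("*")):
            relative = path.relative_to(source)
            # Skip Git metadata; only the checked-out files get shipped.
            if ".git" in relative.parts:
                continue
            archive.add(path, arcname=relative.as_posix(), recursive=False)
    # A distribution script can compare this digest after copying the file.
    return hashlib.sha256(Path(output_path).read_bytes()).hexdigest()
```

The point of the design is that the deploy step itself only touches a local file. Nothing has to reach github.com or GHE while the release is going out.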
The main thing I want to leave you with here is to think hard about what putting strict processes in place is supposed to buy you. Only use them when they pay the bills. Otherwise just let people use their judgment. And if people have bad judgment, that's not a problem that can be fixed with process.
Here's an example. For most of my tenure on the front-end team at yammer, we used the git-flow model. It worked well for us as a team because we had lots of people making changes and we wanted low friction to pushing those out. But we also had a weekly deploy process that meant we couldn't just ship to master whenever we wanted. So git flow's split between develop and master branches fit nicely. No other team at yammer used the git flow model. We got teased a bit, and we had to educate people on how to interact with our codebase. But it worked for us.
But today yammer is continuing to move towards a continuous deployment model. We do releases more often and we've refactored our front-end codebase structure so that it's more manageable. After taking stock of all of these things, we realized that the git flow model might be holding us back at this point. We want to ship to master more quickly so changes can be picked up by the very latest release. And managing the gap between develop and master was becoming a pain rather than being helpful. Git flow had gone from a useful process that helped us maintain speed despite a slower release process, to a hindrance that kept us behind as our release process caught up. The team talked about it and agreed. So we ditched it. We now have release branches and master. And everybody's happy.
I hope all of this is helpful. Having an excuse to write this all down was cool. I can provide more details if you're confused or curious about anything. I'm sure there are gaps here that we could be doing much better. But this stuff is on a continuous improvement curve at yammer.