@StevenACoffman
Last active July 17, 2023 03:23
PhabricatorVsGithub.md

What's the diff between phabricator and github?

Phabricator’s and github’s usage of git is very different. With github you are always pushing new commits, but with phabricator you should never push (except when updating a deploy branch from master). You are better off pretending that we don't actually use git, and that the Khan Academy recommended Phabricator workflow is a series of inscrutable magic incantations that must be meticulously performed or you'll release Ẕ̶̨̫̹̌͊͌͑͊̕͢͟a̡̜̦̝͓͇͗̉̆̂͋̏͗̍ͅl̡̛̝͍̅͆̎̊̇̕͜͢ģ̧̧͍͓̜̲͖̹̂͋̆̃̑͗̋͌̊̏ͅǫ̷̧͓̣͚̞̣̋̂̑̊̂̀̿̀̚͟͠ͅ.

If you are still irrationally stubborn, like I am, and really want to continue using the familiar and universal-outside-khan git workflow, then since you will be fighting the Khan tools you need to be firm and unambiguous in communicating to phabricator what you want it to do.

My workflow is always:

  1. Make a branch.
  2. Make some changes
  3. Commit
  4. Fetch and Rebase against origin master|districts|progreports (this is team specific)
  5. Git push your branch from your local topic branch to an origin remote topic branch
  6. This crazy thing:
    arc diff --trace --base 'arc:verbose, git:branch-unique(origin/master), arc:prompt' "$(git merge-base origin/master HEAD)"
    
  7. Sing "My Way"

Pushing any subsequent commits to any remote branch automatically closes your Phabricator review. You can reopen it in the UI and then revise it (use your own Phabricator revision ID in place of D1234 below):

arc diff --trace --base 'arc:verbose, git:branch-unique(origin/master), arc:prompt' "$(git merge-base origin/master HEAD)" --update D1234
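Since the base branch is team specific, the two commands above differ only in the upstream name and the optional --update flag. Here is a minimal sketch of a helper that prints the matching command for a given upstream. The name arc_diff_command is hypothetical (it is not part of arc or the Khan tools), and it only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: print the arc diff command from step 6 for a given
# upstream branch, optionally updating an existing revision.
arc_diff_command() {
    upstream="$1"      # e.g. origin/master or origin/districts
    revision="${2:-}"  # optional, e.g. D1234
    cmd="arc diff --trace --base 'arc:verbose, git:branch-unique($upstream), arc:prompt' \"\$(git merge-base $upstream HEAD)\""
    if [ -n "$revision" ]; then
        cmd="$cmd --update $revision"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for a first submission, then for an update.
arc_diff_command origin/master
arc_diff_command origin/districts D1234
```

Printing instead of executing makes it easy to eyeball the command before pasting it into a shell.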

Pretty much no one in the organization besides me uses this workflow. I prefer the adjective "innovative" rather than "outré" as it avoids having to type the acute accent.

Comparison Table

| Phabricator | GitHub/GitFlow/Git Branches |
| --- | --- |
| Phabricator is based on diffs/patches, not tied to the git history/git tree. This is not inherently better or worse. Or maybe it is? I make no value judgement here. | GitHub/GitFlow is based on branches and commits, intrinsically tied to git history/git tree. This is not inherently better or worse. Or maybe it is? I make no value judgement here. |
| You can munge around in your diffs, smash up history, rebase easily, reorder diffs, and pluck out and land top-of-stack diffs. Generally, stacked diffs are a little simpler to manage here, but when things go wrong, they can go very wrong! | You have to be a lot more careful about preserving git history. Generally, stacked diffs can be a bit more annoying, but it’s a lot harder to really break things. You can still do all the things that you can do in Phab, though often it’s required to change history. |
| Phabricator is really hard to learn for those coming from the GitHub/GitFlow mindset. It takes a very long time (months, maybe years) to go from “Familiar” to “Proficient”, when also learning the rest of the code base. And, for many, it often isn’t worth the ROI to get to “Mastered”. | GitHub/GitFlow is hard to learn for those coming from the Phabricator mindset. It takes a while, but not nearly as long (a week to a month or two), to get from “Familiar” to “Proficient”. “Mastery” is much more attainable. |
| When learning Phabricator, we have lots of really great internal support and documentation to help you. But there are few resources and little support in the wider community. | When learning GitHub/GitFlow, we have little internal support and documentation to help you. But there are ample resources and large bodies of support in the wider community. |
| Our setup is very custom to us. Skills here aren’t very transferable to future organizations, and virtually every new employee will need a lot of time to ramp up. | This is the industry standard. Most new employees are already proficient in this type of workflow. Learning it while here will be valuable if and when you leave Khan Academy. |
| When you get stuck, it often costs several hours to whole days to get unstuck. Not very many in-house experts; fewer outside of KA. | When you get stuck, problems are usually very easy to solve through Google or the many available resources. Many in-house experts; many industry experts as well. |
| Higher cognitive load around understanding what’s going on. | Lower cognitive load around understanding what’s going on. |
| “When learning Phabricator, I totally had imposter syndrome!” | “When learning GitHub, branching, and GitFlow, I totally had imposter syndrome!” |

Cheatsheet

| Instead of this common command | Phabricator/webapp command |
| --- | --- |
| git pull | git p |
| git merge | git m |
| git checkout | git co |
| N/A | git gsu |
| N/A | git db, !git deploy-branch |
| N/A | git rb, !git review-branch |

What the heck?

GitHub: Pull requests (commits and branches)

Phabricator: Revisions (everything is a diff)

WHY DOES THAT MATTER? For basic usage, it doesn't

Instead of pushing a branch and opening a pull request, you simply run arc diff

Hello Arc

$ git checkout -b add-toaster-oven
$ emacs
$ git commit
$ arc diff

ADDRESSING REVIEW FEEDBACK

  • Make your changes
  • Create a new commit on top of your previously reviewed change
  • Run arc diff again
  • Note the lack of force-pushing

WHEN YOU GET LOST

  • When a rebase leaves you in a strange state
  • When you're working on multiple feature branches
  • Use arc which to get a sense of your surroundings
$ arc which

UPDATING AN EXISTING REVISION

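If your commit message loses its "Differential Revision" link, or your tree changes so that arc no longer recognizes the revision, name the revision explicitly with --update and its revision ID: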

$ arc diff --update D123

CHANGING WHAT'S IN THE DIFF

By default, arc diff looks at all local changes between your upstream branch (e.g. origin/master) and HEAD.

Roughly equivalent to:

$ git diff $(git merge-base origin/master HEAD)..HEAD

This confuses some people
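One way to see what that range contains is to try it in a throwaway repository. This sketch assumes only plain git. The branch name trunk stands in for origin/master and topic is a made-up feature branch, so neither name comes from the real setup:

```shell
#!/bin/sh
set -e
# Throwaway repository; "trunk" stands in for origin/master (an assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "shared base"

# Two commits on a topic branch.
git checkout -q -b topic
git commit -q --allow-empty -m "topic change 1"
git commit -q --allow-empty -m "topic change 2"

# Meanwhile the upstream moves on.
git checkout -q trunk
git commit -q --allow-empty -m "upstream moved on"
git checkout -q topic

# The default range: everything on this branch since it forked from trunk.
base=$(git merge-base trunk HEAD)
range_subjects=$(git log --format=%s "$base..HEAD")
printf '%s\n' "$range_subjects"
```

Only the two topic commits are listed. The commit that landed on trunk after the branch point is not part of the range.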

To override, specify a base reference (branch name, SHA, symbolic ref):

$ arc diff <base-ref>

We set a global default, overriding upstream behavior, like this:

$ arc set-config base "git:merge-base(origin/master), arc:prompt"

COMPLEX SCENARIO

If I'm working off a co-worker's topic branch feature-a, I can send just my changes up for review:

$ git fetch && git checkout feature-a
$ git checkout -b feature-b
$ emacs
$ git commit
$ arc diff feature-a
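To see why the base matters here, this sketch counts the commits each choice of base would pull into the diff. It uses plain git in a throwaway repository, with a local branch named trunk standing in for origin/master (an assumption for illustration only):

```shell
#!/bin/sh
set -e
# Throwaway repository; "trunk" stands in for origin/master (an assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "shared base"

# The co-worker's branch, with one commit of theirs.
git checkout -q -b feature-a
git commit -q --allow-empty -m "co-worker change"

# My branch on top of it, with one commit of mine.
git checkout -q -b feature-b
git commit -q --allow-empty -m "my change"

# Number of commits a diff would cover with each choice of base.
against_trunk=$(git rev-list --count "$(git merge-base trunk HEAD)..HEAD")
against_feature_a=$(git rev-list --count "$(git merge-base feature-a HEAD)..HEAD")
echo "base trunk:     $against_trunk commits"
echo "base feature-a: $against_feature_a commits"
```

With trunk as the base the range includes the co-worker's commit as well as mine. With feature-a as the base it includes only mine.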

YOU GET TWO KNOBS

Only two things you usually want to tweak

  • The base, which specifies where your diff begins
  • Your differential revision (e.g. D456), which can be manually specified with
$ arc diff --update D456

IN REVIEW

  • Creates or updates a diff between HEAD and your upstream branch
  • If you're going to push your branches before code review, I highly recommend overriding your default base
  • If it recognizes an existing revision, it updates that revision
  • Otherwise it creates a new one
  • Behavior can be overridden by using --update D123 or specifying a base-ref

TURN DOWN THE NOISE

  • By default, you get an email from Phabricator for every action you're a part of, including actions you trigger.

  • To turn this off, go to your email preferences page

Ok, Ok, there are some small knobs

  • --nolint Skips linting
  • --nounit Skips unit tests
  • --trace More verbose
  • --update D62986 Picks which revision to update
  • --base 'git:branch-unique(origin/districts)' Picks which branch to use as the base for diffing

Wait, what was that last thing?

Diff against origin/master if it exists, and prompt if it doesn't:

git:merge-base(origin/master), arc:prompt

You can also debug the rulesets by using:

arc:verbose
  • By default, arc diff will use the range $(git merge-base origin/master HEAD)..HEAD. That's a fancy way of saying "all the commits on the current branch that you haven't pushed yet". So, to create a revision in Git, run:
$ nano source_code.c  # Make changes.
$ git commit -a       # Commit changes.
$ arc diff            # Creates a new revision out of ALL unpushed commits on
                      # this branch.

The git commit step is optional. If there are uncommitted changes in the working copy then Arcanist will ask you to create a commit from them.

Since it uses all the commits on the branch, you can make several commits before sending your changes for review if you prefer.

You can specify a different commit range instead by running:

$ arc diff <commit>
This means to use the range:

$(git merge-base <commit> HEAD)..HEAD

However, this is a relatively advanced feature. The default is usually correct if you aren't creating branches-on-branches, juggling remotes, etc.

Warning: not always appropriate!

# assume you are on a branch like DIST-1817
git fetch origin districts
git pull --rebase origin districts -s recursive -X theirs
arc diff --trace --base 'arc:verbose, git:branch-unique(origin/districts), arc:prompt' "$(git merge-base origin/master HEAD)"
# Make some local changes
arc diff --trace --base 'arc:verbose, git:branch-unique(origin/districts), arc:prompt' "$(git merge-base origin/master HEAD)" --update D123

git branch --set-upstream-to districts
arc land --onto districts

Pushing the limits

git push will close any open phabricator review containing that diff. You then need to reopen it in the ui. Hooray.

Taking over someone else's phabricator revision

export DEPLOY_BRANCH=steve-JIRA-1111
git co D64718 # this calls revisionid-to-diffid.sh in webapp only
# You are now in a detached head state
git switch -c "${DEPLOY_BRANCH}"
git fetch origin master
git pull --rebase origin master -s recursive -X theirs
git push origin "${DEPLOY_BRANCH}" -u
git co master
git p
git co "${DEPLOY_BRANCH}"
git diff master..."${DEPLOY_BRANCH}" #check if changes are as expected

This page describes a Git workflow that Alan G Pierce preferred over the typical workflow. For an explanation of the default Phabricator/Git workflow, see the other documentation.

Motivation

By default, arc diff and arc land perform a sequence of steps that assume a particular Git workflow. These have a few problems:

  • Because the default behavior is a bit "magical" and depends heavily on git state, sometimes direct git operations will cause arc to get confused.
  • The default workflow doesn't make it very easy to maintain a chain of code reviews. In order to have n commits out for review, you need an n-level deep branch hierarchy, which isn't fun to maintain.

My workflow uses more explicit steps so that you can use normal git operations to maintain a stack of commits.

Differences from the default Phabricator workflow

  • Instead of a code review corresponding to a range of commits, a code review always corresponds to exactly one commit. This means that updating a commit requires using git commit --amend or equivalent.
  • There are no requirements for the branch structure, so it should be fairly easy to adapt this workflow to any alternative workflow that you prefer.

Submitting a diff

First, write your code and commit it. To submit it for code review, run this command to submit your HEAD commit as a code review:

arc diff HEAD~1
Details

The argument to arc diff is the "base" of your code review. The default behavior is to use your upstream as the base, so it takes all commits between your upstream and your current HEAD and creates a code review out of them. HEAD~1 refers to the commit before the current commit, so arc diff HEAD~1 is a way to say "take the current commit and submit it as a code review". You can also write HEAD~ or HEAD^, both of which are equivalent to HEAD~1. Normally you'd do this on a named branch, but nothing stops you from doing this on a detached HEAD or in the middle of an interactive rebase.

Updating a diff

To update the code for a diff (e.g. responding to code review feedback), just modify it and re-submit it in the same way. For example:

[make changes]
git add [files]
git commit --amend
arc diff HEAD~1
Details

arc diff looks for a "Differential Revision" line in your commit message to figure out what code review it corresponds to and uses that one. As long as you keep that line in the commit message the same, arc diff HEAD~1 will update the review. Similarly, if you want to start a review over, you can delete the "Differential Revision" and you'll get a new review. Again, there's nothing stopping you from doing this in the middle of an interactive rebase or in any other branch situation. To update any commit metadata (e.g. title, commit message, test plan, reviewers, or subscribers), you need to update the review in Phabricator. The arc amend step below will eventually pull the changes into the commit message.
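Before re-submitting, it can help to confirm that the amended commit still carries that line. A minimal sketch using plain git in a throwaway repository follows. The revision URL is hypothetical, and the commit message is written by hand here where arc would normally add the line for you:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# A commit whose message carries a (hypothetical) review link.
echo "one" > file.txt
git add file.txt
git commit -q -m "Add file" \
    -m "Differential Revision: https://phabricator.example.com/D1234"

# Amend the commit the way you would when addressing feedback.
echo "two" >> file.txt
git add file.txt
git commit -q --amend --no-edit

# The line survives the amend because the message was left untouched.
revision_line=$(git log -1 --format=%B | grep '^Differential Revision:')
printf '%s\n' "$revision_line"
```

If the grep finds nothing, the link has been lost and arc diff HEAD~1 would be expected to start a new review rather than update the old one.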

Landing a diff

Instead of arc land, you take a more explicit sequence of steps that's equivalent.

Details

arc land takes several steps that we need to replace:

  1. Pull in changes from the Phabricator review to the commit message. arc amend is the way to do only this step.
  2. Squash all commits in the review into a single commit. This isn't necessary for us because we only ever have one commit per review.
  3. Merge from your upstream branch. This is replaced with an explicit rebase.
  4. Push your work to the upstream branch.

When you're ready to land a commit, you should update to the latest code and deal with any potential merge conflicts. Here's how to do it in my rebase-based workflow:

git fetch
git rebase origin/master

You then need to tell arc to update your commit message from the Phabricator review:

arc amend
Details

Running arc amend isn't critical, but it's good manners since it puts a "Reviewed by" line in your commit message, pulls in the commit message from Phabricator, and double-checks that your commit is actually reviewed. I usually run arc browse && arc amend to open the Phabricator review in the browser so that I can double-check that I addressed everything. As usual, this operates on your current (HEAD) commit and works regardless of your branch state. The next step is to push the commit, but the details depend on the repository you're pushing to.

Landing in non-webapp repositories

This line pushes your current position (and everything between your position and origin/master) to the master branch in GitHub:

git push origin HEAD:master
Details

origin is the remote that you're pushing to. HEAD refers to your current position. master refers to the fact that you're updating the remote master branch.

Landing in webapp

To instruct the deploy system to deploy your work, you need to create some branch in the GitHub repo with your work, then instruct the deploy system to deploy that branch. First, as soon as it's my turn to deploy, I rebase again onto origin/master so that the deploy system has nothing to merge. Then, I push a new branch using a line like this:

git push origin HEAD:refs/heads/alan-deploy-aug6
Details

As before, origin means that I'm pushing to the GitHub repo, and HEAD means that my current branch position is what I'm pushing. Since the remote branch generally doesn't exist yet, I can't just say git push origin HEAD:alan-deploy-aug6, since Git doesn't know whether alan-deploy-aug6 refers to a branch or something else (e.g. a tag). The incantation refs/heads/alan-deploy-aug6 means "the branch alan-deploy-aug6". I create a new branch each time instead of re-using the same branch name mostly as a matter of discipline. I'm not conceptually updating anything in the GitHub repo; I'm just adding new commits that I want to be able to point to. I eventually delete old branches. Then I say something like "sun, deploy alan-deploy-aug6" and the deploy begins.
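The effect of the full refs/heads/ form can be tried against a local bare repository standing in for GitHub. This is a sketch under that assumption, and the branch name demo-deploy is made up for illustration:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A local bare repository stands in for the GitHub remote (an assumption).
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "work to deploy"

# Create a new remote branch from the current position, naming it in full.
git push -q origin HEAD:refs/heads/demo-deploy

# The branch now exists on the remote, but no local branch has that name.
remote_branches=$(git ls-remote --heads origin)
local_match=$(git branch --list demo-deploy)
printf '%s\n' "$remote_branches"
```

Nothing local is renamed or created by the push, which fits the idea of adding commits to the remote that you can point to later.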

Managing multiple commits

This is where the workflow starts to become really useful. This workflow makes it easy to break commits down into smaller pieces without much additional friction, and there have been several times where I have had branches with more than 10 dependent commits in various stages of code review at once. To maintain a sequence of commits that are all out for code review, you can simply create multiple commits on top of each other within the work of a single branch. You can then use all of the features of interactive rebase: you can add new commits anywhere in the stack, you can reorder commits, you can drop commits, you can combine commits, and you can update intermediate commits (e.g. to address review comments). As an example, here's how to update a commit in response to a review comment:

git rebase -i origin/master
[go to the commit you want to jump to and change "pick" to "edit"]
[it moves to that commit]
[make changes to your code]
git commit -a --amend
arc diff HEAD~1
git rebase --continue

A few notes when using this:

  • When landing a stack of commits, you need to use interactive rebase to run arc amend on each one individually. However, git push always operates on the range of commits up to your current HEAD.
  • Similarly, there's no way to do a bulk update of all of your commits in code review aside from doing interactive rebase and running arc diff HEAD~1 at each point.
  • Because git commits are immutable, you must use interactive rebase (or something equivalent) to edit earlier commits, since you need to update the entire stack of commits. For example, it's incorrect to use git checkout to move to an earlier commit and to try to just run git commit --amend to update it; when you move back to your main branch, you will lose that work you just did.
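The edit-in-the-middle flow above can be rehearsed without arc in a throwaway repository. In this sketch the branch base stands in for origin/master, and GIT_SEQUENCE_EDITOR is used to flip the first "pick" to "edit" so that no editor opens. Both are conveniences for the demo and not part of the workflow itself:

```shell
#!/bin/sh
# Throwaway repository; the branch "base" stands in for origin/master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "base"
git branch base
git checkout -q -b stack

# A stack of two commits, as if both were out for review.
echo "first draft" > a.txt
git add a.txt
git commit -q -m "commit A"
echo "unrelated" > b.txt
git add b.txt
git commit -q -m "commit B"

# Mark the first commit in the todo list as "edit" without opening an editor.
GIT_SEQUENCE_EDITOR='sed -i.bak "1s/^pick/edit/"' git rebase -q -i base

# The rebase is now stopped at commit A: change it and amend.
echo "addressed review comment" >> a.txt
git commit -q -a --amend --no-edit
# (this is the point where arc diff HEAD~1 would be run)

# Replay commit B on top of the amended commit A.
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

stack_subjects=$(git log --format=%s base..HEAD)
printf '%s\n' "$stack_subjects"
```

After the rebase finishes, the stack still has both commits, and commit B sits on top of the amended commit A instead of the original.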

More opinions from Alan

These are significantly more controversial, but I have some opinions on how to use and think about Git that somewhat motivate this workflow. You may find some of the arguments compelling as well.

  • I prefer rebase and avoid merge in most situations. I find linear history significantly easier to read and understand, and when I deploy work, I prefer commits to tell a relatively polished story of how the code steps through a sequence of intermediate good states, rather than trying to record the intermediate mistakes that I made along the way.
  • In most (but not all) situations, I do all of my work on a single named branch, even if I have commits that are unrelated to each other. Having a single branch makes it easier to keep up to date with origin/master and to keep track of everything I have out for code review, and it makes it so I'm implicitly testing all of my work-in-progress changes whenever I do any testing. If I want to land some commits before others, I can use interactive rebase to move those commits earlier in the ordering and just push those.
  • I don't use upstreams. I have been annoyed by misconfigured upstreams significantly more than I have benefitted from properly-configured upstreams. I am almost never in a situation where I can't remember the name of the branch to rebase over or the branch to push to, so I view them as a minor convenience.
  • I generally don't have local and remote branches with the same name. Rather than using long-lived remote branches, I prefer a continuous integration style where any intermediate step could land on the master branch, and anything that isn't ready to show to users is behind a flag in the code.
  • In particular, I don't have a local master branch. Since I always develop on a named branch (which is required when working in webapp and recommended in other repositories), I don't find it useful to distinguish between a local state of master and the state of master in the GitHub repo. I've also dealt with and seen a lot of confusion from master and origin/master getting out of sync.

Have any questions or comments? See anything wrong? Let me know in the discussion area below!
