@tgirke
Last active June 11, 2023 23:09
New Bioc-GitHub Source Control

Bioconductor-GitHub Package Sync

This page outlines how to maintain and sync R packages to both GitHub and Bioconductor's new git source control system.

Note: the master branch has been renamed to devel, following the instructions here (Mar-2023).


Table of Contents

  1. Commit and push your work
  2. Clone new repos instance to local
  3. Steps after new Bioc release
  4. First time sync

1. Commit and push your work

The following instructions are from here. If this is the first sync of a GitHub/Bioc repository, follow the instructions in the First time sync section first.

1.1. Do some work in devel branch

git checkout devel # Switch to devel (prev master) branch 
git pull upstream devel # Sync with Bioc
git pull origin devel # Sync with GitHub
## -> do coding work here. If more complex, create a temp branch first
git commit -am "some meaningful message"
git push origin devel # push to github
git push upstream devel # push to Bioc

1.2. If necessary make changes to release branch

Example for RELEASE_3_17:

git checkout RELEASE_3_17 # Switch to RELEASE_3_17 branch 
git pull upstream RELEASE_3_17 # sync with Bioc
git pull origin RELEASE_3_17 # Sync with GitHub
## -> do coding work here. If more complex, create a temp branch first
git commit -am "some meaningful message"
git push origin RELEASE_3_17 # push to github
git push upstream RELEASE_3_17 # push to Bioc
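
The pull and push sequences in 1.1 and 1.2 differ only in the branch name, so they can be wrapped in two small shell functions. This is a minimal sketch, not part of git or Bioconductor tooling: the names bioc_pull and bioc_push are hypothetical, and the remotes are assumed to be called upstream (Bioc) and origin (GitHub) as elsewhere on this page.

```shell
# Hypothetical helpers; assume remotes named "upstream" (Bioc) and "origin" (GitHub).

# Switch to a branch and sync it with Bioc first, then GitHub.
bioc_pull() {
    pull_branch="${1:-devel}"
    git checkout "$pull_branch" &&
    git pull upstream "$pull_branch" &&
    git pull origin "$pull_branch"
}

# Push a branch to GitHub first, then Bioc.
bioc_push() {
    push_branch="${1:-devel}"
    git push origin "$push_branch" &&
    git push upstream "$push_branch"
}
```

For example, run bioc_pull RELEASE_3_17 before editing and bioc_push RELEASE_3_17 after committing. Each function stops at the first failing git command, so a rejected push to GitHub is not followed by a push to Bioc.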

1.3. Useful utilities

Create a copy of a local branch for major changes and then merge back to source branch

git checkout old_branch
git branch new_branch
# -> do some work (edit/add/commit)
git checkout old_branch
git merge new_branch

Clone package from Bioconductor (e.g. to check updates)

git clone git@git.bioconductor.org:packages/<MyPackage>

Check differences between two directories with the basic Linux diff command

diff -r --exclude=".git" dir1/ dir2/

2. Clone new repos instance to local

Instructions adapted from here

Clone repos from GitHub

git clone git@github.com:<developer>/<MyPackage>.git

Add Bioc/upstream remote

git remote add upstream git@git.bioconductor.org:packages/<MyPackage>.git

Fetch content from Bioc/upstream

git fetch upstream

Merge devel branch from Bioc/upstream with GitHub/origin

git merge upstream/devel

If necessary add new branch (e.g. from Bioc) to GitHub

git checkout -b RELEASE_3_17 upstream/RELEASE_3_17 # Checkout new branch with name and tracking from Bioc (here RELEASE_3_17)
git push -u origin RELEASE_3_17 # Push new branch to GitHub
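
To find out which branches still need this treatment, the remote-tracking refs of the two remotes can be compared. The following is a sketch under the assumption that both remotes have already been fetched (for example with git fetch --all); the function name missing_on_origin is hypothetical.

```shell
# Hypothetical helper: list branches that exist on upstream (Bioc)
# but have no counterpart on origin (GitHub).
# Assumes both remotes were fetched beforehand, e.g. with: git fetch --all
missing_on_origin() {
    git for-each-ref --format='%(refname)' refs/remotes/upstream |
    while read -r ref; do
        name="${ref#refs/remotes/upstream/}"
        # Skip the symbolic HEAD entry that some clones record for a remote
        [ "$name" = "HEAD" ] && continue
        # Print the branch name if origin has no remote-tracking ref for it
        git show-ref --verify --quiet "refs/remotes/origin/$name" || echo "$name"
    done
}
```

Each name it prints can then be handled with the checkout -b / push -u pair shown above.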

Now do some work as described in Section 1.

3. Steps after new Bioc release

The following first creates a frozen backup copy of the previous devel branch, then fetches the new release branch created by Bioc, and finally syncs any changes in upstream/Bioc and origin/GitHub with the local devel instance. For additional details, see the corresponding section in the BiocDevel Book here.

git checkout devel
git checkout -b devel_3_16 # Creates copy of devel branch where extension in name is number of corresponding Bioc release
git fetch upstream # Fetch content from Bioc/upstream
git checkout -b RELEASE_3_17 upstream/RELEASE_3_17 # Checkout new branch with name and tracking from Bioc (here RELEASE_3_17)
git push -u origin RELEASE_3_17 # Push new branch to GitHub; then repeat for other missing branches if any
git checkout devel
git pull upstream devel # Sync with new devel on Bioc
git pull origin devel # Sync with GitHub
git push origin devel # push to github

After this, continue with the commit/push steps in Section 1.

4. First time sync

Follow these instructions when syncing a GitHub repo with a Bioc repo for the first time.

Instructions from here

4.1. Clone from GitHub

git clone https://github.com/MyUsername/MyPackage.git
cd MyPackage

4.2. Configure the “remotes” of the GitHub clone

git remote add upstream git@git.bioconductor.org:packages/<MyPackage>.git

4.3. Fetch updates from all (Bioconductor and GitHub) remotes

git fetch --all

4.4. Merge updates from the GitHub (origin) remote

git checkout devel # Make sure you are on devel branch
git merge origin/devel 

4.5. Merge updates from the Bioconductor (upstream) remote

git merge upstream/devel # resolve potential conflicts, e.g. in DESCRIPTION
git commit -am "resolved conflicts"
git merge upstream/devel 
# git merge --allow-unrelated-histories upstream/devel # with git version >= 2.9, use this line instead of the previous one if git refuses to merge unrelated histories
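
Whether the --allow-unrelated-histories variant is needed depends on whether the two branches share any commit. One way to check this before merging is git merge-base, which exits with a non-zero status when no common ancestor exists. The sketch below wraps that check; the function name have_common_history is hypothetical.

```shell
# Hypothetical helper: succeed if two refs share at least one common ancestor.
# Note: it also fails if either ref name does not exist.
have_common_history() {
    git merge-base "$1" "$2" >/dev/null 2>&1
}

# Possible use after 'git fetch --all' (shown as comments, not run here):
#   if have_common_history devel upstream/devel; then
#       git merge upstream/devel
#   else
#       git merge --allow-unrelated-histories upstream/devel
#   fi
```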

4.6. Check for duplicated commits

The following command prints a one-line summary of each commit, newest first.

git log --oneline

Shell command to identify duplicate commits based on patch-id. Commits with identical patch ids are very likely to have identical content (derived from last comment here).

git rev-list devel | xargs -r -L1 git diff-tree -m -p | git patch-id | sort | uniq -w40 -D | cut -c42-80  | xargs -r git log --no-walk --pretty=format:"%h %ad %an (%cn) %s" --date-order --date=iso
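
The one-liner above uses uniq -w and -D, which are GNU extensions and may be unavailable on other systems. A more portable sketch of the same idea groups commits by patch-id with awk instead. The function name dup_patch_ids is hypothetical, and only commits that produce a non-empty diff are considered.

```shell
# Hypothetical helper: print "<patch-id> <commit-id>" for every commit on a
# branch whose patch-id is shared with at least one other commit.
dup_patch_ids() {
    dup_branch="${1:-devel}"
    git rev-list "$dup_branch" |
    while read -r commit; do
        # diff-tree prints the commit id followed by its patch;
        # patch-id turns that into one "<patch-id> <commit-id>" line
        git diff-tree -m -p "$commit" | git patch-id
    done |
    sort |
    awk '{ count[$1]++; line[NR] = $0; key[NR] = $1 }
         END { for (i = 1; i <= NR; i++) if (count[key[i]] > 1) print line[i] }'
}
```

Because the output is sorted by patch-id, commits that belong to the same duplicate group appear on adjacent lines.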

After duplicates have been identified, one can write each commit (log message and diff) to a file and check with diff or vimdiff whether the contents are identical. Note that the --no-pager argument turns paging off, which allows the output to be redirected to a file.

git --no-pager show <commit_id1> > zzz1
git --no-pager show <commit_id2> > zzz2
vimdiff zzz1 zzz2

If there are duplicated commits, then a branch swap, as recommended here, may be the easiest solution, meaning the current branch will be replaced with the one from Bioconductor. If removing the duplicated commits is preferred (e.g. because the duplicates have already been committed to Bioconductor, which should not be possible in the future), one can eliminate them via git merge --squash as follows.

git checkout devel
git pull upstream devel # just in case
git reset --hard <commit_id> # Reset the current branch to the commit right before dups started
git merge --squash HEAD@{1} # Squashes duplicated commits from chosen <commit_id> to HEAD@{1} (state right before previous reset step)
git commit -am "fixed github/bioc sync problem" # Commit squashed changes
git push upstream devel # Push to bioc

After the removal of duplicated commits continue with the branch swap:

git checkout -b devel_backup upstream/devel # Checkout new branch with temp name and tracking from Bioc devel
git branch -m devel devel_deprecated # Assign to local branch some archive name
git branch -m devel_backup devel # Rename branch with temp name to the one you wish to use (here devel)
git push -f origin devel # Force push to github

4.7. Push to both Bioconductor and GitHub repositories

git push upstream devel # Pushes to Bioconductor
git push origin devel # Pushes to GitHub

4.8. Repeat the merge and push steps (4.4, 4.5 and 4.7) for the current release branch

git checkout RELEASE_3_17
git merge upstream/RELEASE_3_17 # -> Already up-to-date
git merge origin/RELEASE_3_17 # -> Nothing to merge since RELEASE_3_17 didn't exist on GitHub yet
git push upstream RELEASE_3_17
git push origin RELEASE_3_17

4.9. Check whether remotes are set up correctly.

If so, the output of git remote -v should look like this:

origin  https://github.com/MyUsername/MyPackage.git (fetch)
origin  https://github.com/MyUsername/MyPackage.git (push)
upstream        git@git.bioconductor.org:packages/MyPackage.git (fetch)
upstream        git@git.bioconductor.org:packages/MyPackage.git (push)
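
The same check can be scripted instead of read by eye. The following is a minimal sketch, assuming a git version that provides git remote get-url; the function name check_remotes and its messages are hypothetical.

```shell
# Hypothetical helper: verify the remote layout described in this section.
# Succeeds only if both remotes exist and upstream uses the Bioconductor SSH URL form.
check_remotes() {
    up_url=$(git remote get-url upstream 2>/dev/null) ||
        { echo "missing remote: upstream"; return 1; }
    git remote get-url origin >/dev/null 2>&1 ||
        { echo "missing remote: origin"; return 1; }
    case "$up_url" in
        git@git.bioconductor.org:packages/*)
            echo "remotes OK" ;;
        *)
            echo "upstream is not a Bioconductor URL: $up_url"
            return 1 ;;
    esac
}
```

Run it from inside the package clone; a non-zero exit status means one of the remotes needs to be added or corrected as in step 4.2.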