Daniel Himmelstein dhimmel

dhimmel / openalex-metagraph.drawio
dhimmel / weighted-r2.R
Last active Sep 5, 2022
Computing the R-squared of a linear regression model with weighted observations in R
# Compare four methods for computing the R-squared (R2, coefficient of determination)
# with weighted observations for a linear regression model in R.
# This work was written by Daniel Himmelstein (@dhimmel) with guidance
# from Alex Pankov (@a-pankov). It is released as CC0 (public domain).
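get_r2_cor <- function(y, y_pred, w) {
  # Calculate R2 using the correlation coefficient method:
  # the squared weighted correlation between observed and predicted values.
  # Requires the boot package to be installed.
  xy <- cbind(y, y_pred)
  return(boot::corr(d = xy, w = w) ^ 2)
}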
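As a cross-check on the correlation method above, here is a minimal sketch of the same quantity in plain Python, assuming the weights are non-negative and are normalized by their sum. The function name `weighted_r2_cor` is hypothetical and is not part of the gist; the gist's own implementation is the R code above.

```python
def weighted_r2_cor(y, y_pred, w):
    """Squared weighted Pearson correlation between y and y_pred.

    Hypothetical helper for illustration; weights are normalized by their sum.
    """
    total = sum(w)
    # Weighted means of the observed and predicted values
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    mean_p = sum(wi * pi for wi, pi in zip(w, y_pred)) / total
    # Weighted covariance and variances (the common 1/total factor cancels)
    cov = sum(wi * (yi - mean_y) * (pi - mean_p)
              for wi, yi, pi in zip(w, y, y_pred))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    var_p = sum(wi * (pi - mean_p) ** 2 for wi, pi in zip(w, y_pred))
    return cov ** 2 / (var_y * var_p)


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    y_pred = [3.0, 5.0, 7.0, 9.0]  # an exact linear function of y
    print(weighted_r2_cor(y, y_pred, w=[1, 2, 3, 4]))
```

Because predictions that are an exact linear function of the observations are perfectly correlated with them, this method returns 1 in that case regardless of the weights, which is one reason to compare it against other definitions of R2.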
dhimmel / benevolent-ai-metagraph-data-sources.drawio
Last active May 9, 2022
Metagraph of the Benevolent AI knowledge graph from Paliwal et al. 2020
dhimmel /
Created May 4, 2022
Diagnosing the Ubuntu top left quarter display issue
dhimmel /
Last active Mar 2, 2022
DISEASES v2 Review by Daniel Himmelstein

DISEASES 2.0 Review


DISEASES 2.0: a weekly updated database of disease–gene associations from text mining and data integration
Dhouha Grissa, Alexander Junge, Tudor I Oprea, Lars Juhl Jensen
bioRxiv (2021-12-09)
DOI: 10.1101/2021.12.07.471296

The webapp and latest downloads are available at . Version 1 is described in the 2015 publication.

dhimmel /
Created Dec 1, 2020
Review of TIGA (Target illumination GWAS analytics) preprint v1
dhimmel /
Last active May 4, 2020
Review of the "Rigor and Transparency Index" manuscript (

Daniel Himmelstein's review of preprint v2

Review of version 2 of the following preprint:

Rigor and Transparency Index, a new metric of quality for assessing biological and medical science methods
Joe Menke, Martijn Roelandse, Burak Ozyurt, Maryann Martone, Anita Bandrowski
bioRxiv (2020-01-18)
DOI: 10.1101/2020.01.15.908111

The study introduces an automated method called SciScore that detects which of 15 categories, such as a consent statement or an organism, are mentioned in an article's methods section. These metrics are combined into a single score for each article called the "Rigor and Transparency Index". The authors applied the method to the PubMed Central Open Access subset, which contains over 1 million articles, to identify trends in the level of detail provided in methods sections.

dhimmel / cypher-edge-swap.adoc
Last active Dec 28, 2018
Randomized edge swaps in cypher

Degree-Preserving Edge-Swap


We designed a hetnet for drug repurposing that contains 50 thousand nodes (of 10 labels) and 3 million relationships (of 27 types). We chose Neo4j for network storage and interaction.
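To illustrate what a degree-preserving edge swap does, here is a minimal in-memory sketch in Python. It is not the Cypher implementation this gist develops: the function names `edge_swap` and `degrees`, the retry limit, and the representation of edges as `(source, target)` tuples of a single relationship type are all assumptions made for illustration.

```python
import random
from collections import Counter


def degrees(edges):
    """Return (source degree counts, target degree counts) for an edge list."""
    return Counter(s for s, _ in edges), Counter(t for _, t in edges)


def edge_swap(edges, n_swaps, seed=0, max_tries_per_swap=100):
    """Attempt n_swaps degree-preserving swaps on (source, target) edges.

    Two edges (a, b) and (c, d) are replaced by (a, d) and (c, b) unless that
    would create a duplicate edge or a self-loop. Hypothetical sketch.
    """
    rng = random.Random(seed)
    edges = list(edges)
    edge_set = set(edges)
    done, tries = 0, 0
    while done < n_swaps and tries < n_swaps * max_tries_per_swap:
        tries += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        if a == d or c == b:
            continue  # would create a self-loop
        if new_1 in edge_set or new_2 in edge_set:
            continue  # would duplicate an existing edge (covers shared endpoints)
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
        done += 1
    return edges


if __name__ == "__main__":
    original = [(s, t) for s in range(6) for t in range(6, 12) if (s + t) % 3]
    permuted = edge_swap(original, n_swaps=20, seed=1)
    print(degrees(permuted) == degrees(original))
```

Each accepted swap removes one outgoing edge from `a` and `c` and adds one back, and likewise for the incoming edges of `b` and `d`, so every node keeps its degree while the pairing of sources and targets is randomized.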

dhimmel / hetmech-query-node-pair.ipynb
Created Oct 23, 2018
Query relationship between the FTO gene and obesity using hetmech
dhimmel / hetionet-v1.0-metaedge-xswap-stats.tsv
Last active Oct 2, 2018
Analytical derivation of the prior XSwap probability of a hetnet edge
metaedge	abbreviation	n_edges	n_connected_source_nodes	n_connected_target_nodes	n_source_wedges	n_target_wedges	n_wedges	n_valid_xswaps
Anatomy–downregulates–Gene	AdG	102240	36	15097	173440264	493897	173934161	5052523519
Anatomy–expresses–Gene	AeG	526407	241	18094	2290279787	10749138	2301028925	136250872696
Anatomy–upregulates–Gene	AuG	97848	36	15929	149352969	359661	149712630	4637353998
Compound–binds–Gene	CbG	11571	1389	1689	104024	476540	580564	66357671
Compound–causes–Side Effect	CcSE	138944	1071	5701	16998055	16764774	33762829	9618885267
Compound–downregulates–Gene	CdG	21102	734	2880	1683615	291789	1975404	220661247
Compound–palliates–Disease	CpD	390	221	50	326	2857	3183	72672
Compound–resembles–Compound	CrC	12972	1281	1281	120047	120047	240094	83889812
Compound–treats–Disease	CtD	755	387	77	1420	8070	9490	275145
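The columns in this preview appear to be related by simple counting: `n_wedges` equals `n_source_wedges + n_target_wedges`, and `n_valid_xswaps` equals the number of unordered edge pairs minus `n_wedges`. This reading is inferred from the numbers in the rows above, not stated in the preview, and the function name `valid_xswaps` is hypothetical. A short Python check on three of the rows:

```python
from math import comb

# Values copied from the CbG, CpD, and CtD rows of the table above
rows = {
    "CbG": {"n_edges": 11571, "n_source_wedges": 104024,
            "n_target_wedges": 476540, "n_wedges": 580564,
            "n_valid_xswaps": 66357671},
    "CpD": {"n_edges": 390, "n_source_wedges": 326,
            "n_target_wedges": 2857, "n_wedges": 3183,
            "n_valid_xswaps": 72672},
    "CtD": {"n_edges": 755, "n_source_wedges": 1420,
            "n_target_wedges": 8070, "n_wedges": 9490,
            "n_valid_xswaps": 275145},
}


def valid_xswaps(n_edges, n_wedges):
    """Edge pairs that share no endpoint: all pairs minus wedges (assumed reading)."""
    return comb(n_edges, 2) - n_wedges


for abbrev, row in rows.items():
    wedges = row["n_source_wedges"] + row["n_target_wedges"]
    consistent = (wedges == row["n_wedges"]
                  and valid_xswaps(row["n_edges"], wedges) == row["n_valid_xswaps"])
    print(abbrev, consistent)
```

Under this reading, a wedge is a pair of edges sharing a source or a target node, and swapping the endpoints of such a pair cannot produce new edges, so those pairs are excluded from the count of valid swaps.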