Junjun Zhang junjun-zhang

  • Adela
  • Toronto, Canada
@junjun-zhang
junjun-zhang / 1_create_nodes_for_dag_1.sh
Created September 26, 2017 18:56
POC: Implement Directed Acyclic Graph DB with Version Control using Elasticsearch
# 1.create_nodes_for_dag.1.sh
curl -XPUT 'localhost:9200/.d.dag.data.a/A/a1.1' -H 'Content-Type: application/json' -d'
{
  "type": "A",
  "id": "a1",
  "p_id": [ ],
  "a_id": [ ],
  "data": { }
}
'
Demo use cases
Use case 1
Find genes that carry high-impact mutations in both ovarian cancer and prostate cancer.
How many of these shared genes are in the Cancer Gene Census? Save this gene set and share it with
a colleague.
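At its core, Use case 1 is two set intersections: genes seen in both cancers, then that result against the Cancer Gene Census. A minimal sketch, assuming each gene list has already been exported as newline-separated symbols. The `intersect` helper and all three lists are hypothetical placeholders for illustration, not real query results:

```shell
#!/bin/sh
# Sketch only: these lists are made-up placeholders, NOT real query results.
ovary_genes="TP53
BRCA1
KRAS"
prostate_genes="TP53
KRAS
AR"
census_genes="TP53
BRCA1
BRCA2"

# Print the lines of $1 that also appear as whole lines in $2, sorted and de-duplicated.
# (hypothetical helper; assumes both lists are non-empty)
intersect() {
  printf '%s\n' "$1" | grep -Fx -e "$2" | sort -u
}

# Step 1: genes present in both cancer lists.
common=$(intersect "$ovary_genes" "$prostate_genes")
# Step 2: of those, the genes that are also in the census list.
in_census=$(intersect "$common" "$census_genes")

echo "Common genes:"
echo "$common"
echo "Of which in census: $(printf '%s\n' "$in_census" | grep -c .)"
```

Writing `$common` to a file would give a gene set that can be handed to a colleague.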
Use case 2
A colleague identified 458 ‘interesting’ mutations and shared the list with you: https://icgc.org/Z37.
You may wonder which genes are affected by these mutations and whether these genes are over-represented
TP53, PIK3CA, APC, VHL, KRAS, ARID1A, PBRM1, NAV3, EGFR, NF1, PIK3R1,
CDKN2A, SETD2, ATM, GATA3, RB1, NOTCH1, FBXW7, MTOR, CTNNB1, DNMT3A,
ATRX, LRRK2, MAP3K1, FLT3, MALAT1, BRCA2, TSHZ3, KEAP1, CDH1, ARHGAP35,
EP300, CTCF, POLQ, ATR, NSD1, NFE2L2, TAF1, SETBP1, STAG2, NCOR1, EPHA3,
ERBB4, USP9X, BAP1, KDM6A, KDM5C, TLR4, BRCA1, PDGFRA, NPM1, TSHZ2,
PIK3CG, HGF, RUNX1, ARID5B, TET2, EPHB6, NRAS, IDH1, BRAF, FGFR2,
PPP2R1A, CDK12, SMC1A, MECOM, TBX3, MAP2K4, EPPK1, KIT, SMAD4, ASXL1,
SF3B1, AR, LIFR, SMC3, TGFBR2, SIN3A, RPL22, WT1, FGFR3, PTPN11, ACVR1B,
STK11, AKT1, PRX, CHEK2, SMAD2, AXIN2, RAD21, AJUBA, U2AF1, FOXA1, IDH2,
NFE2L3, EIF4A2, PHF6, TBL1XR1, EZH2, CRIPAK, SOX9, CDKN1B, CBFB, MAPK8IP1,
@junjun-zhang
junjun-zhang / gist:54b8c7365af1a5f86694
Last active August 29, 2015 14:13
Example of the reverse_nested aggregation, a way to count parent docs from within a nested level. This may be a solution to the 'jumpy number' issue in the consequence type and other facets.
curl -XGET "http://192.170.232.56:8200/gdc_v12/participant-centric/_search?pretty=1" -d '
{
"aggs":{
"project":{
"terms":{
"field":"admin.disease_code",
"size":1000
},
"aggs":{
"file_size":{