Brad Chapman chapmanb

@chapmanb
chapmanb / CCDS_IDTxGEN_SeqCap-hg19.bed
Last active June 6, 2018 10:15
Combined exome regions files, prepared with hg38 and lifted over to hg19
chr1 14395 14616 .
chr1 14672 14802 .
chr1 14832 14992 .
chr1 15012 15148 .
chr1 15611 16024 .
chr1 16541 16877 .
chr1 16893 17032 .
chr1 17251 17374 .
chr1 17403 17519 .
chr1 17546 17716 .
chapmanb / svCandidateGenerationStats.tsv
Created April 27, 2018 13:57
huD57BBF manta runtime stats
EdgeStatsReport
SVGenTotalHours 59.1786h wall, 0.5012h user + 0.0591h system = 0.5603h CPU (0.95%)
NonEdgeHours 0.0197h wall, 0.0115h user + 0.0038h system = 0.0152h CPU (77.53%)
[AllEdges]
InputEdgeCount 101415
InputEdgeCandidatesPerEdge:
0 52679
1 43118
2 5233
chapmanb / cromwell-bcbio-nodocker.log
Created April 20, 2018 19:26
Cromwell bcbio no docker debugging
Script started on Fri 20 Apr 2018 03:17:41 PM EDT
$ bash run_cromwell.sh
Running: export PATH=$PATH:/usr/local/share/bcbio-vm/bin && cromwell run --type CWL --inputs /home/chapmanb/drive/work/cwl/test_bcbio_cwl/somatic/cromwell_work/somatic-workflow-nodocker/main-somatic-samples.json /home/chapmanb/drive/work/cwl/test_bcbio_cwl/somatic/cromwell_work/somatic-workflow-nodocker/main-somatic.cwl 2>&1 | tee -a /home/chapmanb/drive/work/cwl/test_bcbio_cwl/somatic/cromwell_work/somatic-cromwell.log
[2018-04-20 15:17:47,33] [info] Running with database db.url = jdbc:hsqldb:mem:fea1cf43-9434-4085-b283-d40102cc76c8;shutdown=false;hsqldb.tx=mvcc
[2018-04-20 15:17:52,71] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-04-20 15:17:52,72] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-04-20 15:17:52,83] [info] Running with database db.url = jdbc:hsqldb:mem:7405ca2a-863a-4e13-afad-b8e006f849e1;shutdown=false;hsqldb.tx=mvcc
[2018-04-20 15:1
chapmanb / untrimmed_fastp.txt
Last active March 13, 2018 09:58
fastp and atropos 3' polyX trimming
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCACCCCCCCCCCCCC
TGTCTCAAGAAATAAATAATTAATTAATTAATAATGTGATTTCCC TTAAAGATTTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGTTTTTTGTTTTTTTTTTGTTTTGTTT
AAAAAAAAAAAAAAAAAAAAAAAAAGAAAGAAAAAAAAAAAAAAA GAAAAAAAATAACCTAAAGAAAAAAAAAACTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGAAAAAAAAAAAAAAGA
GGGCCCTGCCCCACAGGCCCCGCCCCCAGAGGCAGCTCCCCCCCCCCCCGC CGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGCCCACCCCCCCCCCCCCCCCCCCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGCCACCCC
TGATCCACCCACTTTGGCCTCTGCGCCCG GCCTATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGTTT
CATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT TTTTTTTGTTTTTGTGTGGTTGGTTTTTTTTTTTTTTTTTTTTTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCTTTTTTTTTT
AGCTAGAAAAATATTATAATCCAGTCATTTTTAACTTCATAAATATGGACT TATCATACTTTTCATGCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
chapmanb / Exome-AZ_V2-withalts.bed
Last active February 26, 2018 19:42
AZ exome BED file annotation with hg38 alts
chr1 13352 13689 DDX11L1 .
chr1 14312 15178 WASH7P .
chr1 15581 16054 WASH7P .
chr1 16511 17105 WASH7P .
chr1 17182 18492 WASH7P .
chr1 19000 19292 WASH7P .
chr1 20472 20758 WASH7P .
chr1 24390 24995 WASH7P .
chr1 29127 29475 WASH7P .
chr1 30198 30593 MIR1302-2 .
chapmanb / workflow_experiences.md
Created February 6, 2018 16:58
Workflow experiences for GA4GH Workflow execution challenge writeup -- bcbio

Workflow Experiences

Achieving cross-platform portability requires a combination of standards, platform support, and workflow conformance. One goal of the challenge is to assess and develop each of these areas: improve standards to avoid interoperability problems, improve platforms to better handle cases where workflow runs differ, and produce more portable workflows. This section describes where workflow authors ran into interoperability challenges and how they overcame them.

chapmanb / explore_batching_vcs.py
Created December 13, 2017 03:15
CWL intermediate staging with same filenames
import json
import subprocess

# Describe the DNAnexus job and parse its JSON metadata
job = json.loads(subprocess.check_output(["dx", "describe", "job-F8fgxBQ0XFJF2G9k5FqYg08Z", "--json"]))
# File IDs of the validation inputs passed to the job
input_files = [x["primaryFile"]["$dnanexus_link"] for x in job["input"]["config__algorithm__validate"]]
# File IDs of the validation files in the job's batch_rec output
output_files = [x[0]["config__algorithm__validate"]["primaryFile"]["$dnanexus_link"] for x in job["output"]["batch_rec"]]
# Describe each input file in turn
for f in input_files:
    f_d = json.loads(subprocess.check_output(["dx", "describe", f, "--json"]))
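The preview above stops after describing each input file. Since the gist is about staging files that share a filename, here is a minimal sketch of one way such collisions could be detected once the descriptions are collected. It is not the author's code: the helper `find_duplicate_names` and the example records are hypothetical, and it assumes each parsed `dx describe --json` result carries `id` and `name` keys.

```python
from collections import defaultdict


def find_duplicate_names(descriptions):
    """Group file IDs by file name and return only the names shared by several files.

    `descriptions` is a list of dicts shaped like parsed `dx describe --json`
    output; only the "id" and "name" keys are read (assumed to be present).
    """
    ids_by_name = defaultdict(list)
    for desc in descriptions:
        ids_by_name[desc["name"]].append(desc["id"])
    # Keep only the names that map to more than one file ID
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical records standing in for real `dx describe` output
example = [
    {"id": "file-AAAA", "name": "sample1-sort.bam"},
    {"id": "file-BBBB", "name": "sample1-sort.bam"},
    {"id": "file-CCCC", "name": "sample2-sort.bam"},
]
duplicates = find_duplicate_names(example)
print(duplicates)
```

In the script above, the `f_d` dictionaries gathered inside the loop could be appended to a list and passed to this helper; any name it returns would identify files that collide if staged into the same directory.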
chapmanb / big2017_bcbiocwl_abstract.md
Last active November 16, 2017 19:47
BIG meeting at MIT: bcbio and common workflow language
Topic: MIT BIG meeting: bcbio and CWL (http://openwetware.org/wiki/BioMicroCenter:BIG_meeting)
Time: Nov 2, 2017 11:00 AM (GMT-4:00) Eastern Time (US and Canada)
Location: MIT Koch Biology Building 68-156 (http://whereis.mit.edu/?go=68) or online
Online: https://zoom.us/j/223311944
Telephone: US: +1 646 876 9923 or +1 669 900 6833 or +1 408 638 0968 or international https://zoom.us/zoomconference?m=qoCDInqg9wlWdANSN9knRp5E41dRbRhy
Meeting ID: 223 311 944
Recording: https://youtu.be/nJEDS9Qol8M
Slides: https://github.com/chapmanb/bcbb/blob/master/talks/big2017_bcbio_cwl/big2017_bcbio_cwl.pdf
{
"process_alignment_rec" : {
"files" : [
{
"size" : 1363775,
"dirname" : "/home/chapmanb/drive/work/cwl/test_bcbio_cwl/gvcf_joint/bunny_work/main-gvcf-joint-2017-10-15-122622.38/root/alignment/1/prep_align_inputs/align_prep",
"checksum" : "sha1$9378d84885d46a29c2346a58379ca08b99ce3c79",
"class" : "File",
"format" : null,
"secondaryFiles" : [
ResourceNotFound: The specified folder could not be found in project-F6vf5fj0BV9650154B13Vk9j, code 404. Request Time=1507085723.48, Request ID=1507085723834-786503
The destination folder does not exist
Compiling tools/workflows for each step in the workflow
alignment_to_rec
alignment
ResourceNotFound: The specified folder could not be found in project-F6vf5fj0BV9650154B13Vk9j, code 404. Request Time=1507085741.25, Request ID=1507085741568-372221
The destination folder does not exist
Compiling tools/workflows for each step in the workflow