Chris Miller chrisamiller

## bootcamp_docker.md

      
Last active May 28, 2024 16:11

Using Docker

On your laptop:
docker pull ubuntu

What is that doing? It goes to https://hub.docker.com/_/ubuntu and pulls down the image with the "latest" tag. Once the image is local, you can start a container from it:
docker run ubuntu
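
To make the "latest" tag behavior concrete, here is a minimal Python sketch of how a short image name such as `ubuntu` is expanded into a full reference. The function name `normalize_image_ref` is hypothetical and this is not Docker's own code; it only mirrors the documented defaults (Docker Hub as the registry, the `library` namespace for official images, and the `latest` tag), and it ignores digests and other edge cases.

```python
def normalize_image_ref(ref: str) -> str:
    """Expand a short image name roughly the way `docker pull` reads it.

    Illustrative sketch only: digests (name@sha256:...) are not handled.
    """
    # A colon marks a tag only when it comes after the last slash;
    # otherwise it belongs to a registry port such as localhost:5000.
    name, tag = ref, "latest"
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:
        name, tag = ref[:last_colon], ref[last_colon + 1:]

    # The first path component is treated as a registry host only if it
    # looks like one (contains a dot or a port, or is "localhost").
    parts = name.split("/", 1)
    if len(parts) == 2 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, path = parts
    else:
        registry, path = "docker.io", name

    # Official images on Docker Hub live in the "library" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path

    return f"{registry}/{path}:{tag}"


if __name__ == "__main__":
    for short_name in ["ubuntu", "ubuntu:22.04"]:
        print(short_name, "->", normalize_image_ref(short_name))
```

So `docker pull ubuntu` and `docker pull ubuntu:latest` refer to the same image, and asking for a specific tag such as `ubuntu:22.04` is the way to pin a version.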


## germline_calling.md

      
Last active November 11, 2023 16:47

Germline Variant Calling and Filtering

Module objectives

- Perform single-sample germline variant calling with GATK HaplotypeCaller and the GATK GVCF workflow on exome data
- Perform joint genotype calling on exome data, including additional exomes from the 1000 Genomes Project
- Manipulate and filter VCFs to remove artifacts and identify variants of interest

In this module we will use the GATK HaplotypeCaller to call germline variants from "normal" BAMs. For a more in-depth look, see this excellent GATK tutorial, provided by the Broad Institute.

  
## long_read_alignment.md

      
Last active November 10, 2023 23:31

Long Read Alignment

Let's start by looking at some outputs of long read sequencing from the Oxford Nanopore (ONT) platform. These are sequences from the K562 cell line, prepared with the ONT cDNA sequencing kit (poly-A selected). Off the machine, the data consists of FAST5 or POD5 files, which are a compressed representation of the raw signal. These are subsequently run through a basecalling algorithm (such as Dorado) to generate FASTQ files.
The choice of basecalling algorithm and parameters goes pretty deep, so we'll assume that reasonable choices have been made. For simplicity, we've also subset the data to include just small portions of the genome, including a few genes of interest.
Go ahead and pull down this fastq file:
wget https://storage.googleapis.com/bfx_workshop_tmp/k562_ont_raw.fastq.gz
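
Before aligning, it can help to get a quick feel for the read lengths in the file. The following is a minimal Python sketch, not part of the original exercise: the function names are hypothetical, and it assumes the usual four-line FASTQ layout (header, sequence, `+`, qualities) with no wrapped sequence lines.

```python
import gzip
import os
from statistics import median


def fastq_read_lengths(path):
    """Yield the length of each read in a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Assumes 4 lines per record; the sequence is the second line.
            if line_number % 4 == 1:
                yield len(line.rstrip("\n"))


def summarize(path):
    """Return the read count and basic length statistics for a FASTQ file."""
    lengths = list(fastq_read_lengths(path))
    if not lengths:
        return {"reads": 0}
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    # File name taken from the wget command above.
    fastq_path = "k562_ont_raw.fastq.gz"
    if os.path.exists(fastq_path):
        print(summarize(fastq_path))
```

Long-read data typically shows a much wider spread of read lengths than short-read data, so the minimum, median, and maximum are more informative here than a single mean.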


## somatic_calling.md

      
Last active November 11, 2023 13:20

Description: Somatic Variant Calling exercise

Somatic Variant Calling

Gather your inputs.

Start by gathering some data. Navigate to a somatic folder and pull down a set of input data (human build 38) from this location:
wget https://storage.googleapis.com/bfx_workshop_tmp/inputs.tar.gz


## cromwell_workflows.md

      
Created February 4, 2021 21:21

There are three inputs you need to run a workflow:

1. A .cwl file that contains the steps to be run
2. A .yaml file that gives the inputs to that CWL
3. A config file that tells Cromwell about its environment, how to submit jobs to the cluster, and where to stick the results

Let's start with #3, the config file. I've made this easy for you. Create a directory where you want to run things, then inside of it, run the following command:
/storage1/fs1/timley/Active/aml_ppg/src/utilities/create_cromwell_config -o cromwell.config -l logs -d output -q timley -G compute-timley


## new_employee_info.md

      
Last active June 28, 2022 18:55

## New employee setup
### Tasks

- **Join the mgibio Slack**. Ask any user to invite you using your wustl address, or email [c.a.miller@wustl.edu](mailto:c.a.miller@wustl.edu). It's an excellent place to post questions about anything and get answers. Useful channels include #bfx-workshop, #analysis-workflows, #cancergenomics, and #docker.
- **Get compute1 access set up**. This requires a ticket to the [RIS Servicedesk](https://jira.ris.wustl.edu/servicedesk/customer/portal/1) requesting to be added to the appropriate compute and storage groups.
- **VPN access**. Connect to msvpn.wusm.wustl.edu through Cisco AnyConnect. Use your WUSTL key to log in, and submit a request at [https://it.wustl.edu/items/connect/](https://it.wustl.edu/items/connect/).
- **Set up compute1 config files** (_Need a link for this - env variables, etc_)
- **Sign up for the bfx_workshop**. Get on the [email list](https://outlook.office365.com/owa/bioinformatics@gowustl.onmicrosoft.com/groupsubscription.ashx?action=join&source=MSExchange/LokiServer&guid=

## oct_2020.md

      
Last active October 6, 2020 14:47

Description: Oct 2020 Hackathon

RNAseq team (Sid)


Add RNAseq sanity/QC checks
https://github.com/genome/analysis-workflows/issues/904

Add input type fastqs to RNA seq
https://github.com/genome/analysis-workflows/issues/932


## lsf_and_docker_tutorial.md

      
Last active May 31, 2023 02:45

Description: LSF and Docker basics - bfx_workshop

Compute cluster basics

Logging in

Open up a terminal (Terminal/iTerm on Mac, PuTTY or WSL on Windows) and SSH into the cluster, replacing USERNAME with your WUSTL key.
ssh USERNAME@compute1-client-3.ris.wustl.edu


## kallisto_to_degs.R
library(edgeR);
library(gplots);
library(RColorBrewer);
library(tximport);

# takes three arguments - config file, transcript to gene table, and output directory

# config file specifies the samples to import, groupings, and paths to abundance.tsv files from kallisto
# groups should be either 0 or 1
# header: sample \t group \t /path/to/abundance.tsv