mkhalfan mohammedkhalfan

  • New York University
  • NYC
@mohammedkhalfan
mohammedkhalfan / launch-nextflow.s
Created April 10, 2024 02:03
SBATCH script for nextflow jobs
#!/bin/sh
#
#SBATCH --verbose
#SBATCH --job-name=rnaseq
#SBATCH --output=rnaseq.o%j
#SBATCH --error=rnaseq.e%j
#SBATCH --time=72:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
mohammedkhalfan / downsample_fastqs_parallel
Created August 17, 2023 02:37
Downsamples every .fastq.gz file in a directory in parallel, keeping the fraction of reads given by downsample_ratio
import os
import random
import gzip
import sys
from multiprocessing import Pool
# Parameters
input_dir = sys.argv[1]
output_dir = os.path.join(input_dir, "small")
downsample_ratio = 0.01 # Replace with the desired downsample ratio
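The preview stops at the parameters, so the sampling logic itself is not shown. What follows is only a rough sketch of how per-file parallel downsampling could look, assuming each read occupies exactly four lines and each read is kept independently with probability `ratio`. The names `downsample_fastq`, `downsample_dir`, `_worker`, `DOWNSAMPLE_RATIO`, and the `seed` argument are hypothetical and not taken from the gist.

```python
import gzip
import os
import random
from multiprocessing import Pool

DOWNSAMPLE_RATIO = 0.01  # hypothetical constant, same role as downsample_ratio


def downsample_fastq(in_path, out_path, ratio, seed=None):
    """Copy each 4-line FASTQ record to out_path with probability `ratio`.

    Returns the number of records written.
    """
    rng = random.Random(seed)
    kept = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:  # readline() returns "" at end of file
                break
            if rng.random() < ratio:
                dst.writelines(record)
                kept += 1
    return kept


def _worker(args):
    # Pool.map passes a single argument, so unpack the job tuple here.
    return downsample_fastq(*args)


def downsample_dir(input_dir, output_dir, ratio=DOWNSAMPLE_RATIO, processes=None):
    """Downsample every .fastq.gz file in input_dir, one file per worker."""
    os.makedirs(output_dir, exist_ok=True)
    jobs = [
        (os.path.join(input_dir, name), os.path.join(output_dir, name), ratio, None)
        for name in sorted(os.listdir(input_dir))
        if name.endswith(".fastq.gz")
    ]
    with Pool(processes) as pool:
        return pool.map(_worker, jobs)
```

On platforms that start workers with `spawn`, `downsample_dir(...)` should be called from under an `if __name__ == "__main__":` guard. For paired-end data, passing the same `seed` for the R1 and R2 files keeps mates in sync in this sketch, because exactly one random draw is made per record and both files hold the same number of records.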
mohammedkhalfan / samplesheet
Last active January 26, 2021 01:46
nf-core RNASEQ samplesheet example
group,replicate,fastq_1,fastq_2,strandedness
control,1,/path/to/S1_L002_R1_001.fastq.gz,/path/to/S1_L002_R2_001.fastq.gz,unstranded
control,2,/path/to/S2_L002_R1_001.fastq.gz,/path/to/S2_L002_R2_001.fastq.gz,unstranded
control,3,/path/to/S3_L002_R1_001.fastq.gz,/path/to/S3_L002_R2_001.fastq.gz,unstranded
treatment,1,/path/to/S4_L003_R1_001.fastq.gz,,unstranded
treatment,2,/path/to/S5_L003_R1_001.fastq.gz,,unstranded
treatment,3,/path/to/S6_L003_R1_001.fastq.gz,,unstranded
treatment,3,/path/to/S6_L004_R1_001.fastq.gz,,unstranded
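Two things in this samplesheet are easy to miss: an empty `fastq_2` field marks a single-end run, and the last two rows share the same group and replicate (`treatment,3`) but point at different lanes, which presumably means they are repeat runs of one sample. A minimal sketch for checking how rows collapse into samples, using only the column names shown above; the function name `group_samplesheet` is hypothetical:

```python
import csv
import io
from collections import defaultdict


def group_samplesheet(handle):
    """Group samplesheet rows by (group, replicate).

    Returns a dict mapping (group, replicate) to a list of
    (fastq_1, fastq_2 or None, strandedness) tuples, one per row.
    """
    runs = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row["group"], int(row["replicate"]))
        fastq_2 = row["fastq_2"] or None  # empty field means single-end
        runs[key].append((row["fastq_1"], fastq_2, row["strandedness"]))
    return dict(runs)
```

Calling it as `group_samplesheet(open("samplesheet.csv", newline=""))` and printing the keys with more than one entry gives a quick list of samples that were sequenced across several lanes.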
mohammedkhalfan / nextflow.config
Last active January 19, 2021 01:51
nf-core RNASEQ config template for CGSB
// Set these five params at a minimum
params.input = 'samplesheet.csv'
params.fasta = '/scratch/work/cgsb/genomes/Public/Vertebrate_mammalian/Homo_sapiens/Ensembl/GRCh38.p10/Homo_sapiens.GRCh38.dna.toplevel.fa'
params.gtf = '/scratch/work/cgsb/genomes/Public/Vertebrate_mammalian/Homo_sapiens/Ensembl/GRCh38.p10/Homo_sapiens.GRCh38.88.gtf'
out_root = '/scratch/netID/rnaseq_project'
params.email = 'netID@nyu.edu'
// Only make changes below if required
params.outdir = out_root + '/results'
workDir = out_root + '/nextflow_work'
mohammedkhalfan / count-barcode-freq.py
Last active January 20, 2022 17:18
Takes a demultiplexed FASTQ file as input and returns a list of the barcodes found, sorted in ascending order of frequency.
## Usage: python3 count-barcode-freq.py <fastq_file.gz>
## Example: python3 count-barcode-freq.py sample.fastq.gz
from operator import itemgetter
import sys, gzip
barcodes = {}
with gzip.open(sys.argv[1]) as fastq:
    for line in fastq:
        if not line.startswith(b'@'): continue
        bc = line.decode("utf-8").split(':')[-1].strip()
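The preview ends before the counting and sorting steps. One caveat with selecting headers by a leading `@` is that a quality line can also begin with `@`, so such a line would be counted as a barcode. The sketch below is an alternative, not the gist's own code: it assumes strict four-line records and takes every fourth line as the header, then reads the barcode as the text after the last colon, as in the lines above. The function name `count_barcodes` is hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice
from operator import itemgetter


def count_barcodes(path):
    """Return (barcode, count) pairs sorted by ascending count."""
    counts = Counter()
    with gzip.open(path, "rt") as fastq:
        # Lines 0, 4, 8, ... are the record headers in a 4-line FASTQ.
        for header in islice(fastq, 0, None, 4):
            barcode = header.rsplit(":", 1)[-1].strip()
            counts[barcode] += 1
    return sorted(counts.items(), key=itemgetter(1))
```

A command-line wrapper would loop over `count_barcodes(sys.argv[1])` and print each barcode with its count, matching the usage line at the top of the script.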
{
  "fasta": [{
      "name": "Bowtie2 Index",
      "extensions": [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.2.bt2", ".rev.1.bt2"],
      "tags": "alignment, tophat, rnaseq",
      "modules": ["bowtie2/intel/2.2.9"],
      "command": "bowtie2-build $IN $IN",
      "mem": 64
    },
    {
{
  "Plant": {
    "Arabidopsis_thaliana": {
      "Ensembl": [{
        "TAIR10": {
          "dna": "ftp://ftp.ensemblgenomes.org/pub/plants/release-34/fasta/arabidopsis_thaliana/dna/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz",
          "gff": "ftp://ftp.ensemblgenomes.org/pub/plants/release-34/gff3/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.34.gff3.gz",
          "gtf": "ftp://ftp.ensemblgenomes.org/pub/plants/release-34/gtf/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.34.gtf.gz"
        },
        "TAIR9-example-other-version--remove-later": {
mohammedkhalfan / jira_api_example.py
Last active March 23, 2017 21:11
JIRA API EXAMPLE
## Load required module
from jira.client import JIRA
## Step 1
def jira():
    JIRA_SERVER = "https://cbi.abudhabi.nyu.edu/jira"
    key_cert = '/path/to/private/key.pem'
    with open(key_cert, 'r') as key_cert_file:
        key_cert_data = key_cert_file.read()