Evan Rees evanroyrees

evanroyrees / taxonomy_alluvial_plot.py
Last active March 17, 2022 16:39
Alluvial plot generation of Autometa taxonomy information
#!/usr/bin/env python
"""
# Alluvial plot generation of taxonomy information
## Setup Env
First create env to run script
```bash
conda create -n plotly -c plotly -c conda-forge plotly python-kaleido pandas tqdm -y
evanroyrees / autometa.sh
Created October 25, 2021 16:39
Template slurm submission script to run autometa pipeline
#!/usr/bin/env bash
#SBATCH -p partition
#SBATCH -t 48:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
# NOTE: To create the conda environment for Autometa, you can run the Makefile command:
# make create_environment
evanroyrees / autometa-large-data-mode.sh
Last active November 12, 2021 00:10
Template slurm submission script to run autometa-large-data-mode pipeline
#!/usr/bin/env bash
#SBATCH -p partition
#SBATCH -t 48:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
## First create the environment to run Autometa (and optionally GTDB-Tk and CheckM)
# git clone git@github.com:KwanLab/Autometa
# cd Autometa
# make create_environment
evanroyrees / table_munge_module_template.py
Created July 24, 2021 17:25
Module template for munging then writing out an input table
#!/usr/bin/env python
import argparse
import os
import pandas as pd
def do_stuff(df: pd.DataFrame) -> pd.DataFrame:
    return df
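The preview above shows only the imports and the `do_stuff` stub. A fuller version of such a template might look like the following sketch. The `--input` and `--output` argument names, the tab separator, and the `main` function are assumptions made for illustration and are not taken from the gist.

```python
import argparse
import os

import pandas as pd


def do_stuff(df: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the actual munging logic; returns the table unchanged.
    return df


def main(argv=None) -> None:
    # Hypothetical command-line interface: argument names are illustrative only.
    parser = argparse.ArgumentParser(
        description="Munge an input table, then write it out."
    )
    parser.add_argument("--input", required=True, help="Path to input table (TSV)")
    parser.add_argument("--output", required=True, help="Path to write munged table (TSV)")
    args = parser.parse_args(argv)

    # Assumes a tab-delimited table with a header row.
    df = pd.read_csv(args.input, sep="\t")
    out = do_stuff(df)

    # Create the output directory if it does not exist yet.
    outdir = os.path.dirname(os.path.abspath(args.output))
    os.makedirs(outdir, exist_ok=True)
    out.to_csv(args.output, sep="\t", index=False)
```

Calling `main()` with no arguments reads `sys.argv`, so adding the usual `if __name__ == "__main__": main()` guard would turn the sketch into a command-line script.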
evanroyrees / walkthrough_notes.md
Created July 7, 2021 20:58
Autometa dev walkthrough

Autometa pipeline walkthrough using Nextflow

NOTE: These instructions are for working off of the KwanLab/dev branch

Overview

  1. Install the Autometa environment and commands
  2. Configure Nextflow so Autometa commands can be run through your scheduler
  3. Configure run parameters (set the metagenome filepath and output directories)
  4. Run the Autometa pipeline using Nextflow
evanroyrees / get_cluster_markers.py
Created February 10, 2021 22:35
Retrieve ORFs corresponding to markers for each cluster for Autometa v1.0 outputs
#!/usr/bin/env python
import argparse
import os
import glob
from Bio import SeqIO
import pandas as pd
evanroyrees / usage_script.py
Created April 6, 2020 16:36
Example temporary script calling only argparse for usage information
import argparse
import logging as logger
import multiprocessing as mp
logger.basicConfig(
    format='%(asctime)s : %(name)s : %(levelname)s : %(message)s',
    datefmt='%m/%d/%Y %I:%M:%S %p',
    level=logger.DEBUG)
# This is pulled directly from autometa/common/coverage.py
parser = argparse.ArgumentParser(description='Construct contig coverage table given an input assembly and reads.')
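The preview stops at the parser construction. As a rough sketch of the idea (a script that defines only an `argparse` parser so its usage text can be inspected), something like the following could work. The `prog` name and the `--assembly` and `--cpus` arguments are hypothetical placeholders, not the actual arguments of `autometa/common/coverage.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="coverage",  # hypothetical program name
        description="Construct contig coverage table given an input assembly and reads.",
    )
    # Hypothetical arguments, included only to populate the usage text.
    parser.add_argument("--assembly", required=True, help="Path to metagenome assembly")
    parser.add_argument("--cpus", type=int, default=1, help="Number of CPUs to use")
    return parser


if __name__ == "__main__":
    # Print the help text without parsing any arguments or running a pipeline.
    build_parser().print_help()
```

Because nothing other than the parser is defined, the script can be shared to review the command-line interface before any of the underlying functionality exists.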
evanroyrees / merge_nodes.py
Created February 20, 2020 00:41
Convert old taxids to new taxids given an autometa LCA output table and NCBI taxdump's merged.dmp file
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Merge nodes from the Autometa LCA output table before running `add_contig_taxonomy.py`
"""
import logging
import os
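The preview ends before the conversion logic. A minimal sketch of the general technique, assuming the NCBI taxdump layout where each `merged.dmp` row holds an old taxid and its replacement separated by `\t|\t`, might look like this. The function names `parse_merged_dmp` and `update_taxids` are hypothetical and are not taken from the gist, and the sample rows are illustrative only.

```python
import io
from typing import Dict, Iterable, List, TextIO


def parse_merged_dmp(handle: TextIO) -> Dict[int, int]:
    """Map each old taxid to the taxid it was merged into."""
    merged: Dict[int, int] = {}
    for line in handle:
        # Rows look like "old_taxid\t|\tnew_taxid\t|"; drop the trailing
        # delimiter, split on "|", and strip the surrounding tabs.
        fields = [field.strip() for field in line.strip().rstrip("|").split("|")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed rows
        merged[int(fields[0])] = int(fields[1])
    return merged


def update_taxids(taxids: Iterable[int], merged: Dict[int, int]) -> List[int]:
    """Replace taxids found in the merged mapping; leave all others unchanged."""
    return [merged.get(taxid, taxid) for taxid in taxids]


# Illustrative rows in merged.dmp layout, held in memory instead of a file.
example = io.StringIO("12\t|\t74109\t|\n30\t|\t29\t|\n")
mapping = parse_merged_dmp(example)
updated = update_taxids([12, 562, 30], mapping)
```

With a pandas LCA table, the same mapping could be applied to the taxid column via `Series.map` with a fallback to the original value, but the column name would depend on the table and is not shown in the preview.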