Thanh Lee thanhleviet

@thanhleviet
thanhleviet / merge_biom.py
Last active September 26, 2023 15:47
Python script for merging multiple BIOM files and writing the merged table out in BIOM 1.0 format
#!/usr/bin/env python
import argparse
import sys
import os
import pandas as pd
from biom import load_table, Table
def merge_biom_files_with_pandas(biom_files):
    dfs = []  # List to store DataFrames
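The preview above is cut off before the merge itself. As a rough, library-free sketch of the idea (not the gist's pandas/biom code), the merge can be pictured as taking the union of samples and observations and zero-filling the gaps. The function name `merge_count_tables`, the nested-dict layout, and the choice to sum counts for repeated sample/observation pairs are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_count_tables(tables):
    """Merge {sample_id: {observation_id: count}} tables into one table.

    Assumption: counts for the same sample/observation pair are summed.
    Pairs missing from every input are filled with 0.0.
    """
    merged = defaultdict(lambda: defaultdict(float))
    observations = set()
    for table in tables:
        for sample_id, counts in table.items():
            for obs_id, count in counts.items():
                merged[sample_id][obs_id] += count
                observations.add(obs_id)
    # Zero-fill so that every sample carries every observation.
    return {
        sample_id: {
            obs_id: merged[sample_id].get(obs_id, 0.0)
            for obs_id in sorted(observations)
        }
        for sample_id in sorted(merged)
    }
```

With pandas, the same effect would normally come from concatenating per-file DataFrames and filling missing values with zero, which is presumably what the `dfs` list in the preview is collected for.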
@thanhleviet
thanhleviet / job_submit.lua
Created June 21, 2023 17:40 — forked from treydock/job_submit.lua
SLURM job_submit lua script
--[[
SLURM job submit filter for QOS
Some code and ideas pulled from https://github.com/edf-hpc/slurm-llnl-misc-plugins/blob/master/job_submit.lua
--]]
--########################################################################--
--
@thanhleviet
thanhleviet / prepare_sample_sheet.py
Last active May 3, 2023 20:52
Simple Python script that scans a directory for paired-end (PE) files based on a list of patterns and writes a CSV file with three columns: sample_id, forward, reverse
import os
import csv
import argparse
def scan_paired_end_files(dir_path, pattern_list):
    # Initialize a list to store the sample IDs and file paths
    samples = []
    # Loop through all files in the directory
    for file_name in os.listdir(dir_path):
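Since the preview stops at the loop header, here is a minimal sketch of how the scan and the three-column CSV could fit together. It keeps the `scan_paired_end_files(dir_path, pattern_list)` signature from the preview, but the pattern format (a list of forward/reverse suffix tuples) and the helper `write_sample_sheet` are assumptions, not taken from the gist.

```python
import csv
import os


def scan_paired_end_files(dir_path, pattern_list):
    """Return sorted (sample_id, forward, reverse) rows for complete pairs.

    Assumption: each pattern is a (forward_suffix, reverse_suffix) tuple,
    for example ("_R1.fastq.gz", "_R2.fastq.gz").
    """
    pairs = {}
    for file_name in sorted(os.listdir(dir_path)):
        for fwd_suffix, rev_suffix in pattern_list:
            if file_name.endswith(fwd_suffix):
                sample_id, slot = file_name[: -len(fwd_suffix)], "forward"
            elif file_name.endswith(rev_suffix):
                sample_id, slot = file_name[: -len(rev_suffix)], "reverse"
            else:
                continue
            pairs.setdefault(sample_id, {})[slot] = os.path.join(dir_path, file_name)
            break  # the first matching pattern wins
    # Keep only samples for which both mates were found.
    return [
        (sample_id, files["forward"], files["reverse"])
        for sample_id, files in sorted(pairs.items())
        if "forward" in files and "reverse" in files
    ]


def write_sample_sheet(rows, csv_path):
    """Write rows to a CSV with the header sample_id, forward, reverse."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample_id", "forward", "reverse"])
        writer.writerows(rows)
```

Samples with only one mate are dropped silently in this sketch; a real script might prefer to warn about them.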
@thanhleviet
thanhleviet / romano.r
Created March 13, 2023 14:34
test patchwork
library(patchwork)
library(ggplot2)
library(tidyverse)
data(mtcars)
data(iris)
mtcars.tf <- mtcars %>%
  rownames_to_column("name") %>%
  pivot_longer(names_to = "metrics", values_to = "values", mpg:carb)
@thanhleviet
thanhleviet / singularity-ce.yml
Created July 1, 2022 13:00
Ansible playbook to install the Singularity deb package
- hosts: all
  become: true
  become_user: root
  vars:
    singularity_version: 3.10.0
  pre_tasks:
    - name: Update apt packages
      apt:
        update_cache: true
        cache_valid_time: 86400 #One day
@thanhleviet
thanhleviet / download_pubmlst_contigs.py
Created March 14, 2022 13:56
Download (Campylobacter) genomes from pubmlst.org in parallel
#!/usr/bin/env python
import requests
import time
import pathlib
from joblib import Parallel, delayed
from pqdm.processes import pqdm
def url_template(id):
    url0 = "https://pubmlst.org/bigsdb?db=pubmlst_campylobacter_isolates&page=plugin&name=Contigs&format=text&isolate_id="
    url1 = "&match=1&pc_untagged=0&min_length=&header=1"
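The preview ends before the two URL halves are joined and before any download happens. The sketch below assumes the isolate id is simply placed between `url0` and `url1`, and it swaps the gist's `joblib`/`pqdm` imports for the standard library's `ThreadPoolExecutor` so that it runs without extra packages. `download_all` and its `fetch` parameter are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# The two halves of the URL as they appear in the preview.
URL_PREFIX = (
    "https://pubmlst.org/bigsdb?db=pubmlst_campylobacter_isolates"
    "&page=plugin&name=Contigs&format=text&isolate_id="
)
URL_SUFFIX = "&match=1&pc_untagged=0&min_length=&header=1"


def url_template(isolate_id):
    """Build the contigs URL (assumes the id sits between the two halves)."""
    return f"{URL_PREFIX}{isolate_id}{URL_SUFFIX}"


def download_all(isolate_ids, fetch, max_workers=4):
    """Call fetch(url) for every isolate concurrently.

    Returns a dict mapping each isolate id to whatever fetch returned.
    pool.map preserves input order, so ids and results line up.
    """
    isolate_ids = list(isolate_ids)
    urls = [url_template(isolate_id) for isolate_id in isolate_ids]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(isolate_ids, pool.map(fetch, urls)))
```

Passing `fetch` in as an argument keeps the network call out of the sketch; in practice it could be something like `lambda url: requests.get(url, timeout=60).text`, ideally with a small worker count so the server is not hammered.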
@thanhleviet
thanhleviet / ggtree_plot.r
Created December 8, 2021 15:49
Colour tips by sample
library(dplyr)
library(ggtree)
tree <- read.tree("Benchmarking_tree_07Dec21.nwk")
tips <- tree$tip.label %>%
  tibble(tip = .) %>%
  mutate(sample = gsub("_[a-zA-Z\\_]+","",tip))
@thanhleviet
thanhleviet / print_sample_sheet.nf
Created October 6, 2021 12:24
Parse and print records in a csv with nextflow
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
ch_pilon = Channel.fromPath(params.sample_sheet)
    .splitCsv(header: true)
    .map {row -> tuple(row.sample_id,[row.sr1,row.sr2],row.contigs)}
ch_pilon.view()
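For readers less familiar with Nextflow channels, the same parse-and-print step can be sketched in plain Python. The column names `sample_id`, `sr1`, `sr2` and `contigs` come from the snippet above; the function name `read_sample_sheet` and the inline sample row are made up for illustration.

```python
import csv
import io


def read_sample_sheet(handle):
    """Yield (sample_id, [sr1, sr2], contigs) tuples from a CSV with a header row."""
    for row in csv.DictReader(handle):
        yield (row["sample_id"], [row["sr1"], row["sr2"]], row["contigs"])


# Inline sample sheet (invented data) standing in for params.sample_sheet.
sheet = io.StringIO(
    "sample_id,sr1,sr2,contigs\n"
    "s1,s1_R1.fastq.gz,s1_R2.fastq.gz,s1.fasta\n"
)
records = list(read_sample_sheet(sheet))
for record in records:
    print(record)
```

`csv.DictReader` plays the role of `splitCsv(header: true)`, and the generator body mirrors the `map` closure that reshapes each row into a tuple.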
@thanhleviet
thanhleviet / eben.r
Created July 22, 2021 17:28
R script to manipulate data
library(tidyverse)
library(data.table)
library(janitor)
csv <- fread("Ebenn_code_data_21Jul21_18.08.csv")
features <- names(csv)[-c(1,2,4)]
sample_names <- csv$Name %>%
  gsub("_flye_[a-z\\_]*|_hybrid","",.) %>%