Justine Debelius jwdebelius

import os

import biom
import numpy as np
import pandas as pd


def _parse_sample_lines(line):
    """
    Extracts the sample name from the kraken identifier
    """
jwdebelius / clean_silva_taxonomy.py
Created September 16, 2019 09:35
A dirty script for tidying a silva taxonomy string into a Series
import pandas as pd


def tidy_taxon_silva(x):
    """
    A very ugly script for cleaning taxonomy.
    The script will take the string and parse it into seven taxonomic levels
    if they are available. If lower levels are unavailable (i.e. they could
    not be classified accurately), then they will inherit a designation
    from the last classified level. Then, ambiguous or uncultured organisms