Bede Constantinides bede

@bede
bede / bioinformatics.patch
Created January 10, 2024 14:08
Necessary modifications to oup-authoring-template.tex for Oxford Bioinformatics submission
22,23c22,23
< \documentclass[unnumsec,webpdf,contemporary,large]{oup-authoring-template}%
< %\documentclass[unnumsec,webpdf,contemporary,large,namedate]{oup-authoring-template}% uncomment this line for author year citations and comment the above
---
> % \documentclass[unnumsec,webpdf,contemporary,large]{oup-authoring-template}%
> \documentclass[unnumsec,webpdf,contemporary,large,namedate]{oup-authoring-template}% uncomment this line for author year citations and comment the above
957,958c957,958
< %\bibliographystyle{abbrvnat}
< %\bibliography{reference}
---
@bede
bede / concat_by_barcode.py
Last active October 19, 2023 16:50
Concatenate demultiplexed ONT FASTQs by barcode (for one or more runs)
"""
Purpose: Concatenate demultiplexed FASTQs by barcode for one or more ONT runs
Usage: python concat_by_barcode.py run1/fastq_pass run2/fastq_pass
Author: Bede Constantinides
"""
import subprocess
import sys
from collections import defaultdict
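The preview above stops after the imports, so the gist's actual logic is not shown here. As a rough sketch of how per-barcode concatenation across runs could work, the following assumes each `fastq_pass` directory holds one subdirectory per barcode containing `*.fastq.gz` files. It copies bytes in pure Python instead of calling out through `subprocess`, and the output directory name `concatenated` is hypothetical.

```python
import shutil
import sys
from collections import defaultdict
from pathlib import Path


def group_fastqs_by_barcode(run_dirs):
    """Map barcode directory name -> list of FASTQ paths across all runs."""
    barcode_paths = defaultdict(list)
    for run_dir in run_dirs:
        # Assumed layout: <run_dir>/<barcode>/<reads>.fastq.gz
        for path in sorted(Path(run_dir).glob("*/*.fastq.gz")):
            barcode_paths[path.parent.name].append(path)
    return barcode_paths


def concat_by_barcode(run_dirs, out_dir):
    """Write one <barcode>.fastq.gz per barcode into out_dir; return {barcode: path}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_paths = {}
    for barcode, paths in sorted(group_fastqs_by_barcode(run_dirs).items()):
        out_path = out_dir / f"{barcode}.fastq.gz"
        with open(out_path, "wb") as out_fh:
            for path in paths:
                # Byte-level append: concatenated gzip members form a valid gzip file
                with open(path, "rb") as in_fh:
                    shutil.copyfileobj(in_fh, out_fh)
        out_paths[barcode] = out_path
    return out_paths


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical output directory name, not taken from the gist
    concat_by_barcode(sys.argv[1:], "concatenated")
```

Grouping on the parent directory name means reads from `run1/fastq_pass/barcode01` and `run2/fastq_pass/barcode01` end up in the same output file, in the order the run directories were given.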
@bede
bede / custom_check.py
Last active May 26, 2023 15:49
Pandera MWE – I want a single failure case when region_is_valid fails indicating the sample_name of the row that failed (cDNA-VOC-1-v4-1)
from io import StringIO
import pandas as pd
import pandera as pa
import pandera.extensions as extensions
from pandera.typing import Index, Series
csv_string = """
sample_name,country,region
cDNA-VOC-1-v4-1,USA,Bretagne
@bede
bede / split_summary_by_barcode.py
Created February 11, 2021 10:25
Split Guppy sequencing summaries by barcode
def split_summary_by_barcode(summary_path, out_dir, run_name):
    '''Given a sequencing summary file path, write per barcode summaries to an output directory'''
    dtypes = {
        'filename_fastq': 'object',
        'filename_fast5': 'object',
        'read_id': 'object',
        'run_id': 'category',
        'channel': 'int64',
        'mux': 'int64',
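The preview is cut off partway through the dtype mapping, so the rest of the function is not visible. One way the splitting step might look with pandas is sketched below. It keeps the signature from the preview but makes several assumptions: the summary is tab-separated, the barcode lives in a column named `barcode_arrangement` (exposed here as the hypothetical `barcode_col` parameter), and the output file naming is invented for illustration.

```python
from pathlib import Path

import pandas as pd


def split_summary_by_barcode(summary_path, out_dir, run_name,
                             barcode_col="barcode_arrangement"):
    """Write one tab-separated summary per barcode; return {barcode: output path}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Only two of the dtypes from the preview are repeated here
    df = pd.read_csv(summary_path, sep="\t",
                     dtype={"read_id": "object", "run_id": "category"})
    out_paths = {}
    for barcode, barcode_df in df.groupby(barcode_col):
        # Hypothetical naming scheme: <run_name>_<barcode>.tsv
        out_path = out_dir / f"{run_name}_{barcode}.tsv"
        barcode_df.to_csv(out_path, sep="\t", index=False)
        out_paths[barcode] = out_path
    return out_paths
```

Declaring dtypes up front, as the original preview does, avoids type inference over a large file and keeps repeated values such as `run_id` compact as categoricals.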
import pandas as pd
from bokeh.models.widgets import Select
from bokeh.layouts import widgetbox
from bokeh.models import ColumnDataSource, DataTable, TableColumn, CustomJS
from bokeh.io import show, output_file, output_notebook, reset_output
from bokeh.layouts import row, column, layout
raw_data = {'ORG': ['APPLE', 'ORANGE', 'MELON'],
'APPROVED': [5, 10, 15],
@bede
bede / cluster_df.py
Last active May 8, 2019 12:53
Distance matrix clustering
import pandas as pd
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import fcluster, linkage
def cluster_df(df, method='single', threshold=100):
    '''
    Accepts a square distance matrix as an indexed DataFrame and returns a dict of index keyed flat clusters
    Performs single linkage clustering by default, see scipy.cluster.hierarchy.linkage docs for others
    '''
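The function body is not part of the preview. A minimal sketch consistent with the docstring and the three imports shown might look like the following. It assumes the returned dict maps each index label to an integer cluster id, and that `threshold` is a cophenetic distance cutoff (`criterion="distance"`); neither detail is confirmed by the source.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_df(df, method="single", threshold=100):
    """Return {index label: flat cluster id} for a square, symmetric distance matrix."""
    # squareform converts the square matrix to the condensed vector linkage expects;
    # it raises if the matrix is not symmetric with a zero diagonal
    condensed = squareform(df.to_numpy(dtype=float))
    tree = linkage(condensed, method=method)
    # Assumed criterion: cut the tree where cophenetic distance exceeds threshold
    cluster_ids = fcluster(tree, t=threshold, criterion="distance")
    return {label: int(cluster_id) for label, cluster_id in zip(df.index, cluster_ids)}


distances = pd.DataFrame(
    [[0, 1, 10], [1, 0, 10], [10, 10, 0]],
    index=["a", "b", "c"],
    columns=["a", "b", "c"],
)
clusters = cluster_df(distances, threshold=5)
```

With a threshold of 5, `a` and `b` (distance 1) share a cluster while `c` (distance 10 from both) is placed in its own.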
@bede
bede / Dockerfile
Created July 27, 2017 21:27
IVA Ubuntu Dockerfile
FROM ubuntu:16.04
RUN apt-get update && \
    apt-get --yes install \
    kmc smalt python3-pip zlib1g-dev libncurses5-dev libncursesw5-dev mummer samtools
RUN pip3 install iva
ENTRYPOINT ["iva"]
@bede
bede / simple_python3_parallelism.ipynb
Last active January 23, 2017 13:17
Simple Python3 parallelism
@bede
bede / lists_generators_and_laziness.ipynb
Last active August 17, 2016 13:11
Generators vs. lists for sequence filtering
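The notebook itself is not shown on this page. As a generic illustration of the contrast named in the title, and not a reproduction of the notebook, the sketch below filters sequences by length once eagerly with a list and once lazily with a generator. The function names and example sequences are made up.

```python
def filter_list(sequences, min_length):
    """Eagerly build the full list of sequences at least min_length long."""
    return [seq for seq in sequences if len(seq) >= min_length]


def filter_lazy(sequences, min_length):
    """Return a generator that yields matching sequences one at a time."""
    return (seq for seq in sequences if len(seq) >= min_length)


sequences = ["ACGT", "AC", "ACGTACGT", "A"]
as_list = filter_list(sequences, 4)        # all results held in memory at once
as_generator = filter_lazy(sequences, 4)   # nothing is computed until iterated
```

The list can be indexed and iterated repeatedly, while the generator holds only one item at a time and is exhausted after a single pass, which matters when filtering files too large to fit in memory.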
@bede
bede / gist:fb8683029c6c10d4ad99a015c9fc8b7b
Created May 20, 2016 11:09
GKNO build failure - cat build_freebayes.*
$ cat build_freebayes.*
bgzf.c:44:1: warning: unused function 'kh_clear_cache' [-Wunused-function]
KHASH_MAP_INIT_INT64(cache, cache_t)
^
./khash.h:468:2: note: expanded from macro 'KHASH_MAP_INIT_INT64'
KHASH_INIT(name, uint64_t, khval_t, 1, kh_int64_hash_func, kh_int64_hash_equal)
^
./khash.h:140:21: note: expanded from macro 'KHASH_INIT'
static inline void kh_clear_##name(kh_##name##_t *h) \
^