Joseph Yesselman jyesselm
@nmwsharp
nmwsharp / printarr
Last active October 24, 2023 08:06
Pretty print tables summarizing properties of tensor arrays in numpy, pytorch, jax, etc. --- now on pip: `pip install arrgh`
Now on pip! `pip install arrgh` https://github.com/nmwsharp/arrgh
import re

def eterna_recalculated(row, scale_max=2.3):
    """Helper method that recalculates the Eterna score for an entry from a dataframe and puts the score back into the row. Note that there is not a 1:1 correspondence between the actual and recalculated scores."""
    assert len(row["target_structure"]) == len(row["sequence"])
    # Sometimes there is a fingerprint sequence at the end of the structure; if so, it needs to be removed.
    sequence = re.sub("AAAGAAACAACAACAACAAC$", "", row["sequence"])
    # data_len is the number of data points that will be reviewed
    data_len = min(
        len(row["target_structure"]),
        len(row["SHAPE_data"]),  # can probably get rid of this one
        len(sequence),
@bsweger
bsweger / useful_pandas_snippets.md
Last active April 19, 2024 18:04
Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
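
A minimal sketch of that conversion, assuming pandas is installed. The DataFrame, the column name `price`, and the `dirty` Series are made up for illustration. It uses `pd.to_numeric`, which raises a `ValueError` by default when a value cannot be parsed; passing `errors="coerce"` is an alternative that turns unparseable values into NaN instead.

```python
import pandas as pd

# Hypothetical frame whose numeric column was read in as strings.
df = pd.DataFrame({"price": ["1", "2.5", "3"]})
df["price"] = pd.to_numeric(df["price"])  # dtype becomes float64

# A non-numeric value makes the default call raise ValueError.
dirty = pd.Series(["1", "two", "3"])
try:
    pd.to_numeric(dirty)
    raised = False
except ValueError:
    raised = True

# errors="coerce" replaces unparseable values with NaN instead of raising.
coerced = pd.to_numeric(dirty, errors="coerce")
```

Coercing is convenient for exploratory work, but it can hide bad input, so the default raising behaviour is usually the safer choice in a pipeline.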

@davfre
davfre / bamfilter_oneliners.md
Last active February 24, 2024 01:23
SAM and BAM filtering oneliners
@mwaskom
mwaskom / titanic_seaborn.ipynb
Last active February 8, 2024 13:25
Exploring the Kaggle Titanic dataset with seaborn.
@cbsmith
cbsmith / random_selection.cpp
Last active August 9, 2022 12:29
Hopefully serves as a reference implementation on how to do random selection of an element from a container.
// -*- compile-command: "clang++ -ggdb -o random_selection -std=c++0x -stdlib=libc++ random_selection.cpp" -*-
//Reference implementation for doing random number selection from a container.
//Kept for posterity and because I made a surprising number of subtle mistakes on my first attempt.
#include <random>
#include <iterator>
template <typename RandomGenerator = std::default_random_engine>
struct random_selector
{
//On most platforms, you probably want to use std::random_device("/dev/urandom")()
@0
0 / cols.txt
Created April 30, 2013 01:41
Brief awk tutorial
abc 1 2 3
def 4 5 6
ga 7 9 10
hij 1 5 99
@werediver
werediver / singleton.py
Created December 28, 2012 09:51
A thread safe implementation of singleton pattern in Python. Based on tornado.ioloop.IOLoop.instance() approach.
import threading
# Based on tornado.ioloop.IOLoop.instance() approach.
# See https://github.com/facebook/tornado
class SingletonMixin(object):
    __singleton_lock = threading.Lock()
    __singleton_instance = None

    @classmethod
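
The preview stops before the accessor method. As a separate, hedged sketch of the general idea (lazy creation guarded by a lock with a double check), and not a reconstruction of the gist's code, something like the following could work. The names `SingletonSketch`, `_lock`, `_instance`, and `Config` are hypothetical.

```python
import threading


class SingletonSketch:
    """Hypothetical mixin: one lazily created instance per concrete class."""

    _lock = threading.Lock()  # shared by every subclass
    _instance = None

    @classmethod
    def instance(cls):
        # Inspect only this class's own attributes so that a subclass
        # never returns an instance created for its parent.
        if cls.__dict__.get("_instance") is None:
            with cls._lock:
                # Second check: another thread may have created the
                # instance while this one was waiting for the lock.
                if cls.__dict__.get("_instance") is None:
                    cls._instance = cls()
        return cls.__dict__["_instance"]


class Config(SingletonSketch):
    """Example subclass; call Config.instance() instead of Config()."""
```

The first check keeps the common path lock-free once the instance exists; the second check inside the lock is what prevents two threads from both constructing an object. A plain `Lock` would deadlock if a constructor called `instance()` on the same class hierarchy, so an `RLock` may be preferable in that case.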
@aaronwolen
aaronwolen / sample_fastq.py
Created September 3, 2012 17:38
Random sample of fastq sequences from paired-end files
# Code written by brentp in response to BioStars question:
# http://www.biostars.org/post/show/6544/
import random
import sys
def write_random_records(fqa, fqb, N=100000):
    """Get N random headers from a fastq file without reading the
    whole thing into memory."""
    # Each fastq record spans four lines; integer division keeps the count an int.
    records = sum(1 for _ in open(fqa)) // 4
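
The preview ends after the record count. A minimal sketch of how paired sampling without loading the files into memory might continue is shown below; it is an illustration under stated assumptions, not the code from the gist. The function name `sample_paired_fastq`, its parameters, and the output paths are hypothetical, and it assumes both files list the same reads in the same order with exactly four lines per record.

```python
import random


def sample_paired_fastq(fq_a, fq_b, out_a, out_b, n, seed=None):
    """Write the same n randomly chosen records from two paired fastq files."""
    rng = random.Random(seed)
    # First pass: count records (four lines each) without storing them.
    with open(fq_a) as handle:
        total = sum(1 for _ in handle) // 4
    # Pick record indices once so both mates of a pair are kept together.
    keep = set(rng.sample(range(total), min(n, total)))
    # Second pass: stream both files in lockstep and copy the chosen records.
    with open(fq_a) as in_a, open(fq_b) as in_b, \
            open(out_a, "w") as dst_a, open(out_b, "w") as dst_b:
        for index in range(total):
            rec_a = [in_a.readline() for _ in range(4)]
            rec_b = [in_b.readline() for _ in range(4)]
            if index in keep:
                dst_a.writelines(rec_a)
                dst_b.writelines(rec_b)
    return len(keep)
```

Only the set of chosen indices and one record per file are held in memory at a time, at the cost of reading the first file twice.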