@saketkc
saketkc / TEST.rb
Created July 14, 2011 06:37
CodeChef(SPOJ) Problem1 Ruby Solution
a = []
while STDIN.readline.chomp != "42"
  a.push($_)
end
a.each { |s| puts s }
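For comparison, the same read-until-sentinel logic can be sketched in Python. This is an illustrative rewrite, not part of the original gist: the function name `lines_before_sentinel` and the sample input are hypothetical. One behavioural difference: the Ruby loop raises `EOFError` if `42` never appears, while this sketch simply stops at end of input.

```python
import io


def lines_before_sentinel(stream, sentinel="42"):
    """Collect lines from `stream` until a line equal to `sentinel` is read.

    The sentinel line itself is not included in the result.
    """
    collected = []
    for line in stream:
        text = line.rstrip("\r\n")  # drop the trailing newline, like Ruby's chomp
        if text == sentinel:
            break
        collected.append(text)
    return collected


# Hypothetical sample input; in a real submission this would be sys.stdin.
sample = io.StringIO("1\n2\n88\n42\n99\n")
result = lines_before_sentinel(sample)
print("\n".join(result))
```

Passing the stream as a parameter keeps the function testable without touching standard input.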
@brantfaircloth
brantfaircloth / get_protein.py
Created April 3, 2011 23:50
Get protein sequences from Genbank given a genomic accession number and a gene name
import sys
import time
from Bio import Entrez
Entrez.email = "your.email@domain.tld"
if not Entrez.email:
    print("you must add your email address")
    sys.exit(2)
# create an empty list we will fill with the gene names
# Inspired by the following sentence that I ran across this morning:
#
# "f_lineno is the current line number of the frame - writing to
# this from within a trace function jumps to the given line
# (only for the bottom-most frame). A debugger can implement a
# Jump command (aka Set Next Statement) by writing to f_lineno."
#
# https://docs.python.org/2/reference/datamodel.html
#
# There is an older implementation of a similar idea:
@twiecki
twiecki / dask_sparse_corr.py
Created August 17, 2018 11:26
Compute large, sparse correlation matrices in parallel using dask.
import dask
import dask.array as da
import dask.dataframe as dd
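import numpy as np
import sparse
@dask.delayed(pure=True)
def corr_on_chunked(chunk1, chunk2, corr_thresh=0.9):
    # Keep only the entries of the chunk product that exceed the threshold, stored sparsely
    return sparse.COO.from_numpy((np.dot(chunk1, chunk2.T) > corr_thresh))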
def chunked_corr_sparse_dask(data, chunksize=5000, corr_thresh=0.9):
@fperez
fperez / README.md
Last active July 1, 2021 04:43
Polyglot Data Science with IPython

Polyglot Data Science with IPython & friends

Author: Fernando Pérez.

A demonstration of how to use Python, Julia, Fortran and R cooperatively to analyze data, in the same process.

This is supported by the IPython kernel and a few extensions that take advantage of IPython's magic system to provide low-level integration between Python and other languages.

See the companion notebook for data preparation and setup.

@brentp
brentp / one-channel-agilent.R
Created August 17, 2011 22:58
use limma to normalize 1-channel agilent data and write out differentially expressed genes.
library(limma)
GROUP="62976"
# targets.txt has columns of "FileName" and "Condition" e.g.
"""
FileName Condition
data/scrubbed/LT001098RU_COPD.45015.txt COPD
data/scrubbed/LT001600RL_ILD.45015.txt ILD
data/scrubbed/LT003990RU_CTRL.45015.txt CTRL
data/scrubbed/LT004173LL_ILD.45015.txt ILD
@dgrtwo
dgrtwo / mnist_pairs.R
Created May 31, 2017 18:56
Comparing pairs of MNIST digits based on one pixel
library(tidyverse)
# Data is downloaded from here:
# https://www.kaggle.com/c/digit-recognizer
kaggle_data <- read_csv("~/Downloads/train.csv")
pixels_gathered <- kaggle_data %>%
  mutate(instance = row_number()) %>%
  gather(pixel, value, -label, -instance) %>%
  extract(pixel, "pixel", "(\\d+)", convert = TRUE)
@lmcinnes
lmcinnes / flow_cytometry.ipynb
Created September 8, 2018 22:19
Flow Cytometry experiments with UMAP
@mblondel
mblondel / kernel_kmeans.py
Last active January 4, 2024 11:45
Kernel K-means.
"""Kernel K-means"""
# Author: Mathieu Blondel <mathieu@mblondel.org>
# License: BSD 3 clause
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin
from sklearn.metrics.pairwise import pairwise_kernels
from sklearn.utils import check_random_state
@fperez
fperez / ProgrammaticNotebook.ipynb
Last active April 5, 2024 12:00
Creating an IPython Notebook programmatically