Kevin klprint
# Analyse data using a sliding window
slideFunct <- function(data, window, step){
  total <- length(data)
  # Start positions of all windows that fit completely into the data
  spots <- seq(from = 1, to = (total - window + 1), by = step)
  result <- numeric(length(spots))
  for(i in seq_along(spots)){
    # Each window covers exactly `window` values
    result[i] <- median(data[spots[i]:(spots[i] + window - 1)])
  }
  return(result)
}
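For comparison, the same sliding-window idea can be sketched in Python. This is a minimal sketch and not part of the original gist; the name `slide_median` is hypothetical, and it assumes each window should hold exactly `window` values and that only windows fitting completely inside the data are reported.

```python
from statistics import median


def slide_median(data, window, step):
    """Median of each full window of `window` values, moving `step` at a time."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    # Start indices (0-based) of every window that fits inside the data
    starts = range(0, len(data) - window + 1, step)
    return [median(data[s:s + window]) for s in starts]
```

If the data is shorter than the window, the sketch returns an empty list rather than raising an error.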
@klprint
klprint / gist:5b10ff18c91f26d890631a8780faffe1
Created August 1, 2016 07:12 — forked from fajrif/gist:1265203
git clone specific tag
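git clone <repo-address>
cd <repo-directory>
# List the available tags
git tag -l
# Check out the tag (this leaves HEAD detached)
git checkout <tag-name>
# Recreate master so that it points at the tag
git branch -D master
git checkout -b master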
@klprint
klprint / ChangeChmod.txt
Created August 1, 2016 08:18
Change chmod for folder & all subfolders
# Sets chmod 755 for folder and all subfolders
find /opt/lampp/htdocs -type d -exec chmod 755 {} \;
# Sets chmod 644 for all files in this folder & subfolders
find /opt/lampp/htdocs -type f -exec chmod 644 {} \;
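The two `find` commands can also be approximated from Python. The following is a minimal sketch, assuming the defaults 755 for folders and 644 for files; the function name `set_permissions` is hypothetical, and symbolic links are skipped to roughly mirror what `-type d` and `-type f` select.

```python
import os


def set_permissions(root, dir_mode=0o755, file_mode=0o644):
    """Apply dir_mode to root and all subfolders, file_mode to all files."""
    os.chmod(root, dir_mode)
    for current, dirs, files in os.walk(root):
        for name in dirs:
            path = os.path.join(current, name)
            if not os.path.islink(path):  # leave symlinks untouched
                os.chmod(path, dir_mode)
        for name in files:
            path = os.path.join(current, name)
            if not os.path.islink(path):
                os.chmod(path, file_mode)
```

A call such as `set_permissions("/opt/lampp/htdocs")` would cover the same tree as the commands above.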
@klprint
klprint / FastGrep.py
Created August 9, 2016 10:39
Searches in files with the same extension for sequences being provided by a fasta file. The script picks 15-16 base-pairs in the middle of each sequence in the fasta file and uses grep to find the number of occurrences.
import subprocess
import glob

# Functions
def read_multifasta(file_path):
    is_entry = False
    fasta_dict = {}
    sequence = []
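The preview above is cut off, so the approach in the description can only be sketched. The following is a minimal illustration, not the gist's actual code: all three function names are hypothetical, the probe length of 15 is an assumption taken from the description, and `str.count` stands in for the `grep` call the script uses (so it counts non-overlapping occurrences rather than matching lines).

```python
def parse_fasta_lines(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequences may span several lines; join them
            records[header] += line
    return records


def middle_probe(sequence, length=15):
    """Return `length` bases centred on the middle of the sequence."""
    if len(sequence) <= length:
        return sequence
    start = (len(sequence) - length) // 2
    return sequence[start:start + length]


def count_hits(probe, text):
    """Count non-overlapping occurrences of the probe in the text."""
    return text.count(probe)
```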
@klprint
klprint / check_file.py
Created September 30, 2016 08:59
This Python function checks whether a file exists. If it does not, the function waits for a user-specified time and checks again. As soon as the file appears, a reaction can be specified. While the function runs, it writes a log to inform the user about the current status.
def check_output(file, interval_time, logfile_path):
    import os
    import time
    status = os.path.isfile(file)
    while status is not True:
        f_log = open(logfile_path, 'a')
        f_log.write(time.asctime() + '\t' + 'Still running\n')
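Since the preview stops inside the loop, here is a minimal, self-contained sketch of the polling pattern the description outlines. It is not the gist's code: the name `wait_for_file`, the `max_checks` safety limit, and the "File found" log line are assumptions added for illustration.

```python
import os
import time


def wait_for_file(path, interval_seconds, logfile_path, max_checks=None):
    """Poll until `path` exists, appending one log line per check.

    Returns True once the file exists, False if max_checks ran out first.
    """
    checks = 0
    while not os.path.isfile(path):
        if max_checks is not None and checks >= max_checks:
            return False
        # Open and close the log on every pass so it can be read while waiting
        with open(logfile_path, "a") as log:
            log.write(time.asctime() + "\tStill running\n")
        time.sleep(interval_seconds)
        checks += 1
    with open(logfile_path, "a") as log:
        log.write(time.asctime() + "\tFile found\n")
    return True
```

The reaction to the file appearing can then be placed after a call that returns `True`.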
@klprint
klprint / TMHMM_parser.py
Last active December 14, 2016 12:20
This script parses TMHMM short output to a full-length topology output.
#################################################################
######################## TMHMM Parser ###########################
#################################################################
# Created by: Kevin Leiss
# Last Updated: 14.12.2016
#
# License: Feel free to use the script, but please refer to me if
# you used it for publication.
#
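# The core idea, expanding a short topology string into one label per
# residue, might look like the sketch below. It is an illustration only and
# not the script's actual code: the name `expand_topology` is hypothetical,
# and it assumes the short output supplies a topology such as "o4-6i"
# together with the sequence length.

```python
import re


def expand_topology(topology, length):
    """Expand e.g. 'o4-6i' into a per-residue string of i, o and M."""
    # Split into side labels (i/o) and 'start-end' helix ranges
    tokens = re.findall(r"[io]|\d+-\d+", topology)
    result = []
    side = None
    for token in tokens:
        if token in ("i", "o"):
            side = token
        else:
            start, end = (int(x) for x in token.split("-"))
            # Residues before this helix lie on the side seen last
            result.extend(side * (start - 1 - len(result)))
            result.extend("M" * (end - start + 1))
    # Remaining residues up to the sequence length lie on the final side
    result.extend(side * (length - len(result)))
    return "".join(result)
```

# Here "M" marks membrane-helix residues; the letter is a choice made for
# this sketch.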
@klprint
klprint / parse_10x_output.sh
Created June 4, 2018 11:53
Parse a 10x chromium sparse matrix output into a single file, inserting the ENSEMBL gene ID and the cell barcode
#!/bin/bash
# The following script parses the 10x chromium sparse matrix.
# It replaces the first column with the ENSEMBL gene ID and the second,
# if needed, with the cell barcode (just uncomment the second awk script).
# It needs the three 10x chromium outputs as follows:
# 1. genes.tsv
# 2. matrix.mtx
# 3. barcodes.tsv
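# The same relabelling can be sketched in Python. This is a minimal
# illustration rather than the shell script's logic: the function name
# `label_matrix_entries` is hypothetical, and it assumes matrix.mtx is a
# MatrixMarket coordinate file whose rows index genes.tsv and whose columns
# index barcodes.tsv.

```python
def label_matrix_entries(mtx_lines, genes, barcodes):
    """Yield (gene_id, barcode, value) for each entry of the sparse matrix.

    genes and barcodes are lists in the same order as their .tsv files.
    """
    size_line_seen = False
    for line in mtx_lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not size_line_seen:
            size_line_seen = True  # first data line holds rows, columns, entries
            continue
        row, col, value = line.split()
        # MatrixMarket indices are 1-based
        yield genes[int(row) - 1], barcodes[int(col) - 1], value
```

# For genes.tsv, the gene ID would be the first tab-separated field of
# each line (an assumption about the file layout).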
library(rlc)
library(matrixStats)
library(biomaRt)
library(Matrix)
library(umap)
#sobj <- readRDS("make_analysis_out/SN010_E115/SN010_E115_normalized.rds")
makeENSMEBLlink <- function(geneID){
sprintf(
"<a href='http://www.ensembl.org/Mus_musculus/Gene/Summary?db=core;g=%s' target='_blank'>%s</a>",
@klprint
klprint / single_cell_functions.R
Last active January 30, 2020 12:45
Widely used single cell functions in R
# Taken from Simon
# compute variances across column or rows for column-sparse matrices
library(Matrix)
colVars_spm <- function( spm ) {
  stopifnot( is( spm, "dgCMatrix" ) )
  ans <- sapply( seq.int(spm@Dim[2]), function(j) {
    # Non-zero entries of column j; empty if the column holds only zeros
    x <- spm@x[ spm@p[j] + seq_len( spm@p[j+1] - spm@p[j] ) ]
    mean <- sum( x ) / spm@Dim[1]
    # Squared deviations of the non-zeros plus those of the implicit zeros
    sum( ( x - mean )^2 ) +
      mean^2 * ( spm@Dim[1] - length( x ) ) } ) / ( spm@Dim[1] - 1 )
  names(ans) <- spm@Dimnames[[2]]
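The column-variance calculation can be illustrated outside R as well. The sketch below uses plain Python lists in place of a `dgCMatrix`; the name `col_vars_csc` and its arguments are hypothetical, with `values` playing the role of `spm@x`, `indptr` of `spm@p`, and `n_rows` of `spm@Dim[1]`.

```python
def col_vars_csc(values, indptr, n_rows):
    """Sample variance of each column of a column-compressed sparse matrix.

    values: stored non-zero entries, column after column
    indptr: 0-based offsets; column j owns values[indptr[j]:indptr[j + 1]]
    n_rows: number of rows, counting the implicit zeros
    """
    variances = []
    for j in range(len(indptr) - 1):
        column = values[indptr[j]:indptr[j + 1]]
        mean = sum(column) / n_rows
        squared = sum((x - mean) ** 2 for x in column)
        # Every implicit zero contributes (0 - mean)^2
        squared += mean ** 2 * (n_rows - len(column))
        variances.append(squared / (n_rows - 1))
    return variances
```

Only the stored entries are visited, so the cost scales with the number of non-zeros rather than with the full matrix size.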