Many Python scripts with parameters, say M and N, look like this:
# parameters --
M = 10
N = 10
...
myfunc( M, N ... )
""" heapq_del.py: heapq.py + heapchange heapdel
see http://docs.python.org/2/library/heapq.html
c heap* but python _siftup _siftdown, slow
"""
# cf. class Aheapq: heap --> A[ Arecs with .val .heappos ... ]
from heapq import heappush, heappop, heapify, heapreplace, _siftup, _siftdown
# http://docs.python.org/2/library/heapq.html
# https://github.com/python/cpython/blob/master/Modules/_heapqmodule.c
# (note _siftup --> leaves, _siftdown --> root
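A minimal sketch of what deleting or changing an item at a known heap position could look like, using the private `_siftup` / `_siftdown` helpers imported above. The names `heapdel_sketch` and `heapchange_sketch` are hypothetical, not the gist's own functions, and the private helpers are an implementation detail of CPython's `heapq.py` that may change.

```python
from heapq import heapify, _siftup, _siftdown  # private helpers of the pure-Python heapq.py


def heapdel_sketch(heap, pos):
    """Remove and return heap[pos], keeping the min-heap invariant (hypothetical helper)."""
    removed = heap[pos]
    last = heap.pop()              # take the last leaf off the heap
    if pos < len(heap):            # pos was not that last slot
        heap[pos] = last           # fill the hole with the last leaf
        _siftup(heap, pos)         # let it sink toward the leaves if it is too big
        _siftdown(heap, 0, pos)    # let it rise toward the root if it is too small
    return removed


def heapchange_sketch(heap, pos, newval):
    """Overwrite heap[pos] with newval and restore the invariant (hypothetical helper)."""
    heap[pos] = newval
    _siftup(heap, pos)
    _siftdown(heap, 0, pos)
```

Both helpers need the caller to know `pos`, which is why a record with a `.heappos` field, as in the comment above, is one way to track it.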
""" trim outliers: smallest n values = next smallest, biggest n = next biggest | |
X, smallest, biggest = winsorize( X, n )
-> X trimmed *inplace*
smallest: n+1 smallest
biggest: n+1 biggest
To winsorize the top 5 % and bottom 5 %,
winsorize( X, X.size * .05 )
X may be 2d, 3d ... (but winsorizing each dimension separately would make more sense)
See also http://en.wikipedia.org/wiki/Winsorising
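A rough sketch of the behaviour described above, assuming a NumPy array and `2*n + 1 <= X.size`. The name `winsorize_sketch` and the sort-then-clip approach are assumptions for illustration, not necessarily how the gist does it.

```python
import numpy as np


def winsorize_sketch(X, n):
    """Clip the n smallest / n biggest values of X in place (hypothetical helper).

    Returns X, the n+1 smallest values, and the n+1 biggest values.
    """
    n = int(n)                        # allows e.g. X.size * .05
    srt = np.sort(X, axis=None)       # sorted copy of all values, flattened
    smallest = srt[:n + 1]
    biggest = srt[-(n + 1):]
    lo, hi = smallest[-1], biggest[0] # the "next smallest" and "next biggest"
    np.clip(X, lo, hi, out=X)         # overwrite X in place
    return X, smallest, biggest
```

For example, `winsorize_sketch(np.array([10., 1, 5, 3, 100, 4, 2]), 1)` replaces 1 with 2 and 100 with 10, leaving the other values alone.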
#!/bin/sh
# less-or-grep.sh 2014-05-25 may denis
case $1 in -h* | --h* ) # help
exec cat <<!
To make files "live", e.g. ~/bin/todo ~/bin/books ...
add this line at the top:
source less-or-grep.sh # file -> less file, file 'pattern' -> egrep
and
#!/usr/bin/env python
""" conjugate gradient cg( A, x, Res )
in: A: an array, numpy dense or scipy.sparse
or sparse LinearOperator with .matvec .rmatvec
x: initial guess, 0 or ...
Res = Y - A x before calling cg()
iter: x += step
Res -= S *inplace*
from Jon F. Claerbout's nice, clear
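For orientation, here is a sketch of the textbook CGLS iteration (conjugate gradient for least squares), which has the same shape as the steps listed above: `x` is updated by a step and the residual is updated in place. It is not the gist's `cg()` nor Claerbout's code; the name `cgls_sketch`, its arguments, and the stopping tolerance are assumptions, and it expects a dense or sparse matrix supporting `@` and `.T` rather than a LinearOperator.

```python
import numpy as np


def cgls_sketch(A, y, niter=50, x0=None, rtol=1e-20):
    """Minimise |y - A x|^2 by CGLS; returns x and the residual y - A x."""
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    res = y - A @ x                   # Res = Y - A x
    s = A.T @ res                     # gradient direction in x-space
    p = s.copy()                      # first search direction
    gamma = gamma0 = s @ s
    if gamma0 == 0:                   # already at a least-squares solution
        return x, res
    for _ in range(niter):
        q = A @ p                     # search direction mapped to data space
        alpha = gamma / (q @ q)       # exact line search along p
        x += alpha * p                # x += step
        res -= alpha * q              # residual updated in place
        s = A.T @ res
        gamma_new = s @ s
        if gamma_new <= rtol * gamma0:
            break
        p = s + (gamma_new / gamma) * p   # next conjugate direction
        gamma = gamma_new
    return x, res
```

On a small dense problem the result can be checked against `np.linalg.lstsq(A, y, rcond=None)[0]`.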
Interpolating e.g. wind or water temperature on a box grid in d dimensions usually looks at (order+1)^d neighbors of each point. Order 1 (bilinear, trilinear ...) looks at all 2^d corners of the box around each data point; 2^d grows pretty fast. Order 0, though, looks only at the 1 nearest corner, so it is blocky, discontinuous at box edges.
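To make the two extremes concrete, here is a small sketch on a single unit box. It assumes the corner values are stored in an array of shape `(2,)*d` and the query point is given as fractions in [0, 1] along each axis; the function names are hypothetical.

```python
import itertools
import numpy as np


def multilinear_sketch(corner_vals, frac):
    """Order 1: weighted sum over all 2^d corners of the unit box."""
    frac = np.asarray(frac, dtype=float)
    total = 0.0
    for corner in itertools.product((0, 1), repeat=len(frac)):
        # weight = product over axes of frac (corner bit 1) or 1 - frac (corner bit 0)
        w = np.prod(np.where(np.array(corner) == 1, frac, 1 - frac))
        total += w * corner_vals[corner]
    return total


def nearest_corner_sketch(corner_vals, frac):
    """Order 0: take the value at the single nearest corner."""
    corner = tuple(int(f >= 0.5) for f in frac)
    return corner_vals[corner]
```

With `corner_vals = np.arange(4.).reshape(2, 2)` and `frac = (0.25, 0.75)`, the order-1 result is 1.25 and the order-0 result is 1.0; the loop in the first function runs 2^d times, the second touches one corner.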
What's in between ? A simple method is
#!/usr/bin/env python2
""" nanutil.py: corr, cosine similarity, linterpol ignoring missing data (NaNs)
A "NaN" is a marker for missing data, aka NA, Not Available, in numpy / pandas data arrays.
(Technically, it's a special, "sticky" floating point value, with e.g.
NaN + anything = NaN; see https://en.wikipedia.org/wiki/NaN .)
Numpy has a handful of functions that ignore NaNs, for example
`np.nanmean([ 1, np.nan, 3 ]) == 2`.
Here are a few more, short and straightforward:
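As an illustration of the idea, a sketch of NaN-ignoring cosine similarity and correlation: drop every position where either input is NaN, then compute as usual. The names `nancossim_sketch` and `nancorr_sketch` are hypothetical, and pairwise deletion is an assumption about what "ignoring" means here.

```python
import numpy as np


def _drop_nan_pairs(x, y):
    """Keep only the positions where both x and y are present."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = ~(np.isnan(x) | np.isnan(y))
    return x[ok], y[ok]


def nancossim_sketch(x, y):
    """Cosine similarity over the positions where neither value is NaN."""
    x, y = _drop_nan_pairs(x, y)
    return x.dot(y) / np.sqrt(x.dot(x) * y.dot(y))


def nancorr_sketch(x, y):
    """Pearson correlation over the positions where neither value is NaN."""
    x, y = _drop_nan_pairs(x, y)
    return nancossim_sketch(x - x.mean(), y - y.mean())
```

Both return NaN (with a NumPy warning) if fewer than one usable pair remains or a vector is all zeros after masking.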
Hamming, Digital Filters pages 66-69,
shows that "the frequency point of view ...
sheds a new light on classical numerical integration formulas."
The plot and code below show the frequency response of T (trapezoid), S (Simpson),
and some combinations like .2 T + .8 S
which give a good improvement, easily.
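A sketch of how such frequency responses can be computed, assuming unit step size and the usual recursive forms y[n+1] = y[n] + (x[n+1] + x[n])/2 for the trapezoid rule and y[n+1] = y[n-1] + (x[n+1] + 4 x[n] + x[n-1])/3 for Simpson's rule. Feeding in exp(iωn) and dividing by the exact integrator 1/(iω) gives the ratios below; the function names and the blend weight argument are hypothetical.

```python
import numpy as np


def trapezoid_ratio(w):
    """Trapezoid response / exact integrator response, for 0 < w < pi."""
    return (w / 2) * np.cos(w / 2) / np.sin(w / 2)


def simpson_ratio(w):
    """Simpson response / exact integrator response, for 0 < w < pi."""
    return w * (2 + np.cos(w)) / (3 * np.sin(w))


def blend_ratio(w, a=0.2):
    """Ratio for the combination a*T + (1 - a)*S, e.g. .2 T + .8 S."""
    return a * trapezoid_ratio(w) + (1 - a) * simpson_ratio(w)
```

A ratio of 1 means the rule integrates that frequency exactly. At ω = π/2 the trapezoid ratio is π/4 and the Simpson ratio is π/3, so one undershoots where the other overshoots, which is what makes combinations worth plotting.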
""" approximate derivatives for non-uniformly spaced data | |
deriv1, deriv2( x, t ) with t0 < t1 < ...
Very sensitive to noise, or delta-t near 0
Spline, then take derivatives, is much better; see scipy.interpolate.splev( der= )
"""
# http://google.com/search?q="finite-difference non-uniform site:stackexchange.com"
from __future__ import division
import numpy as np
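One common way to do this is the three-point formula that fits a parabola through each point and its two neighbors. The sketch below assumes that approach and returns values at the interior points only; `deriv1_sketch` and `deriv2_sketch` are hypothetical names, not the gist's `deriv1` / `deriv2`.

```python
import numpy as np


def _spacings(t):
    t = np.asarray(t, dtype=float)
    h0 = t[1:-1] - t[:-2]     # gap to the left neighbor
    h1 = t[2:] - t[1:-1]      # gap to the right neighbor
    return h0, h1


def deriv1_sketch(x, t):
    """First derivative at t[1:-1]; exact for quadratics."""
    x = np.asarray(x, dtype=float)
    h0, h1 = _spacings(t)
    return (- h1 / (h0 * (h0 + h1)) * x[:-2]
            + (h1 - h0) / (h0 * h1) * x[1:-1]
            + h0 / (h1 * (h0 + h1)) * x[2:])


def deriv2_sketch(x, t):
    """Second derivative at t[1:-1]; exact for quadratics."""
    x = np.asarray(x, dtype=float)
    h0, h1 = _spacings(t)
    return 2 * (x[:-2] / (h0 * (h0 + h1))
                - x[1:-1] / (h0 * h1)
                + x[2:] / (h1 * (h0 + h1)))
```

The divisions by `h0` and `h1` show why a delta-t near 0 is a problem: any noise in `x` is amplified by the reciprocal of the spacing.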