using AMDB
using DataStructures
using GZip
using UnicodePlots
using IntervalSets
# if we want to pass around code, extend it with your own machine name
# (gethostname()) and path
AMDB_path = get(Dict("tamas" =>
"/home/tamas/research/AMDB/data/AMDB_subsample.jls.gz"),
using AMDB
using DataStructures
using UnicodePlots
using IntervalSets # need to Pkg.checkout the latest version
# I made a new function to read the data; replace the path for your own machine
records = deserialize_gz(expanduser("~/research/AMDB/data/AMDB_subsample.jls.gz"))
# suppose we are counting in 2005
date_interval = Date(2005,1,1)..Date(2005,12,31)
using AMDB
using DataStructures
using IntervalSets
"""
Call `f` with each AMP spell in `records`. Useful for accumulators.
"""
function traverse_AMP_spells(f, records)
    for data in values(records)
        foreach(f, data.AMP_spells)
using AMDB
using DataStructures
using IntervalSets
using Plots
plotlyjs() # I like this backend, feel free to change it
records = deserialize_gz(expanduser("~/Documents/Thesis/AMDB_subsample.jls.gz"))
function spell_durations_in_year(records, interval)
using AMDB
using DataStructures
using IntervalSets
using Plots
using DataFrames # you may need to add these libraries
using StatPlots # with Pkg.add
records = deserialize_gz(expanduser("~/research/AMDB/data/AMDB_subsample.jls.gz"))
plotly() # this makes plots appear in your browser; you can use other backends
\documentclass{beamer}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{booktabs}
\usepackage{mathtools}
\mathtoolsset{showonlyrefs}
%% no idiotic nav buttons
\setbeamertemplate{navigation symbols}{}
######################################################################
# MWE for tabulating large CSV files, using plain Julia and IterableTables
# See the discussion at
# https://discourse.julialang.org/t/iterate-delimited-file-as-namedtuples/3616/3
#
# Some simplifications:
# - using the index of iterated tuples to make it comparable,
#   but would prefer to use the column name in real code.
# - not using compressed files (because CSV does not support Zlib streams ATM)
######################################################################
"""
Lag-`k` autocorrelation of `x` from a variogram. `v` is the variance
of `x`, used when supplied.
"""
function ρ(x, k, v=var(x))
    x1 = @view(x[1:(end-k)])
    x2 = @view(x[(1+k):end])
    V = sum((x1 .- x2).^2)/length(x1)
    1-V/(2*v)
end
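The identity behind `ρ` can be written out as follows. This is a sketch that assumes `x` is a weakly stationary series of length n with variance σ² and lag-k autocovariance γ_k; the symbols n, σ², γ_k, and V_k are notation introduced here and do not appear in the code. V_k corresponds to `V`, and σ² to `v`.

```latex
\begin{align}
  \mathbb{E}\!\left[(x_t - x_{t+k})^2\right]
    &= 2\sigma^2 - 2\gamma_k
     = 2\sigma^2 (1 - \rho_k), \\
  V_k &= \frac{1}{n-k} \sum_{t=1}^{n-k} (x_t - x_{t+k})^2, \\
  \hat{\rho}_k &= 1 - \frac{V_k}{2\sigma^2}.
\end{align}
```

The first line uses the constant mean and variance implied by stationarity; the second is the sample variogram at lag k, averaged over the n-k available pairs, which matches `length(x1)` in the function.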
# consistent random numbers
srand(UInt32[0xfd909253, 0x7859c364, 0x7cd42419, 0x4c06a3b6])
"""
    err(x, [prec])

Return two values, which are the log2 relative errors for calculating
`log1p(x)`, using `Base.log1p` and `Base.Math.JuliaLibm.log1p`.
The errors are calculated by comparing to `BigFloat` calculations with the given
######################################################################
# context: reading UInt8 lines from a gzipped stream
# - the real dataset has about 5e10 lines, this is a self-contained MWE
# - the real dataset lines are then processed, this MWE is just about optimizing reading
# - line length can be bounded (relevant for buffered reading)
######################################################################
using CodecZlib
"""