Deyan Ginev dginev

dginev / esum_iterative.tex
\newcount\index
\newcount\sum
\def\esum#1{
\index=#1
\sum=0
\loop
\advance\sum by \index
\ifnum\index>2
\advance\index by -2
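
The preview above is cut off before the loop closes. As a rough cross-check of what the visible lines compute, here is a minimal Python sketch. It assumes the truncated remainder simply ends the loop once `\index` is no longer greater than 2; the gist itself may do more. The function name `esum` mirrors the macro, and everything else is illustrative.

```python
def esum(n: int) -> int:
    """Mirror the visible TeX loop: add index, then step down by 2 while index > 2."""
    index = n
    total = 0
    while True:
        total += index          # \advance\sum by \index
        if index > 2:           # \ifnum\index>2
            index -= 2          # \advance\index by -2
        else:
            break               # assumed: the loop stops here
    return total
```

Under that assumption, an even argument sums the even numbers down to 2 and an odd argument sums the odd numbers down to 1.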
dginev / arxiv_headings_report.csv
Last active Aug 1, 2019
Most common headings from 1.2 million arXiv documents (up to 08.2018)
heading frequency
proof 2464628
lemma 1380622
theorem 1254064
references 1213025
abstract 1057178
introduction 955218
proposition 876742
remark 694222
definition 686827
dginev / subject_metadata.md
Created May 1, 2019
arXMLiv 08.2018 dataset, subject classification frequencies
Subject Document count
math 334932
astro-ph 223437
cond-mat 212384
cs 132338
hep-ph 130788
hep-th 116499
physics 99881
quant-ph 80888
dginev / arxiv_metadata_packer.rs
Last active Apr 24, 2019
Extracting arXiv category metadata from an OAI-PMH v2.0 XML harvest
//! Convert arXiv's OAI harvested XML files into a lookup table for classification labels
// Step 0. Prerequisite: download all needed arXiv metadata via OAI, e.g.
//```
// $ pip install git+http://github.com/bloomonkey/oai-harvest.git#egg=oaiharvest
// $ mkdir metadata/arxiv; cd metadata/arxiv
// $ oai-reg add arxiv http://export.arxiv.org/oai2?verb=Identify
// $ oai-harvest arxiv --until 2018-09-09
//```
// endpoint documentation at: https://arxiv.org/help/oa
use jwalk::WalkDir;
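
The Rust preview stops at its first import, so the packing logic itself is not visible. To illustrate the general idea of turning harvested records into a lookup table, here is a small Python sketch. It assumes the harvest stores Dublin Core (`oai_dc`) records, one XML document per paper, and that the `dc:subject` elements carry the classification labels. The sample record, the placeholder identifier, and the function names `record_labels` and `build_lookup` are hypothetical and not taken from the gist.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace used by oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hand-written sample record (an assumption, not real harvested data).
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title</dc:title>
  <dc:subject>Mathematics - Logic</dc:subject>
  <dc:subject>Computer Science - Logic in Computer Science</dc:subject>
  <dc:identifier>http://arxiv.org/abs/0000.00000</dc:identifier>
</oai_dc:dc>"""


def record_labels(xml_text):
    """Return (paper id, list of subject labels) for one record."""
    root = ET.fromstring(xml_text)
    paper_id = None
    for ident in root.iter(DC + "identifier"):
        text = (ident.text or "").strip()
        if "arxiv.org/abs/" in text:
            paper_id = text.rsplit("/abs/", 1)[1]
            break
    subjects = [(s.text or "").strip() for s in root.iter(DC + "subject")]
    return paper_id, subjects


def build_lookup(records):
    """Map each paper id to its subject labels, skipping records without an id."""
    table = {}
    for xml_text in records:
        paper_id, subjects = record_labels(xml_text)
        if paper_id is not None:
            table[paper_id] = subjects
    return table
```

Calling `build_lookup([SAMPLE])` yields a one-entry dictionary keyed by the placeholder identifier. A real harvest in a different metadata format would need different element names.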
dginev / corpus_statistics_ref.csv
Created Mar 30, 2019
"Words prior \ref", arXMLiv 08.2018
word frequency
figure 3290488
theorem 3052607
section 2802295
lemma 2408488
table 1544961
proposition 1334759
and 1031640
corollary 476062
appendix 416964
dginev / apply_cutoffs.pl
Last active Mar 24, 2019
arXMLiv 08.2018, MathML element report
#!/usr/bin/env perl
# Applies cutoffs to the very noisy 250 MB mathml_statistics.txt
# which was generated by llamapun over arXMLiv 08.2018.
#
# It rewrites to a CSV file, throwing out all known erroneous markup, including:
# - discard all SVG-associated markup (wrongly in MathML)
# - discard all (non-math) HTML-associated markup (wrongly in MathML)
# - discard all XMath-associated markup (wrongly in MathML)
# - reduce noise from uninteresting values (numbers with known units, hex colors, open-ended id schemes, etc.)
#
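
Only the header comment of the Perl script is visible, so the actual rules are unknown. The filtering it describes could look roughly like the Python sketch below. It assumes each input line has the form `name count` (as in the report rows elsewhere in this listing), that XMath elements can be recognized by an `XM` prefix, and that a "cutoff" is a minimum frequency. The name sets `SVG_NAMES` and `HTML_NAMES`, the `min_count` parameter, and both function names are hypothetical.

```python
import csv
import io

# Hypothetical name sets; the gist's real lists are not visible in the preview.
SVG_NAMES = {"svg", "g", "path", "rect", "circle"}
HTML_NAMES = {"div", "span", "p", "img", "table"}


def keep(name: str) -> bool:
    """Reject entries whose element name is SVG, non-math HTML, or XMath markup."""
    element = name.split("@", 1)[0]   # strip any @attr[value] suffix
    if element in SVG_NAMES or element in HTML_NAMES:
        return False
    if element.startswith("XM"):      # assumed prefix test for XMath elements
        return False
    return True


def apply_cutoffs(lines, min_count=1):
    """Rewrite 'name count' lines as CSV text, dropping rejected or rare entries."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name@attr[value]", "frequency"])
    for line in lines:
        parts = line.rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue                  # skip malformed lines
        name, count = parts[0], int(parts[1])
        if count >= min_count and keep(name):
            writer.writerow([name, count])
    return out.getvalue()
```

The sketch leaves out the fourth rule (collapsing uninteresting values such as hex colors), since the preview gives no detail on how the script does it.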
dginev / dlmf_mathml_report.csv
Created Mar 23, 2019
DLMF v0.1.20 MathML element report
name@attr[value] frequency
mo 390704
mi 317263
mrow 265247
mi@href 230061
math@display 108952
math@class 108952
math 108952
math@alttext 108952
math@class[ltx_Math] 108944
dginev / rustc.log
Created Jan 30, 2019
rtx_package$ time cargo rustc -- -Z time-passes
time: 0.026; rss: 58MB parsing
time: 0.000; rss: 58MB attributes injection
time: 0.000; rss: 58MB garbage collect incremental cache directory
time: 0.000; rss: 58MB recursion limit
time: 0.000; rss: 58MB crate injection
time: 0.000; rss: 58MB plugin loading
time: 0.000; rss: 58MB plugin registration
time: 0.000; rss: 58MB background load prev dep-graph
time: 0.003; rss: 58MB pre ast expansion lint checks
time: 1.662; rss: 237MB expand crate
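
To find the expensive passes in a log like the one above, the lines can be parsed and sorted by time. The following Python sketch is only an illustration: the regular expression is fitted to the line shape shown in this preview, and the function names `parse_time_passes` and `slowest` are made up for the example.

```python
import re

# Matches lines such as "time: 1.662; rss: 237MB expand crate".
LINE = re.compile(r"time:\s*([\d.]+);\s*rss:\s*(\d+)MB\s+(.+)")


def parse_time_passes(log_text):
    """Return a list of (pass name, seconds, rss in MB) tuples."""
    passes = []
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:
            passes.append((m.group(3).strip(), float(m.group(1)), int(m.group(2))))
    return passes


def slowest(passes, n=3):
    """Return the n passes with the largest reported time."""
    return sorted(passes, key=lambda p: p[1], reverse=True)[:n]
```

Applied to the ten lines in the preview, the slowest pass is `expand crate` at 1.662 seconds.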
dginev / custom_derive_lib.rs
Last active Jan 24, 2019
Contextual variable capture in Rust, via Custom Derive
static mut CONTEXT_DEPTH: u32 = 0;
#[proc_macro_derive(BoundState)]
pub fn bound_state(_input: TokenStream) -> TokenStream {
let state_declaration = if unsafe {CONTEXT_DEPTH == 0} {
quote!(
macro_rules! state {
() => {
outer_state!()
};
dginev / annual_dependency_status.csv
Last active Sep 19, 2018
arXiv 08.2018, LaTeX dependencies report
00,-4,amsbsy.sty.ltxml,1
00,-4,amsfonts.sty.ltxml,4
00,-4,amsmath.sty.ltxml,1
00,-4,amsopn.sty.ltxml,1
00,-4,amssymb.sty.ltxml,3
00,-4,amstext.sty.ltxml,1
00,-4,amsthm.sty.ltxml,1
00,-4,array.sty.ltxml,5
00,-4,article.cls.ltxml,13
00,-4,color.sty.ltxml,2