Deyan Ginev dginev

@dginev
dginev / honest_solving.rs
Created Sep 28, 2019
Truth's Tricky Interfaces
View honest_solving.rs
use std::error::Error;
use std::sync::{Arc, Mutex};

trait HonestSolver {
    fn solve(p: DecidableLogicalProposition) -> (bool, LogicalProof);
    // i.e. First- and Higher-order logic
    fn solve_if_possible(p: MathProposition) -> Option<(bool, MathProof)>;
    fn try_to_solve_if_possible(p: ComputableProposition) -> Result<Option<(bool, ExecutionTrace)>, Box<dyn Error>>;
    fn try_to_solve_socially_if_possible(p: LanguageProposition) -> Arc<Mutex<Result<Option<(bool, Justification)>, Box<dyn Error>>>>;
}
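The layered return types above are easier to read from the caller's side. The following is a minimal sketch, not the gist's own code: `Trace`, the string-based proposition format, and `describe` are hypothetical stand-ins chosen so the example compiles on its own. It shows one way the three outcomes of `try_to_solve_if_possible` (a verdict with a trace, an honest "cannot decide", or a failed attempt) might be told apart.

```rust
use std::error::Error;

/// Hypothetical stand-in for the gist's `ExecutionTrace`.
#[derive(Debug, PartialEq)]
struct Trace(Vec<String>);

/// Toy solver over propositions of the form `lhs op rhs` on integers.
/// - malformed input  -> Err       (the attempt itself failed)
/// - unknown operator -> Ok(None)  (well-formed, but not decidable here)
/// - otherwise        -> Ok(Some((verdict, trace)))
fn try_to_solve_if_possible(p: &str) -> Result<Option<(bool, Trace)>, Box<dyn Error>> {
    let parts: Vec<&str> = p.split_whitespace().collect();
    if parts.len() != 3 {
        return Err(format!("expected `lhs op rhs`, got {:?}", p).into());
    }
    let lhs: i64 = parts[0].parse()?;
    let rhs: i64 = parts[2].parse()?;
    let verdict = match parts[1] {
        "<" => lhs < rhs,
        "=" => lhs == rhs,
        ">" => lhs > rhs,
        _ => return Ok(None),
    };
    let trace = Trace(vec![format!("compared {} {} {}", lhs, parts[1], rhs)]);
    Ok(Some((verdict, trace)))
}

/// Collapses the three layers into a human-readable summary.
fn describe(p: &str) -> String {
    match try_to_solve_if_possible(p) {
        Ok(Some((true, _))) => "holds".to_string(),
        Ok(Some((false, _))) => "does not hold".to_string(),
        Ok(None) => "undecided".to_string(),
        Err(e) => format!("failed: {}", e),
    }
}
```

Under these assumptions, `describe("1 < 2")` reports that the proposition holds, `describe("1 ~ 2")` reports it as undecided, and `describe("oops")` reports a failure.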
dginev / arxiv_2019_headings_freq100.csv
Created Sep 21, 2019
Heading Statistics: arXMLiv 08.2019
View arxiv_2019_headings_freq100.csv
heading frequency
proof 2930621
lemma 1706821
theorem 1700430
references 1351260
abstract 1193933
introduction 1117555
proposition 1059776
definition 972999
remark 888243
dginev / broken_tikz.svg
Created Sep 4, 2019
Broken Tikz Example from latexml/#1196
View broken_tikz.svg
Sorry, this file is invalid so it cannot be displayed.
View esum_iterative.tex
\newcount\index
\newcount\sum
\def\esum#1{
\index=#1
\sum=0
\loop
\advance\sum by \index
\ifnum\index>2
\advance\index by -2
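The preview of `esum_iterative.tex` is cut off before the loop closes, so its exact behavior cannot be confirmed from the listing. As a hedged sketch, assuming the loop simply repeats right after the decrement, the visible steps correspond to the Rust function below. The name `esum` is taken from the macro; everything else is an illustrative assumption.

```rust
/// Mirrors the visible part of the TeX loop: add the index to the sum,
/// then step down by 2 while the index is still greater than 2.
/// Assumption: the truncated loop ends immediately after the decrement.
fn esum(n: i64) -> i64 {
    let mut index = n;
    let mut sum = 0;
    loop {
        sum += index;
        if index > 2 {
            index -= 2;
        } else {
            break;
        }
    }
    sum
}
```

Under that assumption, `esum(6)` adds 6 + 4 + 2 and `esum(5)` adds 5 + 3 + 1.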
dginev / arxiv_headings_report.csv
Last active Aug 1, 2019
Most common headings from 1.2 million arXiv documents (up to 08.2018)
View arxiv_headings_report.csv
heading frequency
proof 2464628
lemma 1380622
theorem 1254064
references 1213025
abstract 1057178
introduction 955218
proposition 876742
remark 694222
definition 686827
dginev / subject_metadata.md
Created May 1, 2019
arXMLiv 08.2018 dataset, subject classification frequencies
View subject_metadata.md
Subject Document count
math 334932
astro-ph 223437
cond-mat 212384
cs 132338
hep-ph 130788
hep-th 116499
physics 99881
quant-ph 80888
dginev / arxiv_metadata_packer.rs
Last active Apr 24, 2019
Extracting arXiv category metadata from OAI_PMHv2.0 xml harvest
View arxiv_metadata_packer.rs
//! Convert arXiv's OAI harvested XML files into a lookup table for classification labels
// Step 0. Prerequisite: download all needed arXiv metadata via OAI, e.g.
//```
// $ pip install git+http://github.com/bloomonkey/oai-harvest.git#egg=oaiharvest
// $ mkdir metadata/arxiv; cd metadata/arxiv
// $ oai-reg add arxiv http://export.arxiv.org/oai2?verb=Identify
// $ oai-harvest arxiv --until 2018-09-09
//```
// endpoint documentation at: https://arxiv.org/help/oa
use jwalk::WalkDir;
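The rest of the packer is not visible in the preview. As a rough sketch of the lookup-table idea only, the code below uses plain string search from the standard library instead of the crates the gist imports, and it assumes the classification labels are read from the `setSpec` fields of each OAI-PMH record header. The helper names `next_tag_text` and `pack_labels` are hypothetical.

```rust
use std::collections::HashMap;

/// Returns the text between the first `<tag>` and `</tag>` at or after `from`,
/// together with the offset just past the closing tag.
/// Simplification: tags carrying attributes are not matched.
fn next_tag_text<'a>(xml: &'a str, tag: &str, from: usize) -> Option<(&'a str, usize)> {
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);
    let start = xml[from..].find(open.as_str())? + from + open.len();
    let end = xml[start..].find(close.as_str())? + start;
    Some((&xml[start..end], end + close.len()))
}

/// Builds an identifier -> labels table from the record headers in `xml`.
fn pack_labels(xml: &str) -> HashMap<String, Vec<String>> {
    let mut table = HashMap::new();
    let mut pos = 0;
    while let Some((header, next)) = next_tag_text(xml, "header", pos) {
        pos = next;
        let id = match next_tag_text(header, "identifier", 0) {
            Some((id, _)) => id.trim().to_string(),
            None => continue, // header without identifier: skip it
        };
        let mut labels = Vec::new();
        let mut at = 0;
        while let Some((spec, after)) = next_tag_text(header, "setSpec", at) {
            labels.push(spec.trim().to_string());
            at = after;
        }
        table.insert(id, labels);
    }
    table
}
```

A real harvest would call something like `pack_labels` once per downloaded XML file and merge the resulting tables; a proper XML parser would be the safer choice for anything beyond a sketch.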
dginev / corpus_statistics_ref.csv
Created Mar 30, 2019
"Words prior \ref", arXMLiv 08.2018
View corpus_statistics_ref.csv
word frequency
figure 3290488
theorem 3052607
section 2802295
lemma 2408488
table 1544961
proposition 1334759
and 1031640
corollary 476062
appendix 416964
dginev / apply_cutoffs.pl
Last active Mar 24, 2019
arXMLiv 08.2018, MathML element report
View apply_cutoffs.pl
#!/usr/bin/env perl
# Applies cutoffs to the very noisy 250 MB mathml_statistics.txt
# which was generated by llamapun over arXMLiv 08.2018.
#
# It rewrites to a CSV file, throwing out all known erroneous markup, including:
# - discard all SVG-associated markup (wrongly in MathML)
# - discard all (non-math) HTML-associated markup (wrongly in MathML)
# - discard all XMath-associated markup (wrongly in MathML)
# - reduce noise from uninteresting values (numbers with known units, hex colors, open-ended id schemes, etc)
#
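The filtering described in the comment above can be sketched as follows. This is an illustration in Rust, not a port of the Perl script: the deny-lists, the `min_freq` threshold, and the assumed input layout (one `key frequency` pair per line) are all hypothetical, since the script body and the format of `mathml_statistics.txt` are not shown in the preview.

```rust
/// Hypothetical deny-lists; the script's real lists are not shown in the preview.
const SVG_NAMES: &[&str] = &["svg", "g", "path", "rect", "foreignObject"];
const HTML_NAMES: &[&str] = &["span", "div", "img", "table", "td"];
const XMATH_NAMES: &[&str] = &["XMath", "XMTok", "XMApp", "XMDual"];

/// Element name of a report key such as `math@class[ltx_Math]`.
fn element_name(key: &str) -> &str {
    key.split(|c: char| c == '@' || c == '[').next().unwrap_or(key)
}

fn listed(list: &[&str], name: &str) -> bool {
    list.iter().any(|n| *n == name)
}

/// Rewrites `key frequency` lines as CSV, dropping deny-listed markup
/// and entries rarer than `min_freq`.
fn apply_cutoffs(input: &str, min_freq: u64) -> String {
    let mut csv = String::from("name,frequency\n");
    for line in input.lines() {
        // Split off the last whitespace-separated field as the frequency.
        let mut fields = line.trim().rsplitn(2, char::is_whitespace);
        let freq = match fields.next().and_then(|f| f.parse::<u64>().ok()) {
            Some(freq) => freq,
            None => continue,
        };
        let key = match fields.next() {
            Some(key) => key.trim(),
            None => continue,
        };
        let name = element_name(key);
        let denied = listed(SVG_NAMES, name)
            || listed(HTML_NAMES, name)
            || listed(XMATH_NAMES, name);
        if denied || freq < min_freq {
            continue;
        }
        csv.push_str(&format!("{},{}\n", key, freq));
    }
    csv
}
```

The sketch does not attempt the value normalization (units, hex colors, id schemes) mentioned in the last bullet, and it does not quote keys that contain commas.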
dginev / dlmf_mathml_report.csv
Created Mar 23, 2019
DLMF v0.1.20 MathML element report
View dlmf_mathml_report.csv
name@attr[value] frequency
mo 390704
mi 317263
mrow 265247
mi@href 230061
math@display 108952
math@class 108952
math 108952
math@alttext 108952
math@class[ltx_Math] 108944
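The `name@attr[value]` column packs up to three pieces of information into one key. A minimal sketch of splitting such keys, assuming the format is exactly what the header row suggests (an element name, an optional `@attribute`, and an optional `[value]`); `ReportKey` and `parse_key` are hypothetical names introduced for illustration.

```rust
/// Parts of a report key such as `math@class[ltx_Math]`.
#[derive(Debug, PartialEq)]
struct ReportKey<'a> {
    name: &'a str,
    attr: Option<&'a str>,
    value: Option<&'a str>,
}

fn parse_key(key: &str) -> ReportKey<'_> {
    // Everything before the first '@' is the element name.
    let (name, rest) = match key.split_once('@') {
        Some((name, rest)) => (name, Some(rest)),
        None => (key, None),
    };
    // The remainder is `attr` or `attr[value]`.
    let (attr, value) = match rest {
        None => (None, None),
        Some(rest) => match rest.split_once('[') {
            Some((attr, v)) => (Some(attr), Some(v.strip_suffix(']').unwrap_or(v))),
            None => (Some(rest), None),
        },
    };
    ReportKey { name, attr, value }
}
```

With this split, rows like `math@class` and `math@class[ltx_Math]` can be grouped under the same element and attribute when comparing frequencies.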