2020: reach out to discuss NLP over arXiv and math syntax

Deyan Ginev dginev

dginev /
Created Jun 19, 2020
The many scientific meanings of superscript plus (TeX: {?}^+)

General form: base-superscript-+

| meaning | TeX | field | source |
|---|---|---|---|
| Natural numbers from 1 | \mathcal{N}^+ | number theory | wiki:Natural number |
| positive semi-orbit | O^+, \gamma_{x}^{+} | dynamical systems | paper, wiki:Orbit |
| moduli space of trajectories | M^+ | Geometric Topology | arxiv:1404.4561 |
| orthogonal decomposition into eigenspace | S^+ | Geometric Topology | arxiv:1404.4561 |
| self-dual 2-forms on X | \Lambda^+ X | | |
dginev / iterative_tikz.tex
Last active Mar 13, 2020
An example of a computational stress test for LaTeXML's tikz support
% source: arXiv 1002.3757
dginev / 4_gram_15_window_arxiv_cite.csv
Last active Feb 28, 2020
Top textual 4-grams within 15 words of an inline citation from arXiv (arXMLiv 08.2019)
4-gram frequency
see e g [cite] 340651
can be found in 197421
be found in [cite] 130873
see for example [cite] 93473
in the case of 86786
in the context of 80782
is given by [cite] 73337
shown in fig ref 65890
with respect to the 63965
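The counting behind a table like this can be sketched as follows. This is a minimal illustration, not the script that produced the numbers above. It assumes citations have already been replaced by a `[cite]` token, that tokenization is lowercased whitespace splitting, and that a 4-gram counts as "within the window" when all four of its tokens lie at most `window` positions from some `[cite]`. The function name `cite_context_4grams` is hypothetical.

```rust
use std::collections::HashMap;

/// Count word 4-grams whose tokens all lie within `window` positions of a
/// `[cite]` marker. Whitespace tokenization and the `[cite]` placeholder
/// are assumptions made for this sketch.
fn cite_context_4grams(text: &str, window: usize) -> HashMap<String, usize> {
    let lowered = text.to_lowercase();
    let tokens: Vec<&str> = lowered.split_whitespace().collect();

    // Positions of every citation marker in the token stream.
    let mut cite_positions = Vec::new();
    for (i, t) in tokens.iter().enumerate() {
        if *t == "[cite]" {
            cite_positions.push(i);
        }
    }

    let mut counts: HashMap<String, usize> = HashMap::new();
    if tokens.len() < 4 {
        return counts;
    }
    for start in 0..=tokens.len() - 4 {
        let end = start + 3;
        // The 4-gram qualifies if it fits inside [c - window, c + window]
        // for at least one citation position c.
        let near = cite_positions
            .iter()
            .any(|&c| start + window >= c && end <= c + window);
        if near {
            *counts.entry(tokens[start..=end].join(" ")).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    // Made-up sample sentence, for illustration only.
    let counts = cite_context_4grams("as can be found in [cite] and elsewhere", 15);
    for (gram, n) in &counts {
        println!("{gram} {n}");
    }
}
```

Each 4-gram position is counted once even when several citations are nearby; a per-citation count would be an equally defensible choice and would change the totals.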
dginev /
Created Sep 28, 2019
Truth's Tricky Interfaces
trait HonestSolver {
    fn solve(p: DecidableLogicalProposition) -> (bool, LogicalProof);
    // i.e. First- and Higher-order logic
    fn solve_if_possible(p: MathProposition) -> Option<(bool, MathProof)>;
    fn try_to_solve_if_possible(
        p: ComputableProposition,
    ) -> Result<Option<(bool, ExecutionTrace)>, Box<dyn Error>>;
    fn try_to_solve_socially_if_possible(
        p: LanguageProposition,
    ) -> Arc<Mutex<Result<Option<(bool, Justification)>, Box<dyn Error>>>>;
}
dginev / arxiv_2019_headings_freq100.csv
Created Sep 21, 2019
Heading Statistics: arXMLiv 08.2019
heading frequency
proof 2930621
lemma 1706821
theorem 1700430
references 1351260
abstract 1193933
introduction 1117555
proposition 1059776
definition 972999
remark 888243
dginev / broken_tikz.svg
Created Sep 4, 2019
Broken Tikz Example from latexml/#1196
View esum_iterative.tex
\advance\sum by \index
\advance\index by -2
dginev / arxiv_headings_report.csv
Last active Aug 1, 2019
Most common headings from 1.2 million arXiv documents (up to 08.2018)
heading frequency
proof 2464628
lemma 1380622
theorem 1254064
references 1213025
abstract 1057178
introduction 955218
proposition 876742
remark 694222
definition 686827
dginev /
Created May 1, 2019
arXMLiv 08.2018 dataset, subject classification frequencies
Subject Document count
math 334932
astro-ph 223437
cond-mat 212384
cs 132338
hep-ph 130788
hep-th 116499
physics 99881
quant-ph 80888
dginev /
Last active Apr 24, 2019
Extracting arXiv category metadata from an OAI-PMH v2.0 XML harvest
//! Convert arXiv's OAI harvested XML files into a lookup table for classification labels
// Step 0. Prerequisite: download all needed arXiv metadata via OAI, e.g.
// $ pip install git+
// $ mkdir metadata/arxiv; cd metadata/arxiv
// $ oai-reg add arxiv
// $ oai-harvest arxiv --until 2018-09-09
// endpoint documentation at:
use jwalk::WalkDir;
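The conversion described in the header comment above, from harvested records to a lookup table of classification labels, might look roughly like the sketch below. It is not the gist's actual code. It assumes each record carries an `<id>` element and a whitespace-separated `<categories>` element, and it uses plain substring search from the standard library where real code would use a proper XML parser. The helper names `tag_text` and `add_record` and the sample identifier are hypothetical.

```rust
use std::collections::HashMap;

/// Return the trimmed text between the first `<tag>` and the next `</tag>`.
/// Naive substring matching: no attributes, namespaces, or nesting.
fn tag_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);
    let start = xml.find(open.as_str())? + open.len();
    let len = xml[start..].find(close.as_str())?;
    Some(xml[start..start + len].trim())
}

/// Insert one record's identifier and category labels into the table.
/// Records missing either element are skipped.
fn add_record(table: &mut HashMap<String, Vec<String>>, record_xml: &str) {
    let id = tag_text(record_xml, "id");
    let cats = tag_text(record_xml, "categories");
    if let (Some(id), Some(cats)) = (id, cats) {
        let labels: Vec<String> = cats.split_whitespace().map(str::to_string).collect();
        table.insert(id.to_string(), labels);
    }
}

fn main() {
    // Made-up record with a placeholder identifier, for illustration only.
    let record = "<record><id>1234.5678</id><categories>hep-ph hep-th</categories></record>";
    let mut table = HashMap::new();
    add_record(&mut table, record);
    println!("{:?}", table.get("1234.5678"));
}
```

In a full pipeline, `add_record` would be called once per record found while walking the harvested files.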