2020: reach out to discuss NLP over arXiv and math syntax

Deyan Ginev dginev

dginev / examples.md
Created Jun 19, 2020
The many scientific meanings of superscript plus (TeX: {?}^+)

General form: base-superscript-+

| meaning | TeX | field | source |
|---|---|---|---|
| **Math** | | | |
| Natural numbers from 1 | `\mathcal{N}^+` | number theory | wiki:Natural number |
| positive semi-orbit | `O^+`, `\gamma_{x}^{+}` | dynamical systems | paper, wiki:Orbit |
| moduli space of trajectories | `M^+` | Geometric Topology | arxiv:1404.4561 |
| orthogonal decomposition into eigenspace | `S^+` | Geometric Topology | arxiv:1404.4561 |
| self-dual 2-forms on X | `\Lambda^+ X` | | |
dginev / iterative_tikz.tex
Last active Mar 13, 2020
An example of a computational stress test for LaTeXML's tikz support
% source: arXiv 1002.3757
\documentclass{article}
\usepackage{tikz}
\newcommand\dw[2]{\draw[#1!#2,fill=#1!#2]}
\begin{document}
\begin{tikzpicture}
\def\LL{10}
\pgfmathsetmacro{\LLQ}{200/(\LL*\LL)}
dginev / 4_gram_15_window_arxiv_cite.csv
Last active Feb 28, 2020
Top textual 4-grams within 15 words of an inline citation from arXiv (arXMLiv 08.2019)
4-gram,frequency
see e g [cite],340651
can be found in,197421
be found in [cite],130873
see for example [cite],93473
in the case of,86786
in the context of,80782
is given by [cite],73337
shown in fig ref,65890
with respect to the,63965
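The counts above could be gathered along the following lines. This is a minimal sketch and not the script behind the gist: the tokenizer, the `[cite]` placeholder handling, and the rule that an n-gram counts when all of its words lie within `window` tokens of a citation are assumptions, and the names `tokenize` and `count_ngrams_near_citations` are hypothetical.

```rust
use std::collections::HashMap;

/// Placeholder token for an inline citation, as it appears in the CSV rows.
const CITE: &str = "[cite]";

/// Assumed tokenizer: lowercase, split on non-alphanumeric characters
/// (so "e.g." becomes "e", "g"), and keep the citation marker as one token.
/// Any whitespace-delimited chunk containing the marker is treated as the marker.
fn tokenize(text: &str) -> Vec<String> {
    text.split_whitespace()
        .flat_map(|raw| {
            if raw.contains(CITE) {
                vec![CITE.to_string()]
            } else {
                raw.split(|c: char| !c.is_alphanumeric())
                    .filter(|w| !w.is_empty())
                    .map(|w| w.to_lowercase())
                    .collect::<Vec<String>>()
            }
        })
        .collect()
}

/// Count every n-gram whose tokens all lie within `window` positions
/// of at least one citation token. Each n-gram occurrence is counted once,
/// even if it is near several citations.
fn count_ngrams_near_citations(
    tokens: &[String],
    n: usize,
    window: usize,
) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    if n == 0 || tokens.len() < n {
        return counts;
    }
    // Positions of all citation markers in the token stream.
    let cites: Vec<usize> = tokens
        .iter()
        .enumerate()
        .filter(|(_, t)| t.as_str() == CITE)
        .map(|(i, _)| i)
        .collect();
    for start in 0..=tokens.len() - n {
        let end = start + n - 1; // inclusive index of the last token
        let near = cites
            .iter()
            .any(|&c| start >= c.saturating_sub(window) && end <= c + window);
        if near {
            *counts.entry(tokens[start..=end].join(" ")).or_insert(0) += 1;
        }
    }
    counts
}
```

Calling `count_ngrams_near_citations(&tokenize(text), 4, 15)` on each paragraph and summing the resulting maps would give a table of the same shape as the CSV, under the assumptions stated above.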
dginev / honest_solving.rs
Created Sep 28, 2019
Truth's Tricky Interfaces
trait HonestSolver {
    fn solve(p: DecidableLogicalProposition) -> (bool, LogicalProof);
    // i.e. First- and Higher-order logic
    fn solve_if_possible(p: MathProposition) -> Option<(bool, MathProof)>;
    fn try_to_solve_if_possible(p: ComputableProposition) -> Result<Option<(bool, ExecutionTrace)>, Box<dyn Error>>;
    fn try_to_solve_socially_if_possible(p: LanguageProposition) -> Arc<Mutex<Result<Option<(bool, Justification)>, Box<dyn Error>>>>;
dginev / arxiv_2019_headings_freq100.csv
Created Sep 21, 2019
Heading Statistics: arXMLiv 08.2019
heading,frequency
proof,2930621
lemma,1706821
theorem,1700430
references,1351260
abstract,1193933
introduction,1117555
proposition,1059776
definition,972999
remark,888243
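A table like this could be built by normalizing each heading and counting the results. The sketch below is an assumption about how that might look, not the gist's actual pipeline: the normalization rule (lowercase, drop numbering and punctuation) and the `min_freq` cutoff, suggested only by the `freq100` file name, are guesses, and `normalize_heading` and `heading_frequencies` are hypothetical names.

```rust
use std::collections::HashMap;

/// Assumed normalization: keep alphabetic words only and lowercase them,
/// so "1. Introduction" becomes "introduction" and "Lemma 2.3" becomes "lemma".
fn normalize_heading(raw: &str) -> String {
    raw.split(|c: char| !c.is_alphabetic())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Count normalized headings and keep those seen at least `min_freq` times,
/// sorted by descending frequency, then alphabetically for ties.
fn heading_frequencies<'a, I>(headings: I, min_freq: usize) -> Vec<(String, usize)>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut counts: HashMap<String, usize> = HashMap::new();
    for heading in headings {
        let key = normalize_heading(heading);
        if !key.is_empty() {
            *counts.entry(key).or_insert(0) += 1;
        }
    }
    let mut rows: Vec<(String, usize)> = counts
        .into_iter()
        .filter(|(_, n)| *n >= min_freq)
        .collect();
    rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    rows
}
```

With `min_freq` set to 100, each returned pair would correspond to one `heading,frequency` row.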
dginev / broken_tikz.svg
Created Sep 4, 2019
Broken Tikz Example from latexml/#1196
View esum_iterative.tex
\newcount\index
\newcount\sum
\def\esum#1{
\index=#1
\sum=0
\loop
\advance\sum by \index
\ifnum\index>2
\advance\index by -2
dginev / arxiv_headings_report.csv
Last active Aug 1, 2019
Most common headings from 1.2 million arXiv documents (up to 08.2018)
heading,frequency
proof,2464628
lemma,1380622
theorem,1254064
references,1213025
abstract,1057178
introduction,955218
proposition,876742
remark,694222
definition,686827
dginev / subject_metadata.md
Created May 1, 2019
arXMLiv 08.2018 dataset, subject classification frequencies
| Subject | Document count |
|---|---|
| math | 334932 |
| astro-ph | 223437 |
| cond-mat | 212384 |
| cs | 132338 |
| hep-ph | 130788 |
| hep-th | 116499 |
| physics | 99881 |
| quant-ph | 80888 |
dginev / arxiv_metadata_packer.rs
Last active Apr 24, 2019
Extracting arXiv category metadata from an OAI-PMH v2.0 XML harvest
//! Convert arXiv's OAI harvested XML files into a lookup table for classification labels
// Step 0. Prerequisite: download all needed arXiv metadata via OAI, e.g.
//```
// $ pip install git+http://github.com/bloomonkey/oai-harvest.git#egg=oaiharvest
// $ mkdir metadata/arxiv; cd metadata/arxiv
// $ oai-reg add arxiv http://export.arxiv.org/oai2?verb=Identify
// $ oai-harvest arxiv --until 2018-09-09
//```
// endpoint documentation at: https://arxiv.org/help/oa
use jwalk::WalkDir;