- plastex: largish, last commit 1y ago, python
  - in addition to requirements.txt also `$ pip install Pillow Unidecode` for less noisy output
  - TEST: neat output when it works, but failed on all tested arXiv papers and its own documentation
- TexSoup: smallish, recent commits, python
  - doesn't handle definitions (`\def`); source: Stack Overflow question specifically asking about arXiv TeX (by someone using TexSoup)
  - TEST: works on most tested; fails with e.g. 1607.00138
- find citations
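
For the "find citations" item, a minimal sketch using only the standard library rather than plastex or TexSoup. It assumes citations appear as `\cite`-style commands (`\cite`, `\citep`, `\citet`, ...) with optional bracketed arguments; the function name `find_citation_keys` and the sample string are hypothetical, and the pattern will also match any other macro whose name starts with `cite`.

```python
import re

# Matches \cite, \citep, \citet*, ... with optional [..] arguments,
# capturing the comma-separated key list inside the braces.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def find_citation_keys(tex: str) -> list:
    """Return all citation keys in order of appearance (hypothetical helper)."""
    keys = []
    for group in CITE_RE.findall(tex):
        # One command may cite several keys: \cite{a, b}
        keys.extend(k.strip() for k in group.split(",") if k.strip())
    return keys


sample = r"As shown in \cite{smith2010, doe2012} and \citep[see][p.~3]{lee2015}."
keys = find_citation_keys(sample)
```

A regex like this ignores macros defined via `\def` or `\newcommand`, so it shares the limitation noted for TexSoup above.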
""" Utility functions.
"""
import json
import re
import time
from aqt import mw
from aqt.utils import chooseList
from anki.utils import stripHTML
import numpy as np
from scipy.optimize import curve_fit
from matplotlib import pyplot as plt
# checkpoints
x = np.array([0, 1, 2])
# delta in days
y = np.array([115, 489, 1020])
fit_e = curve_fit(lambda t, a, b, c: a + b * np.exp(c * t), x, y)
""" Draw pitch accent patterns in SVG
Example:
    python3 draw_pitch.py はな 010 > 花.html
    firefox 花.html
"""
import sys
- 8808 citation contexts, 100 words each
- ground truth:
    given a document, for all its FoS in PaperFieldsOfStudy.txt
    get highest level FoS by FieldsOfStudyChildren.txt
- prediction:
    for all annotated FoS
    get highest level FoS by FieldsOfStudyChildren.txt
- accuracy:
    if the intersection of ground truth and prediction is non-empty: 1
    otherwise: 0
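
The evaluation above could be sketched roughly as follows. This assumes FieldsOfStudyChildren.txt has already been loaded into a child-to-parents mapping (`parents`) and that a FoS with no parent counts as "highest level"; the function names and the toy hierarchy are hypothetical, not taken from the original evaluation code.

```python
from typing import Dict, Iterable, Set


def top_level_fos(fos_ids: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Climb the hierarchy from each FoS and collect the root ancestors."""
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = list(fos_ids)
    while stack:
        fos = stack.pop()
        if fos in seen:
            continue
        seen.add(fos)
        ups = parents.get(fos, set())
        if ups:
            stack.extend(ups)  # keep climbing; a FoS may have several parents
        else:
            roots.add(fos)  # no parent: treated as highest level
    return roots


def context_accuracy(doc_fos, annotated_fos, parents) -> int:
    """1 if ground truth and prediction share a top-level FoS, else 0."""
    truth = top_level_fos(doc_fos, parents)
    pred = top_level_fos(annotated_fos, parents)
    return 1 if truth & pred else 0


# Toy hierarchy (assumed data, for illustration only)
parents = {
    "machine learning": {"computer science"},
    "neural network": {"machine learning"},
    "algebra": {"mathematics"},
}
hit = context_accuracy(["neural network"], ["machine learning"], parents)
miss = context_accuracy(["neural network"], ["algebra"], parents)
```

Averaging `context_accuracy` over all citation contexts would then give the overall score.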
""" Migrate JSON documents from JSONkeeper 2017-12 to the most recent version
"""
import configparser
import firebase_admin
import json
import os
import requests
import sys
from firebase_admin import auth as firebase_auth
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<script src="https://cdn.firebase.com/libs/firebaseui/2.5.1/firebaseui.js"></script>
<link type="text/css" rel="stylesheet" href="https://cdn.firebase.com/libs/firebaseui/2.5.1/firebaseui.css" />
</head>
<body>
{
  "facets": [
    {
      "label": "テーマ",
      "value": [
        {
          "label": "顔貌",
          "value": 5835,
          "agent": "human"
        }
[shared]
db_uri = sqlite:////home/tarek/repos/canvasindexer/canvasindexer/index.db
[crawler]
as_sources = http://localhost/JSONkeeper/as/collection.json
[api]
facet_label_sort_top = テーマ,性別,向き,制作年,所蔵,原典,原典ID
facet_label_sort_bottom = tag
facet_value_sort_frequency = テーマ,性別,身分,向き,所蔵,原典
facet_value_sort_alphanum = 制作年,原典ID
[facet_value_sort_custom_1]