Skip to content

Instantly share code, notes, and snippets.

@jelc53
Last active May 8, 2023 03:06
Show Gist options
  • Save jelc53/6298cc8df392e6da120f0100f7f378f1 to your computer and use it in GitHub Desktop.
Save jelc53/6298cc8df392e6da120f0100f7f378f1 to your computer and use it in GitHub Desktop.
runsheet for coqgym's astactic evaluation
# checkout main git branch
git pull
git checkout master
# download and scp coqgym's astactic best model into model directory
# https://drive.google.com/drive/folders/1AzLaEpoGS3BPMUz9Bl63MHAFRqlF4CtH
scp model.pth [username]@scq2.stanford.edu:/scr2/[user_dir]/CoqGym-GNN/ASTactic/runs/astactic/checkpoints/model.pth
# evaluation runsheet (5 left to go!)
"weak-up-to", -- done
"buchberger", -- done
"jordan-curve-theorem", -- done
"dblib", -- done
"disel", -- done
"zchinese", -- done
"zfc", -- done
"dep-map", -- done
"chinese", -- done
"UnifySL", -- done
"hoare-tut", -- done
"huffman", -- done
"PolTac", -- done
"angles", -- done
"coq-procrastination", -- done
"coq-library-undecidability", -- done
"tree-automata",
"coquelicot", -- done
"fermat4", -- done (skipped gcd2_relp_odd)
"demos",
"coqoban", -- done
"goedel", -- done
"verdi-raft",
"verdi",
"zorns-lemma", -- done
"coqrel", -- done
"fundamental-arithmetics"
# version 1: run script for single proof library evaluation
python evaluate.py ours astactic --path runs/astactic/checkpoints/model.pth --filter [proof_library] >> log
# version 2: multiprocess run script for evaluation of one or many proof libraries
# open multiprocess_test_astactic.py and edit list of proof libraries
python multiprocess_test_astactic.py astactic runs/astactic/checkpoints/model.pth
# optional: you may want to add code to evaluate.py and agent.py
# to skip proof libraries or specific proofs within those libraries
python evaluate.py ... --skip "lalcii acii [...]"
# current status of evaluation is shared on our shared gdrive in evaluation/test_astactic.tgz
# https://drive.google.com/file/d/1aD8BR87Q2iUaMmFZWoAfmATx-Tm3uSX8/view?usp=share_link
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment