@brando90
brando90 / maf_dual_backtranslation_self_improving.md
Created June 7, 2024 17:53
MAF dual backtranslation self-improving loop

Simplified version of Dually Grounded Back-Translation for Autoformalization:

def train_to_af_for_maf(mdl,                 # a causal LM
                        formal_data_set,     # e.g., an ITP library like Mathlib
                        informal_data_set,   # e.g., time-tested math textbooks, e.g., Rudin, CLRS
                        ):
    for (nl, fl_star), (nl_star, fl) in zip(formal_data_set, informal_data_set):
        # -- Learn to formalize: nl_i -> fl* via the round trip fl* -> [nl_i]_i -> fl*
        nl_samples = mdl("informalize " + fl_star, sampling="top_p", num_out=k)  # sampling noise is good for robustness!
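The loop above can be sketched concretely. In this minimal, hedged sketch, `mdl` is a hypothetical callable (prompt in, string out) standing in for the causal LM, and the equality check stands in for what would really be an ITP type-check or proof check:

```python
# A minimal, hedged sketch of one dual back-translation step.
# `mdl` is a hypothetical prompt->string callable; in the real loop it is a
# causal LM, and the equality test would be a proof-assistant check instead.
def backtranslation_step(mdl, fl_star, k=4):
    # 1) Informalize: sample k natural-language candidates for the formal target.
    nl_candidates = [mdl("informalize " + fl_star) for _ in range(k)]
    # 2) Formalize each candidate back; keep round trips that recover fl_star.
    pairs = []
    for nl in nl_candidates:
        fl_pred = mdl("formalize " + nl)
        if fl_pred == fl_star:  # in practice: check in the proof assistant
            pairs.append((nl, fl_star))
    return pairs  # grounded (nl -> fl*) pairs for supervised fine-tuning
```

The surviving pairs are exactly the "dually grounded" data: natural-language statements whose formalization provably lands back on the formal target.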
from datasets import load_dataset
from transformers import GPT2Tokenizer, GPT2LMHeadModel, Trainer, TrainingArguments, DataCollatorForLanguageModeling
# Load dataset from a JSON file
data_files = {"train": "path/to/your/train.json", "test": "path/to/your/test.json"}
dataset = load_dataset("json", data_files=data_files)
# Load pre-trained GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
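The preview stops before the imported training pieces are wired together. A hedged continuation sketch, assuming each JSON row has a `text` field (the gist does not confirm the schema):

```python
from transformers import (Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)

def build_trainer(model, tokenizer, dataset, output_dir="out"):
    """Wire tokenizer/model/dataset into a causal-LM Trainer (assumes a 'text' field)."""
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset.map(tokenize, batched=True,
                            remove_columns=dataset["train"].column_names)
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal, not masked
    args = TrainingArguments(output_dir=output_dir,
                             per_device_train_batch_size=2, num_train_epochs=1)
    return Trainer(model=model, args=args, data_collator=collator,
                   train_dataset=tokenized["train"], eval_dataset=tokenized["test"])
```

Calling `build_trainer(model, tokenizer, dataset).train()` would then fine-tune GPT-2 on the loaded JSON splits.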
# -- HELM prompt, 8 shot, CoT? ref: https://storage.googleapis.com/crfm-helm-public/lite/benchmark_output/runs/v1.0.0/math:subject=algebra,level=1,use_official_examples=False,use_chain_of_thought=True,model=01-ai_yi-34b/scenario_state.json, https://crfm.stanford.edu/helm/lite/latest/#/runs/math:subject=algebra,level=1,use_official_examples=False,use_chain_of_thought=True,model=01-ai_yi-34b
HELM_MATH_PROMPT: str = (
"""Given a mathematics problem, determine the answer. Simplify your answer as much as possible.###
Problem: Let $r=3^s-s$ and $s=2^n+1$. What is the value of $r$ when $n=2$?
Answer: First substitute $n=2$ into the expression for $s$ to find $s=2^2+1=5$. Then substitute $s=5$ into the expression for $r$ to find $r=3^5-5=243-5=\\boxed{238}$.###
Problem: If $x^{2y}= 4$ and $x = 4$, what is the value of $y$? Express your answer as a common fraction.
Answer: Plugging $x = 4$ into the first equation, we get $4^{2y} = 4^1 \\Rightarrow 2y = 1 \\Rightarrow y = \\boxed{\\frac{1}{2}}$.###
Problem: If $y = \\dis
@brando90
brando90 / minerva_prompt_cot.py
Created May 9, 2024 21:31
minerva_prompt_cot.py
# -- Prompt minerva MATH 3 - better minerva + cot/scratch_pad
# https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/minerva_math/utils.py#L22
H_MATH_MINERVA_PROMPT_TEMPLATE_3_COT = (
r"""Problem:
Find the domain of the expression $\frac{\sqrt{x-2}}{\sqrt{5-x}}$.}
Solution:
Let's think step by step. The expressions inside each square root must be non-negative. Therefore, $x-2 \ge 0$, so $x\ge2$, and $5 - x \ge 0$, so $x \le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$. Therefore, the domain of the expression is $\boxed{[2,5)}$.
Problem:
@brando90
brando90 / gist:769e71f376b4a9a78ad63c6989be6a52
Last active May 9, 2024 05:11
initial_reprover_in_pypentrograph.py
import math
from pantograph.server import Server, ServerError
from pantograph.expr import GoalState, TacticHave, TacticCalc, Tactic
from loguru import logger
from dataclasses import dataclass, field
from typing import Optional, List, Tuple
@dataclass(frozen=True)
class SearchResult:
"""The result of attempting to prove a theorem."""
@brando90
brando90 / vectoring_lowering_uncertainty_research_planning.md
Created April 25, 2024 17:45
vectoring_lowering_uncertainty_research_planning.md

---- Prompts for Assessing & Rubber Ducking discussions on your research plan according to Vectoring ----

references: - https://web.stanford.edu/class/cs197/slides/04-vectoring.pdf

We should have three prompts, each tackling a different way to mitigate risks and uncertainties in projects -- especially in computer science and machine learning research. They are:

  • Prompt 1: attempting to guess/extrapolate/identify unknown unknowns -- to reduce risks we are unaware of.
  • Prompt 2: assess my vectoring plan and its assumptions for effectively tackling the unknown, i.e., research
@inproceedings{langley00,
author = {P. Langley},
title = {Crafting Papers on Machine Learning},
year = {2000},
pages = {1207--1216},
editor = {Pat Langley},
booktitle = {Proceedings of the 17th International Conference
on Machine Learning (ICML 2000)},
address = {Stanford, CA},
publisher = {Morgan Kaufmann}
}
@brando90
brando90 / gist:0323f9e2a659580978ec5daa72f41333
Created March 2, 2024 23:50
reciprocal_has_unbounded_limit.lean
/-
theorem: lim_{x -> c+} f(x) = +infinity
x + infinity = +infinity
lim_{x -> c} f(x) = L
∀ ε > 0, ∃ δ > 0, 0 < |x - c| < δ → |f(x) - L| < ε
L = + infinity
consider some ε > 0
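The ε–δ template above does not literally apply when L = +∞: there is no real number L to subtract, so the standard move is to replace ε with an arbitrary bound M that f must eventually exceed. A hedged Lean sketch of the one-sided definition (the name is mine, not Mathlib's; Mathlib itself phrases this via `Filter.Tendsto` and `atTop`):

```lean
import Mathlib.Data.Real.Basic

/-- `f x → +∞` as `x → c` from the right: every bound `M` is eventually exceeded. -/
def TendstoTopRight (f : ℝ → ℝ) (c : ℝ) : Prop :=
  ∀ M : ℝ, ∃ δ : ℝ, 0 < δ ∧ ∀ x : ℝ, 0 < x - c → x - c < δ → M < f x
```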
@brando90
brando90 / gist:0786c13414a510d4652351a2d96b51fc
Created March 2, 2024 20:52
vertical_asymptote_of_1_over_x.lean
/-
-/
import Mathlib.Data.Real.Basic
-- define 1/x (reciprocal) for reals
noncomputable def f (x : ℝ) : ℝ := x⁻¹
#check f
-- unit test that f 1 = 1, f 2 = 1/2
theorem test_f1 : f 1 = 1 := by simp[f]
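The comment above promises a second unit test (`f 2 = 1/2`) that the preview does not show. A hedged sketch of how it could look, extending the same file (`norm_num` is used because `simp [f]` alone may not close the numeric goal `2⁻¹ = 1/2`):

```lean
-- unit test that f 2 = 1/2; unfold f, then discharge the arithmetic
theorem test_f2 : f 2 = 1/2 := by norm_num [f]
```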
**Do not** run `install.sh` blindly.
There is no guarantee it will work on your machine.
Therefore, understand each command and only run the next command if the current one succeeded.
## For developing Lean in this repo
If you cloned this repo with, say, `git clone git@github.com:brando90/learning_lean.git`, then you will have the Lean project folder `lean_src_proj`, but it won't have its Lean dependencies, e.g., Mathlib, and the `.lake` folder will be missing.
If that is the case, then you need to install Mathlib for this project (note that `lake +leanprover/lean4:nightly-2023-02-04 new my_project math`, as suggested by the Lean [project setup guide](https://leanprover-community.github.io/install/project.html#creating-a-lean-project), does **not** work here) with:
```bash