The dates on a note are when that note was started, not finished.
My process for writing the following notes was as follows:
- First, I went to the Agent Foundations conference at CMU. I gave a talk on my most recent work relating to the tiling problem (AKA reflective consistency, AKA self-trust). Scott Garrabrant got excited about an idea for improving on my theorems with "communication" between UDT instances.
- I began pursuing the idea. I got a strong initial idea that there should be some sort of proof that should work. However, I didn't work it out seriously for some time.
- Eventually, I started working on it seriously. At around the same time, Claude 4 came out, so I made it an experiment to see how helpful Claude 4 could be with fleshing out the details of the theorem.
- I started with a rough, very informal (and ultimately incorrect) proof sketch, which helped me focus my intuitions. I then tried to start a draft of a formal version of the proof. I discarded drafts and started over several times,