@blockchainian
blockchainian / latency-optimization.md
Created November 2, 2025 15:54
Latency Optimization


This guide covers the core principles you can apply to improve latency across a wide variety of LLM-related use cases. These techniques come from working with a broad range of customers and developers on production applications, so they should apply regardless of what you're building – from a granular workflow to an end-to-end chatbot!

While there are many individual techniques, we'll group them into seven principles meant to represent a high-level taxonomy of approaches for improving latency.

At the end, we'll walk through an example to see how they can be applied.

@blockchainian
blockchainian / techniques_to_improve_reliability.md
Last active October 31, 2025 21:06
Techniques to improve reliability


Theories of reliability

Although the techniques below vary in their approach, they all share the goal of improving reliability on complex tasks. They do this mainly by:

  • decomposing unreliable operations into smaller, more reliable operations (e.g., selection-inference prompting; a minimal sketch follows this list)
  • using multiple steps or multiple relationships to make the system's reliability greater than that of any individual component (e.g., maieutic prompting)
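
As a concrete illustration of the decomposition idea, here is a minimal sketch of selection-inference-style prompting. It assumes the official `openai` Python SDK and an illustrative model name; the `ask` and `answer` helpers are hypothetical, not part of any library.

```python
# Minimal sketch: decompose one unreliable call into two smaller steps
# (selection, then inference). Assumes the `openai` Python SDK (>= 1.0)
# and an OPENAI_API_KEY in the environment; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption: any capable chat model

def ask(prompt: str) -> str:
    """One small, focused call; each step stays simple enough to inspect."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

def answer(question: str, facts: list[str]) -> str:
    # Step 1 (selection): pick only the facts relevant to the question.
    selection = ask(
        "From the facts below, copy only the ones needed to answer the question.\n"
        f"Question: {question}\nFacts:\n" + "\n".join(f"- {f}" for f in facts)
    )
    # Step 2 (inference): reason from the selected facts alone.
    return ask(
        f"Using only these facts:\n{selection}\n\n"
        f"Answer the question, reasoning step by step: {question}"
    )

print(answer("Who is taller, Ann or Bob?",
             ["Ann is 170 cm tall.", "Bob is 180 cm tall.", "Ann likes tea."]))
```

Because each step does one narrow job, a failure is easier to localize than it would be in a single end-to-end prompt.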

Probabilistic graphical models

How to work with large language models

For more prompt examples, visit [OpenAI Examples][OpenAI Examples].

In general, the input prompt is the best lever for improving model outputs. You can try tricks like:

  • Be more specific (instruction): E.g., if you want the output to be a comma-separated list, ask it to return a comma-separated list. If you want it to say "I don't know" when it doesn't know the answer, tell it 'Say "I don't know" if you do not know the answer.' The more specific your instructions, the better the model can respond.
  • Prompt the model to write down the series of steps explaining its reasoning (chain of thought, CoT). If understanding the 'why' behind an answer is important, prompt the model to include its reasoning. This can be done by simply adding a line like "Let's think step by step" before each answer.
  • Provide Context (completion): Help the model understand the bigger picture of your request. This could be background information or examples/demonstrations of what you want. A short sketch combining these tricks follows this list.
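
As a rough sketch only, the example below combines these tricks in one prompt. It again assumes the `openai` Python SDK; the `build_prompt` helper, the model name, and the sample question are made up for illustration.

```python
# Minimal sketch combining the tricks above: a specific output format, an
# explicit "I don't know" instruction, chain-of-thought, and added context.
# Assumes the `openai` Python SDK; the model name and helper are illustrative.
from openai import OpenAI

client = OpenAI()

def build_prompt(question: str, context: str) -> str:
    return (
        f"Context:\n{context}\n\n"                        # provide context
        f"Question: {question}\n"
        "Return the answer as a comma-separated list.\n"  # be specific
        'Say "I don\'t know" if you do not know the answer.\n'
        "Let's think step by step."                       # elicit reasoning (CoT)
    )

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat model works for this sketch
    messages=[{"role": "user", "content": build_prompt(
        "Which of these cities are in Europe?",
        "Cities under discussion: Lisbon, Osaka, Kraków, Nairobi.",
    )}],
)
print(response.choices[0].message.content)
```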

<Short, action-oriented description>

This ExecPlan is a living document. The sections Progress, Surprises & Discoveries, Decision Log, and Outcomes & Retrospective must be kept up to date as work proceeds.

If a PLANS.md file is checked into the repo, reference its path from the repository root here and note that this document must be maintained in accordance with PLANS.md.

Purpose / Big Picture

Explain in a few sentences what someone gains after this change and how they can see it working. State the user-visible behavior you will enable.

Codex Execution Plans (ExecPlans):

This document describes the requirements for an execution plan ("ExecPlan"), a design document that a coding agent can follow to deliver a working feature or system change. Treat the reader as a complete beginner to this repository: they have only the current working tree and the single ExecPlan file you provide. There is no memory of prior plans and no external context.

How to use ExecPlans and PLANS.md

When authoring an executable specification (ExecPlan), follow PLANS.md to the letter. If it is not in your context, refresh your memory by reading the entire PLANS.md file. Be thorough in reading (and re-reading) source material to produce an accurate specification. When creating a spec, start from the skeleton and flesh it out as you do your research.

When implementing an executable specification (ExecPlan), do not prompt the user for "next steps"; simply proceed to the next milestone. Keep all sections up to date, and add or split entries in the list at every stopping point.

@blockchainian
blockchainian / training-data-ep49.md
Last active October 30, 2025 21:54
Effective Codex

From Autocomplete to Agents

Based on the interview, here is what works well:

  1. Write clear, scoped tasks
  • Define the goal clearly: what the PR should accomplish (e.g., “implement X in module Y, add tests”) and the success criteria.
  • Provide context: the relevant parts of the codebase, style guidelines, and the tests you expect.
  • Provide examples of good Pull Requests / templates so the agent learns your standards.
  2. Prepare your codebase