Goal: not to become an enterprisey tool - so probably no fancy collaboration or auto-evaluation by the LLM itself. Focus instead on locally hosted open-source LLM models, and on use cases like tool use (retrieval-augmented chatbots), document ingestion, and agents.
- Automatic output parsing and logging
- Prompt version control
- Variations - try concurrently
- Experiment - models and params... dataset?
- Allow rerunning/sampling multiple outputs to check robustness/reliability (a sketch of the per-run record follows this list)
- Show rendered prompt
- Set default choice for inputs
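
A rough sketch of what one logged run could store, covering the auto-parsing/logging and multi-sample ideas above. All names here are hypothetical, not an existing API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RunRecord:
    """One logged LLM call; an experiment stores many of these."""
    prompt_version: str   # pinned version of the prompt template
    rendered_prompt: str  # the final prompt actually sent to the model
    model: str            # e.g. "llama-2-13b" served locally
    params: dict          # temperature, max_tokens, ...
    raw_output: str
    parsed_output: dict   # result of automatic output parsing
    sample_index: int = 0 # nth sample when rerunning for robustness
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
```

Sampling the same configuration several times then just means several records that differ only in `sample_index`.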
How about these (in other products):
- Compare results side by side
- Prompt Version diff
- Generate dataset from chat (interactive discovery)
- Input values
- Rich input data structure with editor UI
- Components/inheritance
- Version pinning, and/or specifying multiple versions to instantiate an experiment
- Quick extract
- Guidance
- Continuation
- Union of sets
  - Each set covers one type of adjustable thing
  - Each set can be "Try all", "Constant", or "Try subset" (see the sketch after this list)
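
The "Try all" / "Constant" / "Try subset" sets map naturally onto a Cartesian product over the adjustable axes. A minimal sketch with hypothetical names (in this reading, "Try subset" is just "Try all" over a user-picked subset):

```python
from itertools import product

def expand(axes: dict) -> list[dict]:
    """axes maps a name to ("all", values), ("subset", values),
    or ("constant", value); returns one dict of choices per run."""
    names, pools = [], []
    for name, (mode, value) in axes.items():
        names.append(name)
        pools.append([value] if mode == "constant" else list(value))
    return [dict(zip(names, combo)) for combo in product(*pools)]

runs = expand({
    "prompt": ("subset", ["v3", "v4-branch"]),
    "model": ("all", ["llama-2-7b", "mistral-7b"]),
    "params": ("constant", {"temperature": 0.7}),
})
assert len(runs) == 4  # 2 prompts x 2 models x 1 param set
```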
Typical authoring workflow:
- Draft initial prompt
  - Can extract logical units
  - Structured inputs
- First manual run
- Try variations
- Branch (version tree sketched after this list)
- Then "Run Experiment"
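
Branching and version pinning suggest a parent-pointer tree over prompt versions, roughly like lightweight git commits. A sketch under that assumption (the hashing scheme is illustrative only):

```python
from __future__ import annotations
from dataclasses import dataclass
from hashlib import sha256

@dataclass(frozen=True)
class PromptVersion:
    text: str
    parent: PromptVersion | None = None  # None for the initial draft
    tag: str | None = None               # e.g. "accepted", "v1.2"

    @property
    def version_id(self) -> str:
        """Content-addressed id, so experiments can pin exact versions."""
        parent_id = self.parent.version_id if self.parent else ""
        return sha256((parent_id + self.text).encode()).hexdigest()[:12]

root = PromptVersion("Summarize the document:\n{document}")
branch_a = PromptVersion(root.text + "\nUse bullet points.", parent=root)
branch_b = PromptVersion(root.text + "\nUse one paragraph.", parent=root)
# Two branches share a parent; an experiment stores a version_id.
```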
Later during review:
- Can view past experiments
- Experiment pinned to specific version of prompt
- Can choose a "good" result and resume from there (load all states)
- Can modify based on experiment results
  - e.g. token limit reached -> ask the model to extend the answer (i.e. relax the token limit; sketched after this list)
- Can manually intervene/edit
- Quick extract for continuation
- Can save a dataset for fine-tuning
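
The token-limit case above could work by checking the finish reason and re-asking with the partial answer fed back in. A sketch against an OpenAI-compatible chat endpoint, which many local runners expose; the URL and model name are assumptions:

```python
import requests

API = "http://localhost:8000/v1/chat/completions"  # assumed local server

def complete(messages: list[dict], max_tokens: int) -> dict:
    r = requests.post(API, json={
        "model": "local-model",  # assumption: whatever model is loaded
        "messages": messages,
        "max_tokens": max_tokens,
    })
    r.raise_for_status()
    return r.json()["choices"][0]

def run_with_continuation(messages: list[dict],
                          max_tokens: int = 256, max_rounds: int = 3) -> str:
    """Ask for a continuation whenever the model stops on the token limit."""
    text = ""
    for _ in range(max_rounds):
        choice = complete(messages, max_tokens)
        text += choice["message"]["content"]
        if choice["finish_reason"] != "length":  # finished normally
            break
        # Token limit reached: feed the partial answer back and continue.
        messages = messages + [
            {"role": "assistant", "content": text},
            {"role": "user", "content": "Continue the answer."},
        ]
    return text
```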
- Prompt suggestion ability
  - Search on prompthub
  - AI suggestion based on a magic prompt
  - Insert prompt snippet
- Data auto-generation
  - Let the LLM do it (!) (see the sketch after this list)
- Data import
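
For "let the LLM do it", one workable pattern is to ask the model for one JSON object per line and keep only the rows that parse, since LLM output is noisy. A sketch; the keys and the surrounding generation prompt are made up for illustration:

```python
import json

GEN_PROMPT = """Generate {n} test inputs for the prompt below.
Output one JSON object per line with keys "document" and "question".

Prompt under test:
{prompt}"""

def parse_generated_rows(raw: str) -> list[dict]:
    """Keep only the lines that parse as JSON objects."""
    rows = []
    for line in raw.splitlines():
        line = line.strip()
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            rows.append(obj)
    return rows
```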
Once we're satisfied with the result:
- Publish results in a fixed prompt + dataset.
- Have a separate UI (playground) that lets other users try it out quickly.
- Export to download a file containing all relevant data
  - So that it can be used elsewhere, e.g. in an LLM application development framework (one possible format sketched below).
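
The export could be one self-describing JSON file bundling the pinned prompt, defaults, and dataset so another framework can replay it. One possible shape (the fields are assumptions, not a standard):

```python
import json

bundle = {
    "schema_version": 1,
    "prompt": {
        "version_id": "a1b2c3d4e5f6",  # pinned prompt version
        "template": "Summarize:\n{document}",
        "input_variables": ["document"],
    },
    "default_model": "llama-2-13b",
    "default_params": {"temperature": 0.7, "max_tokens": 512},
    "dataset": [
        {"document": "First example text..."},
        {"document": "Second example text..."},
    ],
}

with open("prompt_bundle.json", "w") as f:
    json.dump(bundle, f, indent=2)
```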
When running larger experiments, we may want to outsource evaluation to a team of human raters. It would be nice to have a separate part of the tool to manage this.
(Integration with Scale AI?)
UI ideas:
- Separate page
  - Category and keyword
  - Display versions + branches
  - List tagged versions
- New Prompt
  - Basic options (empty, from template)
  - Wizard to guide beginners (choose a base template + content from AI or prompthub)
- Two-column panel
  - Left: Edit prompts or data (more later)
  - Right: Experiment results
    - Filtered list of experiments (can change filter criteria)
    - Accordion to expand individual experiments
    - Alternative: competition view of experiments
      - Current accepted choice on top
      - Candidates at bottom
- Has other panels
- Run Experiment
  - Using defaults
  - Use last config
  - New Experiment (full config)
- Set Models and Params here
  - Can have multiple named instances
  - Must have one default model and default params (used for quick runs; sketched at the end of this section)
- Root Prompt and subprompts with tree view
- List view of variations
  - Preview of the rendered prompt?
- List view of all input variables
  - Natural text or structured data
  - Advanced JSON editor for structured data
- Dataset (with a default; also list view)
- Big table
  - Each row is an adjustable input type (prompt variants, dataset, model, params)
  - First column: run default, run specifics (ad hoc value?)
  - Second column: filterable dropdown to choose (can also enter text + autocomplete)
- Show the compiled prompt template
  - Realtime update showing the actual rendered final prompt as the user enters values for the variables? (see the sketch below)
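
For the realtime preview, Python's string.Template.safe_substitute is a handy model of the behavior we'd want: variables the user hasn't filled in yet stay visible as placeholders instead of raising. A minimal sketch (the tool's actual template syntax may differ):

```python
from string import Template

template = Template("You are a $role.\nAnswer using $document.")

# Partial input: the preview still renders, leaving $document as-is.
print(template.safe_substitute({"role": "helpful librarian"}))
# You are a helpful librarian.
# Answer using $document.
```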
- Simplified input
  - Each variable has a textfield/JSON editor
  - Dropdown to choose an example value
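
Finally, a sketch of the "named model/param instances, exactly one default" rule from the Run Experiment config above (all names hypothetical):

```python
from dataclasses import dataclass

@dataclass
class ModelInstance:
    name: str                 # user-chosen label, e.g. "fast-draft"
    model: str                # e.g. "mistral-7b"
    params: dict              # temperature, max_tokens, ...
    is_default: bool = False  # the instance used for quick runs

def default_instance(instances: list[ModelInstance]) -> ModelInstance:
    """Quick runs need exactly one default instance to fall back on."""
    defaults = [i for i in instances if i.is_default]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly 1 default, got {len(defaults)}")
    return defaults[0]
```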