@Tostino
Last active November 1, 2023 19:37

Expandable Primitives for Enhanced LLM Performance

Introduction

This project augments the LLM's capabilities by introducing structured primitives that refine its contextual environment and improve its responses. The primary objectives are to enhance user experience, reduce latency, and improve the accuracy and relevance of responses.

Assumptions

  • Reliable external data sources are available and accessible.
  • The current LLM architecture allows for the integration of expandable primitives.
  • User experience can be significantly improved by reducing latency and enhancing context awareness.

Dependencies

  1. Data Source Integration must precede Scratchpad Buffer Development to ensure data flow.
  2. Advanced Primitives Implementation depends on the successful development of the Scratchpad Buffer.

Metrics & KPIs

  • Reduction in response latency
  • Increase in benchmark scores (HumanEval, MMLU, OpenAI Evals)
  • Decrease in error rates when accessing external data sources

Risks & Mitigations

  • Data Privacy Breach: Regular audits of data sources and end-to-end encryption will mitigate this.
  • System Overload: Implementing adaptive size adjustments and context expiry strategies will help manage this risk.

1. Data Source Integration

  • Identify and curate reliable data sources.
  • Evaluate data sources based on their data privacy policies and practices. Regular evaluations will be conducted every 6 months.
  • Develop a module to connect to external data sources with end-to-end encryption for data transmissions.
  • Implement mechanisms to assess the "trustworthiness" of an external source.
  • Create a protocol to handle data source outages or inconsistencies and communicate them to stakeholders.
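As a sketch of the trustworthiness mechanism above, a source registry might track success and failure counts per source and derive a smoothed trust score. The `DataSource` class and the Laplace-smoothing scheme here are illustrative assumptions, not a specified design:

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    """Hypothetical registry entry for one external data source."""
    name: str
    success_count: int = 0
    failure_count: int = 0

    def record(self, ok: bool) -> None:
        # Record the outcome of one fetch attempt against this source.
        if ok:
            self.success_count += 1
        else:
            self.failure_count += 1

    def trust_score(self) -> float:
        """Laplace-smoothed success rate as a simple trustworthiness proxy.

        Smoothing keeps brand-new sources at a neutral 0.5 instead of 0 or 1.
        """
        total = self.success_count + self.failure_count
        return (self.success_count + 1) / (total + 2)
```

A low score could then trigger the outage/inconsistency protocol described above, e.g. by deprioritizing the source until a re-evaluation.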

2. Scratchpad Buffer Development

  • Design the scratchpad buffer, ensuring compatibility with other system components.
  • Implement an error-logging system within the scratchpad for minor issues.
  • Develop algorithms for adaptive size adjustments based on system load.
  • Create mechanisms to rank or prioritize scratchpad elements.
  • Implement a "context compression" strategy to store more information in less space.
  • Detail the "context expiry" strategy, ensuring irrelevant data is phased out.
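The priority-ranking, adaptive-size, and context-expiry bullets above can be sketched together. Everything here is an assumption about one possible shape of the buffer (entry fields, eviction policy, TTL semantics), not a committed design:

```python
from dataclasses import dataclass

@dataclass
class ScratchEntry:
    key: str
    value: str
    priority: float
    expires_at_step: int

class Scratchpad:
    """Scratchpad buffer with priority-based eviction and step-based expiry."""

    def __init__(self, max_entries: int = 4):
        self.max_entries = max_entries
        self.entries: dict[str, ScratchEntry] = {}

    def put(self, key: str, value: str, priority: float,
            step: int, ttl: int = 100) -> None:
        self.entries[key] = ScratchEntry(key, value, priority, step + ttl)
        # Adaptive size adjustment: evict lowest-priority entries over capacity.
        while len(self.entries) > self.max_entries:
            worst = min(self.entries.values(), key=lambda e: e.priority)
            del self.entries[worst.key]

    def expire(self, step: int) -> None:
        # Context expiry: phase out entries whose TTL has elapsed.
        for key in [k for k, e in self.entries.items()
                    if e.expires_at_step <= step]:
            del self.entries[key]
```

"Context compression" (summarizing entries to save space) would slot in as an extra step inside `put` and is omitted here.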

3. Advanced Primitives Implementation

  • Develop a stack-formatted section to store metadata about the current context or task. This section will be optimized for quick retrievals.
  • Implement a history tracker that retains past value changes and the corresponding process step, ensuring data older than a year is archived or deleted.
  • Add a task priority indicator.
  • Create a contextual depth indicator for complex tasks.
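A minimal sketch of the stack-formatted metadata section, combining the task priority and contextual depth indicators from the bullets above. The frame fields and class name are illustrative assumptions:

```python
class ContextStack:
    """Stack-formatted store of task metadata (task name, priority, depth)."""

    def __init__(self):
        self._frames: list[dict] = []

    def push(self, task: str, priority: int = 0, depth: int = 0) -> None:
        self._frames.append({"task": task, "priority": priority, "depth": depth})

    def pop(self) -> dict:
        return self._frames.pop()

    def current(self):
        # Top of stack is the active context; O(1) for quick retrieval.
        return self._frames[-1] if self._frames else None
```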

4. Context Retrieval & Injection

  • Develop methods to pull relevant context based on the current user query.
  • Design a strategy for when context retrieval fails. In such cases, the system will revert to a default response or ask the user for more clarity.
  • Implement a caching mechanism for frequently accessed context.
  • Monitor and manage the frequency of external queries, ensuring no single source is overburdened.
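The caching bullet above could be realized as a simple LRU cache keyed by query, which also reduces pressure on external sources. This is a sketch under the assumption that exact-match query keys are sufficient (a real system would likely use semantic similarity):

```python
from collections import OrderedDict

class ContextCache:
    """LRU cache for frequently accessed context (illustrative sketch)."""

    def __init__(self, capacity: int = 128):
        self.capacity = capacity
        self._data: OrderedDict[str, str] = OrderedDict()

    def get(self, query: str):
        if query in self._data:
            self._data.move_to_end(query)   # mark as recently used
            return self._data[query]
        return None                          # miss: fall back to retrieval

    def put(self, query: str, context: str) -> None:
        self._data[query] = context
        self._data.move_to_end(query)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
```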

5. User Experience & Latency Management

  • Assess user experience implications of added latency. A feedback loop will be established for users to report lag.
  • Optimize network calls and database accesses.
  • Design a user-friendly interface for developers.
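The latency feedback loop above could be backed by a rolling-window monitor. The 500 ms budget and window size here are placeholder assumptions, not requirements from this plan:

```python
from collections import deque

class LatencyMonitor:
    """Rolling-window latency tracker for the user feedback loop (sketch)."""

    def __init__(self, window: int = 100, threshold_ms: float = 500.0):
        self.samples = deque(maxlen=window)   # keeps only the last `window` samples
        self.threshold_ms = threshold_ms

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def average_ms(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def over_budget(self) -> bool:
        # Flag when average latency exceeds the assumed budget,
        # e.g. to trigger optimization of network calls or DB accesses.
        return self.average_ms() > self.threshold_ms
```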

6. Reward Function & Training Adjustments

  • Refine the reward function considering time, scratchpad efficiency, and token usage.
  • Train the LLM to discern task-based outputs and direct commands.
  • Use a Mixture-of-Experts (MoE) architecture with a gating router (possibly using top-k weighted averaging).
    • This routing network should be part of the training pipeline and should be easily retrainable when adding new expert LoRAs.
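The refined reward function above might combine the three named factors as a weighted sum. The weights, normalizers, and signature below are purely illustrative assumptions to show the shape of the trade-off, not tuned values:

```python
def reward(task_success: float, steps_used: int, scratchpad_hits: int,
           tokens_used: int, max_steps: int = 50, max_tokens: int = 2048) -> float:
    """Hypothetical reward: success minus time and token costs,
    plus a small bonus for effective scratchpad use."""
    time_penalty = steps_used / max_steps        # time factor
    token_penalty = tokens_used / max_tokens     # token-usage factor
    scratch_bonus = min(scratchpad_hits, 5) * 0.02  # capped efficiency bonus
    return task_success - 0.2 * time_penalty - 0.1 * token_penalty + scratch_bonus
```

A real training setup would learn or tune these coefficients rather than fix them by hand.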

Prompt template spec:

# Prompt Template Specification

The Prompt Template is designed to standardize and enhance interactions with the LLM, making its operation more transparent and effective. By structuring your interactions using this template, you can tap into advanced capabilities, control response behavior, and ensure robust context preservation.

## Sections Overview

1. **System**: Defines the system's nature.
2. **Task History**: Logs previous tasks with timestamps or step counters.
3. **Scratchpad**: Workspace for the LLM's intermediate thoughts.
4. **Command Log**: Tracks user-system command interactions.
5. **Chat**: Logs the direct user-system chat with step counters.
6. **Time Awareness**: Provides the system's awareness of real-world time and processing steps.
7. **Error Logs**: Logs any encountered errors.
8. **Output**: Captures the active thought process and instructions for external processes.

## Detailed Breakdown

### 1. `<#system#>`
   - **Purpose**: A constant reminder of the system's identity and role.
   
### 2. `<#task_history#>`
   - **Purpose**: To keep track of completed tasks.
   - **Usage**: Add tasks in order of completion with their respective step counters.

### 3. `<#scratchpad#>`
   - **Purpose**: A workspace for the LLM to jot down intermediate thoughts, computations, or details.
   - **Usage**: Record observations or calculations, preferably with step counters.

### 4. `<#command_log#>`
   - **Purpose**: To maintain a record of commands executed.
   - **Usage**: Log commands in reverse chronological order.

### 5. `<#chat#>`
   - **Purpose**: To keep a structured record of the chat.
   - **Usage**: Store user and bot messages with accompanying step counters.

### 6. `<#time_awareness#>`
   - **Purpose**: Provide a time context.
   - **Usage**: Update with real-world time and current step number.

### 7. `<#error_logs#>`
   - **Purpose**: For debugging and error tracking.
   - **Usage**: Log any issues or errors encountered with step counters, if possible.

### 8. `<#output#>`
   - **Purpose**: To guide external processes and capture active thought.
   - **Usage**: Utilize special tags to signal different types of output actions.

## Commands

- <#respond#>: Direct response sent to the <#bot#> section of the chat log. A stop token emitted before the end tag indicates a multi-part response that should be appended to the same bot message.
     - This indicates that you should respond as the <#bot#> in the <#chat#> section with your subsequent message, so all text will be routed to that section with a step counter reference.
- <#scratch#ttl=100>: Signal to store the response within this tag for later use in the scratchpad, with a time-to-live of N generation steps.
     - This output is routed to the <#scratchpad#> area, and a <#command_log#> entry records that the scratchpad entry (referenced by ID) was created/updated/deleted at [step counter].
- <#scratch_expire#>: Signal to expire specific information from the scratchpad immediately.
     - A <#command_log#> entry records that the scratchpad entry (referenced by ID) was expired at [step counter].
- <#fetch#ttl=100>: Request to retrieve certain data or information for later use in the scratchpad, with a time-to-live of N generation steps.
     - A <#command_log#> entry records that data was fetched from memory (referenced by ID) at [step counter].
- <#update#field=[specific set of fields]>: Indication to update a certain part of the system or data with the value within this action for the specific field.
     - A <#command_log#> entry records that the scratchpad entry (referenced by ID) was updated at [step counter].
- <#restart_response#>: Trigger removal of the in-progress response from the <#bot#> section.
     - A <#command_log#> entry records that the response was restarted at [step counter].
- <#restart_output#>: Signal to remove the current incomplete output response.
     - A <#command_log#> entry records that the output was restarted at [step counter].

### Things to know
- Each command gets logged to the command log
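The command tags above can be extracted mechanically before routing and logging. This parser is a sketch; the regex, function name, and return shape are assumptions, since the spec does not define a parsing strategy (and it handles only the single `key=value` parameter form shown above):

```python
import re

# Matches <#name#>body<#/name#> and <#name#key=value>body<#/name#>;
# the body (and closing tag) is optional for bare commands like <#restart_output#>.
TAG_RE = re.compile(r"<#(\w+)(?:#(\w+)=(\w+))?#?>(?:(.*?)<#/\1#>)?", re.DOTALL)

def parse_commands(output: str):
    """Extract (command, params, body) triples from an <#output#> block."""
    commands = []
    for name, key, val, body in TAG_RE.findall(output):
        params = {key: val} if key else {}
        commands.append((name, params, body or None))
    return commands
```

Each parsed triple would then be dispatched to its target section and appended to the <#command_log#>.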

Prompt template example:

## Prompt Template

<#system#>
You are a helpful AI assistant.
<#/system#>

<#task_history#>
- [Step_13]: Completed task of checking for software updates.
- [Step_1]: [Respond to user query about project plan]
<#/task_history#>

<#scratchpad#>
- [Step_15]: User seems to be interested in project planning.
- [Step_16]: Previous responses to such queries have included project milestones and dependencies.
<#/scratchpad#>

<#command_log#>
- [Step_18][running]: [Respond to project plan query]
- [Step_13]: [Think about solution and save it to scratchpad]
- [Step_10]: [Open the user's project document]
<#/command_log#>

<#chat#>
<#entry_Step_1#>
<#user#>
Project planning seems like an interesting career path.
<#bot#>
[response to user]
<#entry_Step_18#>
<#user#>
Tell me more about project planning?
<#bot#>
[waiting on response to be built]
<#/chat#>

<#time_awareness#>
- Wall-clock time: [4:30 PM]
- Process step: [Step_18]
<#/time_awareness#>

<#error_logs#>
- [Step_10]: Connection error accessing external data source.
<#/error_logs#>

<#output#>
<#respond#>Project planning involves setting clear goals, defining tasks, and allocating resources.<#/respond#>
<#/output#>