@donbr
donbr / ragas-implementation-guide.md
Last active April 25, 2025 02:34
RAGAS Implementation Guide

This guide provides a streamlined approach to implementing RAGAS evaluation while managing OpenAI API rate limits effectively. It's designed to be straightforward, visual, and actionable.

Quick Overview

RAGAS (Retrieval Augmented Generation Assessment) is a framework for evaluating RAG systems with:

  • Objective metrics without human annotations
  • Synthetic test data generation
  • Comprehensive evaluation workflows
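As an illustration of the first two points, here is a minimal sketch of the record shape a RAGAS-style evaluation consumes. The field names (`user_input`, `retrieved_contexts`, `response`, `reference`) follow recent RAGAS conventions; the actual `evaluate()` call needs an LLM API key, so it appears only as a comment, and the validation helper is purely illustrative.

```python
# Sketch of the record shape a RAGAS-style evaluation consumes.
# Field names follow recent RAGAS conventions; the evaluate() call
# itself requires an LLM API key, so it is shown only as a comment.

samples = [
    {
        "user_input": "What is RAGAS?",
        "retrieved_contexts": [
            "RAGAS is a framework for evaluating RAG pipelines."
        ],
        "response": "RAGAS is an evaluation framework for RAG systems.",
        "reference": "RAGAS evaluates RAG pipelines with objective metrics.",
    }
]

def validate_samples(rows):
    """Check that each record carries the fields the metrics expect."""
    required = {"user_input", "retrieved_contexts", "response", "reference"}
    for row in rows:
        missing = required - row.keys()
        if missing:
            raise ValueError(f"sample missing fields: {missing}")
    return len(rows)

# With credentials configured, evaluation would look roughly like:
#   from ragas import evaluate, EvaluationDataset
#   from ragas.metrics import Faithfulness, AnswerRelevancy
#   evaluate(EvaluationDataset.from_list(samples),
#            metrics=[Faithfulness(), AnswerRelevancy()])

print(validate_samples(samples))
```

Validating the dataset shape up front is cheap and catches schema drift before any rate-limited API calls are spent.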
@donbr
donbr / rag-agent-evaluation.md
Last active April 24, 2025 18:55
Evaluating LLM Applications: From RAG to Agents with Ragas

1. Introduction

Large Language Models (LLMs) have revolutionized AI applications by enabling natural language understanding and generation capabilities. However, as these applications grow more sophisticated, ensuring their quality, reliability, and accuracy becomes increasingly challenging. Two key architectures in the LLM ecosystem are Retrieval-Augmented Generation (RAG) systems and LLM-powered agents.

This guide introduces the concepts of RAG systems and agents, explains their relationship, and presents the Ragas framework for evaluating their performance. We'll explore examples from two practical implementations: evaluating a RAG system and evaluating an agent application.

2. Understanding RAG Systems

@donbr
donbr / langsmith-prompt-versioning-rag.md
Last active April 24, 2025 05:43
Managing and Versioning Prompts in LangSmith for RAG Systems

1. Introduction

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for enhancing Large Language Models (LLMs) with external knowledge [4]. At the heart of any effective RAG system lies well-crafted prompts that guide the retrieval and generation processes. As RAG systems move from development to production, managing these prompts becomes increasingly complex.

Prompt engineering for RAG systems presents unique challenges:

  • Context-sensitivity: RAG prompts must effectively incorporate retrieved information
  • Multi-step processes: Many RAG systems involve multiple prompts for different stages (query analysis, retrieval, generation)
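To make the multi-stage versioning idea concrete, here is a toy in-memory registry. This is deliberately not the LangSmith API — LangSmith provides prompt pushing and pulling through its SDK — but the sketch shows the core concept: each RAG stage (query analysis, retrieval, generation) has a named prompt with an immutable version history.

```python
# Toy in-memory prompt registry illustrating the versioning concept.
# This is NOT the LangSmith API; it only demonstrates tracking
# versioned prompts per RAG stage.

class PromptRegistry:
    def __init__(self):
        self._store = {}  # name -> list of versions (index = version - 1)

    def push(self, name, template):
        """Store a new version of a prompt; returns its version number."""
        self._store.setdefault(name, []).append(template)
        return len(self._store[name])

    def pull(self, name, version=None):
        """Fetch a specific version, or the latest if none is given."""
        versions = self._store[name]
        return versions[-1] if version is None else versions[version - 1]

registry = PromptRegistry()
registry.push("rag-generation", "Answer using only: {context}\n\nQ: {question}")
v2 = registry.push("rag-generation", "Use the context below.\n{context}\nQ: {question}")

print(v2)
print(registry.pull("rag-generation", 1))
```

Pinning a production pipeline to an explicit version, while development pulls the latest, is the same pattern LangSmith's hosted prompt hub enables.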
@donbr
donbr / ragas-langsmith-evaluation.md
Created April 23, 2025 05:16
Synthetic Data Generation & RAG Evaluation: RAGAS + LangSmith

Introduction

Retrieval-Augmented Generation (RAG) has emerged as a powerful approach for enhancing Large Language Models (LLMs) with external knowledge. However, evaluating RAG pipelines presents significant challenges due to the complexity of retrieval quality, generation accuracy, and the overall coherence of responses. This document provides a comprehensive analysis of using RAGAS (Retrieval Augmented Generation Assessment) for synthetic test data generation and LangSmith for RAG pipeline evaluation, based on the Jupyter notebook example provided.

What is RAG?

Retrieval-Augmented Generation is a technique that enhances LLMs by providing them with relevant external knowledge. A typical RAG system consists of two main components[1]:
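The two components are commonly a retriever, which selects relevant documents, and a generator, which produces the answer from them. A toy sketch of that flow, with keyword-overlap retrieval and a template standing in for real vector search and an LLM:

```python
# Minimal sketch of the two components of a RAG system: a retriever
# that selects relevant documents and a generator that produces an
# answer from them. Real systems use vector search and an LLM; here
# keyword overlap and a template stand in so the flow stays visible.
import re

DOCS = [
    "RAG combines retrieval with generation.",
    "LLMs are trained on large text corpora.",
    "Vector databases store document embeddings.",
]

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query (toy retriever)."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def generate(query, contexts):
    """Template stand-in for the LLM generation step."""
    return f"Answer to {query!r}, grounded in: {' '.join(contexts)}"

ctx = retrieve("what is retrieval augmented generation", DOCS)
print(generate("what is RAG?", ctx))
```

Evaluating such a system means scoring both stages: did `retrieve` surface the right context, and did `generate` stay faithful to it — exactly the split RAGAS's metrics target.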

@donbr
donbr / discovery-proposal-template.md
Created April 22, 2025 22:14
Discovery Phase & Project Proposal Template

Discovery Phase & Project Proposal Template v0.1

License Notice: This template is provided under the Creative Commons Attribution 4.0 International License. It incorporates elements from SixArm/consulting-agreement (CC-BY-SA-4.0) and shimon/consulting-contract-template (Public Domain).

DISCLAIMER: This template is for informational purposes only and does not constitute legal advice. Consult with a qualified legal professional before using this template for your business.

Table of Contents

  1. Discovery Phase Agreement
  2. Discovery Phase Worksheet
@donbr
donbr / ragas-overview.md
Last active April 22, 2025 05:21
RAGAS: A Comprehensive Framework for RAG Evaluation and Synthetic Data Generation

Abstract

Retrieval-Augmented Generation (RAG) systems have emerged as a powerful approach for enhancing Large Language Models (LLMs) with domain-specific knowledge. However, evaluating these systems poses unique challenges due to their multi-component nature and the complexity of assessing both retrieval quality and generation faithfulness. This paper provides a comprehensive examination of RAGAS (Retrieval Augmented Generation Assessment), an open-source framework that addresses these challenges through reference-free evaluation metrics and sophisticated synthetic data generation. RAGAS distinguishes itself through its knowledge graph-based approach to test set generation and specialized query synthesizers that simulate diverse query types. We analyze its capabilities, implementation architecture, and comparative advantages against alternative frameworks, while also addressing current limitations and future research directions.

@donbr
donbr / mcp-llmstxt-config-guide.md
Last active April 22, 2025 03:19
Configuring MCP for llms.txt Files in Claude Desktop and Cursor

Understanding llms.txt and MCP

Before configuring your MCP clients, it's important to understand the two components involved:

  1. llms.txt: A website index format that provides background information, guidance, and links to detailed documentation for LLMs. As described in the LangChain documentation, llms.txt is "an index file containing links with brief descriptions of the content"[1]. It acts as a structured gateway to a project's documentation.

  2. MCP (Model Context Protocol): A protocol enabling communication between AI agents and external tools, allowing LLMs to discover and use various capabilities. As stated by Anthropic, MCP is "an open protocol that standardizes how applications provide context to LLMs"[2].
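For orientation, Claude Desktop registers MCP servers in its `claude_desktop_config.json` under a documented `mcpServers` key. The entry below is a sketch only: the server name `llms-txt-docs` is a placeholder, and the `uvx`/`mcpdoc` command and arguments are assumptions based on the langchain-ai `mcpdoc` project's README — substitute whichever llms.txt MCP server you actually use.

```json
{
  "mcpServers": {
    "llms-txt-docs": {
      "command": "uvx",
      "args": [
        "--from", "mcpdoc", "mcpdoc",
        "--urls", "LangGraph:https://langchain-ai.github.io/langgraph/llms.txt",
        "--transport", "stdio"
      ]
    }
  }
}
```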

@donbr
donbr / llms-txt-article.md
Created April 21, 2025 15:18
llms txt article

llms.txt: The New Standard Bridging Websites and AI

In today's digital landscape, Large Language Models (LLMs) like ChatGPT, Claude, and Gemini constantly navigate the web to gather information and provide answers. But there's a fundamental problem: websites were designed for human consumption, not AI understanding. From complex HTML structures to JavaScript-heavy interfaces, LLMs often struggle to extract meaningful content from the modern web.

Enter llms.txt – a proposed web standard that could revolutionize how AI systems interact with online content.

What Is llms.txt?

Proposed in September 2024 by Jeremy Howard, co-founder of Answer.AI, llms.txt is a markdown-formatted file placed at a website's root directory (e.g., example.com/llms.txt)[^1]. This standardized file provides concise, structured information and links to detailed content, designed specifically to help language models better understand and navigate websites[^2].
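A minimal file following the proposed structure — an H1 project name, a blockquote summary, then H2 sections of link lists — might look like this (all names and URLs below are placeholders, not part of any real site):

```markdown
# Example Project

> Example Project is a hypothetical library; this one-line summary
> helps an LLM decide whether to read further.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Install and first steps
- [API Reference](https://example.com/docs/api.md): Full function reference

## Optional

- [Changelog](https://example.com/changelog.md): Release history
```

The `## Optional` section is part of the proposal: it marks links an LLM may skip when context is tight.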

HOA PDF Chatbot: Solution Architecture v0.1

Date: April 21, 2025

NOTE: I created a sample solution architecture document primarily for discussion purposes, covering different aspects of the overall solution.

  • One of the key things I wanted to validate in this document was whether the LLM was effectively using an indexed version of the LangChain / LangGraph documentation. Apparently it did not, but the document is a good starting point to iterate on.
  • A number of the selected solutions wouldn't necessarily be my first or second choice, but I left them as is rather than substituting a personal favorite.
  • I don't want to bias discussions; I want to find out what a prospective client already uses and what they're familiar with, along with their price point.

Executive Summary

@donbr
donbr / dfd-json-standards-paper.md
Created April 21, 2025 15:36
Standards Similar to DFDL for Converting Documents to JSON

Standards Similar to DFDL for Converting Documents to JSON

1. Introduction

In today's interconnected digital landscape, data exchanges between diverse systems necessitate effective transformation mechanisms. Organizations frequently need to convert data between different formats to ensure interoperability and seamless information flow. The Data Format Description Language (DFDL) has emerged as a powerful standard for modeling and describing text and binary data formats in a standardized way. This capability is crucial for legacy systems integration, data migration, and modern API interfaces.

JSON (JavaScript Object Notation) has become the de facto standard for data exchange in web applications, cloud services, and APIs due to its simplicity, human readability, and widespread support across programming languages. Converting various document formats to JSON is therefore a common requirement in many integration scenarios.
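As a concrete illustration of the kind of transformation DFDL-style tools automate, consider parsing a fixed-width text record into JSON. A DFDL processor would drive this from a declarative schema; in this sketch the column layout is hard-coded (and invented for the example) to keep it self-contained.

```python
# Parsing a fixed-width text record into JSON -- the kind of mapping a
# DFDL processor derives from a declarative schema. The column layout
# here is an assumed example format, not a real DFDL schema.
import json

# (field, start, end) column positions for one record
LAYOUT = [("id", 0, 6), ("name", 6, 21), ("amount", 21, 29)]

def record_to_json(line):
    """Parse one fixed-width record into a JSON string."""
    obj = {field: line[start:end].strip() for field, start, end in LAYOUT}
    obj["amount"] = float(obj["amount"])  # simple type conversion
    return json.dumps(obj)

line = "000042" + "Ada Lovelace".ljust(15) + "19.99".rjust(8)
print(record_to_json(line))
```

What DFDL adds over ad hoc code like this is that the layout, types, and conversions live in a standard, machine-readable schema rather than in the parser itself.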

While DFDL provides a robust framework for describing and parsing diverse data formats