
@donbr
Last active June 5, 2024 15:44
RAG from Scratch

Summary of LangChain's RAG from scratch video series

Source: https://www.youtube.com/playlist?list=PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x

Video Summary Links

  • Part 1 (Overview): Introduces RAG, outlining the series from basic concepts to advanced techniques. (Code, Slides)
  • Part 2 (Indexing): Focuses on the indexing process, crucial for retrieval accuracy and speed. (Code, Slides)
  • Part 3 (Retrieval): Discusses document search using an index for precision in retrieval. (Code, Slides)
  • Part 4 (Generation): Explores RAG prompt construction for answer generation through LLMs. (Code, Slides)
  • Part 5 (Multi Query): Explains multi-query rewriting for diverse document retrieval. (Slides, Code)
  • Part 6 (RAG Fusion): Introduces RAG-fusion, combining multiple retrieval results for improved ranking. (Slides, Code, Reference)
  • Part 7 (Decomposition): Discusses breaking down complex questions into sub-questions for nuanced answering. (Slides, Code, References)
  • Part 8 (Step Back): Explores step-back prompting for generating abstract questions that lead to more fundamental understanding. (Slides, Code, Reference)
  • Part 9 (HyDE): Introduces HyDE for generating hypothetical documents that align better with indexed documents. (Slides, Code, Reference)
  • Part 10 (Routing): Focuses on logical and semantic query routing for directing queries to relevant data sources. (Notebook, Slides)
  • Part 11 (Query Structuring): Covers converting natural language queries into structured queries for efficient database interaction. (Code, References)
  • Part 12 (Multi-Representation Indexing): Discusses indexing document summaries for retrieval while linking them to full documents for comprehensive understanding. (Code, References)
  • Part 13 (RAPTOR): Introduces RAPTOR for summarizing and clustering documents to capture high-level concepts. (Code, References)
  • Part 14 (ColBERT): Explores ColBERT for enhanced token-based retrieval within RAG frameworks. (Code, References)

Final Draft: Comprehensive Review of the RAG From Scratch Video Series

Introduction

The RAG From Scratch video series offers a deep dive into Retrieval Augmented Generation (RAG), a powerful approach that combines the strengths of large language models (LLMs) with external data sources. This report provides a detailed summary and analysis of each video in the series, highlighting key concepts, strengths, weaknesses, and practical applications.

Part 1 (Overview)

  • Introduces RAG as a method to connect LLMs with external data sources, enabling access to up-to-date and domain-specific information.
  • Outlines the three main stages of RAG: indexing, retrieval, and generation.
  • Strengths: Enhances LLMs' knowledge and generates more relevant answers.
  • Weaknesses: Requires careful design and optimization, which can be complex and resource-intensive.
  • Practical Applications: RAG can be applied in various domains, such as customer support, research, and content generation, where access to current and specialized information is crucial.

Part 2 (Indexing)

  • Focuses on the indexing process: loading documents, splitting them into chunks, and creating embeddings for efficient retrieval.
  • Discusses traditional (e.g., TF-IDF) and modern (e.g., transformer-based) embedding methods.
  • Strengths: Proper indexing is crucial for accurate and fast retrieval, impacting the quality of generated answers.
  • Weaknesses: Indexing large document collections can be time-consuming and computationally expensive.
  • Practical Applications: Efficient indexing techniques can be applied in search engines, recommendation systems, and content management platforms to enable fast and accurate information retrieval.
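
The split step of the indexing pipeline can be sketched as a minimal fixed-size splitter with overlap (a simplified stand-in for LangChain's text splitters; the chunk_size and overlap values are illustrative):

```python
def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, overlapping so ideas are not cut cleanly in two."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

In a real pipeline each chunk would then be embedded and written to a vector store.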

Part 3 (Retrieval)

  • Covers the retrieval process, focusing on semantic similarity search in high-dimensional embedding space.
  • Explains k-nearest neighbor (KNN) search for retrieving relevant documents.
  • Strengths: Efficient retrieval algorithms like KNN enable fast and accurate identification of relevant documents.
  • Weaknesses: Effectiveness depends on the quality of embeddings and similarity metrics, which may require domain-specific fine-tuning.
  • Practical Applications: Retrieval techniques can be used in question-answering systems, chatbots, and information retrieval platforms to provide relevant information based on user queries.
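
The KNN search described above can be sketched with plain cosine similarity over toy embedding vectors (the document ids and vectors below are made up for illustration):

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def knn_search(query_vec: list[float], doc_vecs: dict, k: int = 2) -> list[str]:
    """Return the ids of the k documents whose embeddings are closest to the query."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]
```

Production systems use approximate nearest-neighbor indexes (e.g., HNSW) rather than this exhaustive scan, but the ranking principle is the same.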

Part 4 (Generation)

  • Discusses the generation process, introducing prompt templates populated with retrieved documents and the original query.
  • Explains how LLMs generate answers based on the provided context.
  • Strengths: RAG allows LLMs to generate answers grounded in retrieved documents, providing more accurate and context-aware responses.
  • Weaknesses: The quality of generated answers depends on the relevance of retrieved documents and the LLM's ability to synthesize information effectively.
  • Practical Applications: RAG can be applied in virtual assistants, knowledge bases, and content creation tools to generate informative and contextually relevant responses.
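
The prompt-construction step can be sketched as follows; the template wording is illustrative rather than the exact one used in the series:

```python
RAG_PROMPT = (
    "Answer the question based only on the following context:\n"
    "{context}\n\n"
    "Question: {question}"
)

def build_prompt(retrieved_docs: list[str], question: str) -> str:
    """Stuff the retrieved documents and the original query into the prompt template."""
    context = "\n\n".join(retrieved_docs)
    return RAG_PROMPT.format(context=context, question=question)
```

The resulting string is what gets passed to the LLM for answer generation.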

Part 5 (Multi Query)

  • Introduces the multi-query approach, generating multiple query variations to capture different perspectives and improve retrieval.
  • Strengths: Improves retrieval coverage and diversity, increasing the likelihood of finding relevant documents.
  • Weaknesses: Generating effective query variations requires careful prompt engineering and may increase computational cost.
  • Practical Applications: Multi-query techniques can be used in search engines, research tools, and content recommendation systems to provide diverse and comprehensive results.
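
The multi-query flow can be sketched with a stubbed rewriter (in practice an LLM call) plus the unique-union merge of per-query retrieval results:

```python
def rewrite_query(question: str) -> list[str]:
    # Stand-in for an LLM call that rephrases the question from several perspectives.
    return [question,
            f"In other words: {question}",
            f"What background is needed to answer: {question}"]

def unique_union(doc_lists: list[list[str]]) -> list[str]:
    """Merge retrieval results across queries, keeping first-seen order, dropping duplicates."""
    seen: set[str] = set()
    merged = []
    for docs in doc_lists:
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

Each rewritten query is run through retrieval separately, and the union feeds the generation step.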

Part 6 (RAG Fusion)

  • Presents RAG Fusion, combining the results of multiple query variations using reciprocal rank fusion.
  • Strengths: Provides a more robust and effective ranking of retrieved documents by leveraging collective relevance signals.
  • Weaknesses: Effectiveness depends on the quality and diversity of generated queries and may be computationally expensive.
  • Practical Applications: RAG Fusion can be applied in search engines, question-answering systems, and content retrieval platforms to improve the relevance and quality of results.
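
Reciprocal rank fusion itself is compact enough to show in full; each document's score is the sum of 1/(k + rank) over every ranked list it appears in (k = 60 is the conventional smoothing constant):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists into one consolidated ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that rank highly across several query variations float to the top, even if no single list put them first.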

Part 7 (Decomposition)

  • Explores query decomposition, breaking down complex questions into smaller, more manageable sub-questions.
  • Discusses sequential and independent sub-question solving approaches.
  • Strengths: Allows for more focused retrieval and generation, enabling the system to handle complex, multi-faceted questions effectively.
  • Weaknesses: Decomposing questions correctly and efficiently can be challenging, requiring advanced natural language understanding and reasoning capabilities.
  • Practical Applications: Query decomposition can be used in complex question-answering systems, research tools, and knowledge management platforms to provide detailed and comprehensive answers to multi-faceted queries.
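
The sequential variant can be sketched with stubbed LLM calls; the point is the data flow, where each sub-answer is fed into the context for the next sub-question:

```python
def decompose(question: str) -> list[str]:
    # Stand-in for an LLM that breaks a complex question into sub-questions.
    return [f"Sub-question {i} of: {question}" for i in (1, 2, 3)]

def answer(sub_question: str, prior_qa: str) -> str:
    # Stand-in for retrieval + generation conditioned on earlier Q&A pairs.
    return f"Answer to [{sub_question}] given [{prior_qa or 'nothing'}]"

def solve_sequentially(question: str) -> str:
    """Answer sub-questions in order, accumulating Q&A pairs as context."""
    qa_pairs = ""
    for sub_q in decompose(question):
        ans = answer(sub_q, qa_pairs)
        qa_pairs += f"Q: {sub_q}\nA: {ans}\n"
    return ans  # the final answer has seen all prior sub-answers
```

The independent variant instead answers each sub-question in isolation and consolidates at the end.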

Part 8 (Step Back)

  • Introduces step-back prompting, generating a more abstract, higher-level question from the original query.
  • Strengths: Helps the system focus on the underlying concepts and principles required to answer the question, improving answer quality and coherence.
  • Weaknesses: Generating effective step-back questions requires a deep understanding of the domain and relationships between concepts.
  • Practical Applications: Step-back prompting can be applied in educational tools, research assistants, and knowledge exploration systems to guide users towards understanding fundamental concepts and principles.
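
A minimal sketch of the step-back flow, with the abstraction step stubbed (in the paper and the video it is a few-shot LLM call), showing how both the normal and step-back retrieval contexts feed the final prompt:

```python
def step_back(question: str) -> str:
    # Stand-in for a few-shot LLM call that produces a more abstract question.
    return f"What are the general principles behind: {question}"

def build_step_back_prompt(question: str, normal_ctx: str, step_back_ctx: str) -> str:
    """Combine context retrieved for the original and the step-back question."""
    return (
        "You are an expert. Use the following contexts to answer.\n"
        f"Normal context: {normal_ctx}\n"
        f"Step-back context: {step_back_ctx}\n"
        f"Question: {question}"
    )
```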

Part 9 (HyDE)

  • Presents Hypothetical Document Embeddings (HyDE), generating hypothetical documents based on the input question to improve retrieval relevance.
  • Strengths: Bridges the gap between the user's question and indexed documents, enabling more effective retrieval and generation.
  • Weaknesses: Generating high-quality hypothetical documents requires a strong language model and domain-specific knowledge, which may be resource-intensive.
  • Practical Applications: HyDE can be used in specialized search engines, research tools, and knowledge bases where the available documents may not directly match user queries.
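
The HyDE flow can be sketched with stand-ins: a stubbed generator for the hypothetical document, and word-overlap (Jaccard) as a crude proxy for embedding similarity:

```python
def hypothetical_document(question: str) -> str:
    # Stand-in for an LLM writing a passage as if it answered the question.
    return f"A passage discussing {question.lower().rstrip('?')} in detail."

def overlap_score(a: str, b: str) -> float:
    """Crude stand-in for embedding similarity: Jaccard overlap of word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def hyde_retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    hypo = hypothetical_document(question)   # embed the hypothetical document...
    ranked = sorted(docs, key=lambda d: overlap_score(hypo, d), reverse=True)
    return ranked[:k]                        # ...and retrieve real docs closest to it
```

The key idea survives the simplification: documents are matched against a document-shaped query, not the raw question.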

Part 10 (Routing)

  • Discusses query routing techniques, including logical and semantic routing, for directing queries to the most relevant data sources.
  • Strengths: Enables efficient utilization of multiple data sources, improving scalability and specificity of the RAG system.
  • Weaknesses: Requires accurate query classification and well-defined mappings between queries and data sources, which may be challenging for complex or ambiguous queries.
  • Practical Applications: Query routing can be applied in enterprise search systems, knowledge management platforms, and multi-domain question-answering systems to efficiently direct queries to the most relevant data sources.
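
Logical routing can be sketched as a classifier over surface cues; the datasource names below are illustrative, and a production router would use an LLM with structured output rather than keyword matching:

```python
def logical_route(question: str) -> str:
    """Toy logical router: pick a datasource from keywords in the question."""
    q = question.lower()
    if "python" in q or "langchain" in q:
        return "python_docs"
    if "javascript" in q or "typescript" in q:
        return "js_docs"
    return "general_index"
```

Semantic routing works analogously but compares the query embedding against embedded route descriptions instead of keywords.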

Part 11 (Query Structuring)

  • Covers the process of converting natural language queries into structured queries compatible with different data sources (e.g., SQL, Cypher).
  • Focuses on query structuring for vector stores using metadata filters.
  • Strengths: Allows RAG systems to leverage the capabilities of structured data sources, enabling more precise and efficient retrieval.
  • Weaknesses: Requires robust natural language understanding and domain-specific knowledge, which may be challenging for complex or ill-defined queries.
  • Practical Applications: Query structuring can be used in natural language interfaces for databases, knowledge graphs, and information retrieval systems to enable users to access structured data using natural language queries.
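
For vector stores with metadata, the structured query can be sketched as a schema the LLM fills in (via function calling) plus a filter step; the `TutorialSearch` fields below are an illustrative schema, not the exact one from the video:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TutorialSearch:
    """Structured form of a natural-language query over video metadata."""
    content_search: str
    min_view_count: Optional[int] = None
    earliest_publish_year: Optional[int] = None

def apply_filters(videos: list[dict], q: TutorialSearch) -> list[dict]:
    """Apply metadata filters before (or alongside) semantic search on content_search."""
    out = []
    for v in videos:
        if q.min_view_count is not None and v["views"] < q.min_view_count:
            continue
        if q.earliest_publish_year is not None and v["year"] < q.earliest_publish_year:
            continue
        out.append(v)
    return out
```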

Part 12 (Multi-Representation Indexing)

  • Introduces multi-representation indexing, storing document summaries in a vector store for efficient retrieval while linking them to full documents for comprehensive understanding.
  • Discusses proposition indexing, using LLMs to generate concise summaries of document chunks optimized for retrieval.
  • Strengths: Enables fast retrieval of relevant summaries while providing access to the full document context, improving the quality and efficiency of the RAG system.
  • Weaknesses: Generating high-quality summaries requires a capable LLM and may introduce additional computational overhead during the indexing process.
  • Practical Applications: Multi-representation indexing can be applied in research platforms, content management systems, and knowledge bases to provide quick access to relevant information while preserving the option to dive deeper into the full context.
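
The summary-to-document linkage can be sketched as two stores: summaries are searched (here with a naive substring match standing in for vector search), and each hit resolves to its full parent document:

```python
def build_index(full_docs: dict[str, str], summarize) -> dict[str, str]:
    """Map each generated summary to its parent doc id; summaries go in the vector store."""
    return {summarize(text): doc_id for doc_id, text in full_docs.items()}

def retrieve_full(query: str, index: dict[str, str], full_docs: dict[str, str]) -> str:
    """Search over summaries, but return the full document for generation."""
    for summary, doc_id in index.items():
        if query.lower() in summary.lower():
            return full_docs[doc_id]
    return ""
```

In LangChain this pattern corresponds to a vector store of summaries paired with a docstore of full documents.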

Part 13 (RAPTOR)

  • Presents RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval), a technique for indexing and retrieving documents at different levels of abstraction.
  • Creates a hierarchical tree structure by recursively clustering and summarizing documents.
  • Strengths: Provides a flexible and efficient way to retrieve information from large document collections, accommodating both high-level and low-level questions.
  • Weaknesses: Building the RAPTOR index can be computationally expensive, and effectiveness may depend on the quality of clustering and summarization algorithms.
  • Practical Applications: RAPTOR can be used in large-scale question-answering systems, research platforms, and knowledge management tools to efficiently retrieve relevant information at various levels of granularity.
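
The recursive cluster-and-summarize loop can be sketched as follows; real RAPTOR clusters by embedding similarity and summarizes with an LLM, whereas this toy version pairs neighbors and concatenates:

```python
def raptor_tree(leaves: list[str], summarize, cluster_size: int = 2) -> list[list[str]]:
    """Build abstraction levels: cluster, summarize each cluster, repeat until one root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        current = levels[-1]
        clusters = [current[i:i + cluster_size]
                    for i in range(0, len(current), cluster_size)]
        levels.append([summarize(c) for c in clusters])
    return levels  # query time can index into any level, from details to themes
```

All levels are typically indexed together, so retrieval can surface either a leaf chunk or a high-level summary depending on the question.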

Part 14 (ColBERT)

  • Introduces ColBERT (Contextualized Late Interaction over BERT), a technique for efficient and effective retrieval using contextualized embeddings.
  • Generates token-level embeddings for queries and documents, enabling fine-grained matching and improved retrieval performance.
  • Strengths: Provides a powerful and flexible retrieval mechanism that captures contextual information at a granular level, improving the relevance of retrieved documents.
  • Weaknesses: Generating token-level embeddings can be computationally expensive and may require significant storage resources for large document collections.
  • Practical Applications: ColBERT can be applied in search engines, question-answering systems, and content retrieval platforms to enhance the relevance and quality of retrieved documents by considering contextual information at a fine-grained level.
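
ColBERT's late-interaction (MaxSim) scoring can be sketched with toy token embeddings: each query token is matched to its single best document token, and the per-token maxima are summed:

```python
def dot(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))

def maxsim(query_vecs: list[list[float]], doc_vecs: list[list[float]]) -> float:
    """ColBERT-style late interaction: sum over query tokens of the best
    similarity that token achieves against any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)
```

Because query and document token embeddings are computed independently, document embeddings can be precomputed and indexed, with only this cheap interaction done at query time.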

Conclusion

The RAG From Scratch video series offers a comprehensive exploration of Retrieval Augmented Generation, covering a wide range of techniques and approaches. Each video focuses on a specific aspect of RAG, providing valuable insights into the strengths, weaknesses, and practical applications of each method.

By understanding the comparative advantages and limitations of these techniques, developers and researchers can make informed decisions when designing and implementing RAG systems for their specific use cases. The series emphasizes the importance of careful design, optimization, and consideration of computational resources in building effective RAG systems.

As the field of RAG continues to evolve, it is crucial to stay updated with the latest advancements and best practices. The RAG From Scratch series serves as an excellent foundation for anyone interested in exploring and applying RAG techniques in various domains, such as question-answering, content generation, and information retrieval.

By leveraging the insights gained from this series and considering the practical applications discussed, organizations can harness the power of RAG to enhance their AI systems, improve user experiences, and unlock the vast potential of combining large language models with external knowledge sources.

Appendix: Playlist Metadata

The raw YouTube Data API playlistItems response (playlist PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x, channel: LangChain), condensed to one entry per video with publish date, video link, and the code/slides/reference links from each description. The response was truncated after the Part 11 entry; later parts are not listed.

  • Part 1 (Overview), published 2024-02-06
    Video: https://www.youtube.com/watch?v=wd7TZ4w1mSw
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_1_to_4.ipynb
    Slides: https://docs.google.com/presentation/d/1C9IaAwHoWcc4RSTqo-pCoN3h0nCgqV2JEYZUJunv_9Q/edit?usp=sharing

  • Part 2 (Indexing), published 2024-02-06
    Video: https://www.youtube.com/watch?v=bjb_EMsTDKI
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_1_to_4.ipynb
    Slides: https://docs.google.com/presentation/d/1MhsCqZs7wTX6P19TFnA9qRSlxH3u-1-0gWkhBiDG9lQ/edit?usp=sharing

  • Part 3 (Retrieval), published 2024-02-06
    Video: https://www.youtube.com/watch?v=LxNVgdIz9sU
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_1_to_4.ipynb
    Slides: https://docs.google.com/presentation/d/124I8jlBRCbb0LAUhdmDwbn4nREqxSxZU1RF_eTGXUGc/edit?usp=sharing

  • Part 4 (Generation), published 2024-02-06
    Video: https://www.youtube.com/watch?v=Vw52xyyFsB8
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_1_to_4.ipynb
    Slides: https://docs.google.com/presentation/d/1eRJwzbdSv71e9Ou9yeqziZrz1UagwX8B1kL4TbL5_Gc/edit?usp=sharing

  • Part 5 (Query Translation: Multi Query), published 2024-02-13
    Video: https://www.youtube.com/watch?v=JChPi0CRnDY
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
    Slides: https://docs.google.com/presentation/d/15pWydIszbQG3Ipur9COfTduutTZm6ULdkkyX-MNry8I/edit?usp=sharing

  • Part 6 (Query Translation: RAG Fusion), published 2024-02-13
    Video: https://www.youtube.com/watch?v=77qELPbNgxA
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
    Slides: https://docs.google.com/presentation/d/1EwykmdVSQqlh6XpGt8APOMYp4q1CZqqeclAx61pUcjI/edit?usp=sharing
    Reference: https://github.com/Raudaschl/rag-fusion

  • Part 7 (Query Translation: Decomposition), published 2024-02-19
    Video: https://www.youtube.com/watch?v=h0OPWlEOank
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
    Slides: https://docs.google.com/presentation/d/1O97KYrsmYEmhpQ6nkvOVAqQYMJvIaZulGFGmz4cuuVE/edit?usp=sharing
    References: https://arxiv.org/abs/2205.10625, https://arxiv.org/pdf/2212.10509

  • Part 8 (Query Translation: Step Back), published 2024-02-13
    Video: https://www.youtube.com/watch?v=xn1jEjRyJ2U
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
    Slides: https://docs.google.com/presentation/d/1L0MRGVDxYA1eLOR0L_6Ze1l2YV8AhN1QKUtmNA-fJlU/edit?usp=sharing
    Reference: https://arxiv.org/pdf/2310.06117.pdf

  • Part 9 (Query Translation: HyDE), published 2024-02-13
    Video: https://www.youtube.com/watch?v=SaDzIVkYqyY
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
    Slides: https://docs.google.com/presentation/d/10MmB_QEiS4m00xdyu-92muY-8jC3CdaMpMXbXjzQXsM/edit?usp=sharing
    Reference: https://arxiv.org/pdf/2212.10496.pdf

  • (Position 9: private video, id S9njv889Q-4, added 2024-03-18)

  • Part 10 (Routing), published 2024-03-18
    Video: https://www.youtube.com/watch?v=pfpIndq7Fi8
    Notebook: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_10_and_11.ipynb
    Slides: https://docs.google.com/presentation/d/1kC6jFj8C_1ZXDYcFaJ8vhJvCYEwxwsVqk2VVeKKuyx4/edit?usp=sharing

  • (Positions 11 and 12: private videos, ids ktRb17mAwYc and Ecp6uRKr9PI, added 2024-03-19 and 2024-03-26)

  • Part 11 (Query Structuring), published 2024-03-27
    Video: https://www.youtube.com/watch?v=kl6NwWYxvbM
    Code: https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_10_and_11.ipynb
    References:
    1. Blog with links to tutorials and templates: https://blog.langchain.dev/query-construction/
    2. Deep dive on graph DBs (c/o @neo4j): https://blog.langchain.dev/enhancing-rag-based-applications-accuracy-by-constructing-and-leveraging-knowledge-graphs/
    3. Query structuring docs: https://python.langchain.com/docs/use_cases/query_analysis/techniques/structuring
    4. Self-query retriever docs: https://python.langchain.com/docs/modules/data_connection/retrievers/self_query
"width": 640,
"height": 480
},
"maxres": {
"url": "https://i.ytimg.com/vi/kl6NwWYxvbM/maxresdefault.jpg",
"width": 1280,
"height": 720
}
},
"channelTitle": "LangChain",
"playlistId": "PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x",
"position": 13,
"resourceId": {
"kind": "youtube#video",
"videoId": "kl6NwWYxvbM"
},
"videoOwnerChannelTitle": "LangChain",
"videoOwnerChannelId": "UCC-lyoTfSrcJzA1ab3APAgw"
}
},
{
"kind": "youtube#playlistItem",
"etag": "GuxTLwalXLBGjHm1coNdkNdz0iI",
"id": "UExmYUlERkVYdWFlMkxYYk8xX1BLeVZKaVEyM1p6dEEweC5EQUE1NTFDRjcwMDg0NEMz",
"snippet": {
"publishedAt": "2024-03-27T23:11:40Z",
"channelId": "UCC-lyoTfSrcJzA1ab3APAgw",
"title": "RAG from scratch: Part 12 (Multi-Representation Indexing)",
"description": "Our RAG From Scratch video series walks through impt RAG concepts in short / focused videos w/ code. This is the 12th video in our series and focuses on some useful tricks for indexing full documents.\n\nProblem: Many RAG approaches focus on splitting documents into chunks and returning some number upon retrieval for the LLM. But chunk size and chunk number can be brittle parameters that many user find difficult to set; both can significantly affect results if they do not contain all context to answer a question.\n\nIdea: Proposition indexing (@tomchen0 et al) is a nice paper that uses an LLM to produce document summaries (\"propositions\") that are optimized for retrieval. We've built on this with two retrievers: (1) multi-vector retriever embeds summaries, but returns full documents to the LLM. (2) parent-doc retriever embeds chunks but returns full documents to the LLM. Idea is to get best of both worlds: use smaller / concise representations (summaries or chunks) to retrieve, but link them to full documents / context for generation.\n\nThe approach is very general, and can be applied to tables or images: in both cases, index a summary but return the raw table or image for reasoning. This gets around challenges w/ directly embedding tables or images (multi-modal embeddings), using a summary as a representation for text-based similarity search.\n\nCode:\nhttps://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_12_to_14.ipynb\n\nReferences:\n1/ Proposition indexing: https://arxiv.org/pdf/2312.06648.pdf\n2/ Multi-vector:\nhttps://python.langchain.com/docs/modules/data_connection/retrievers/multi_vector\n3/ Parent-document:\nhttps://python.langchain.com/docs/modules/data_connection/retrievers/parent_document_retriever\n4/ Blog applying this to tables:\nhttps://blog.langchain.dev/semi-structured-multi-modal-rag/\n5/ Blog applying this to images w/ eval:\nhttps://blog.langchain.dev/multi-modal-rag-template/",
"thumbnails": {
"default": {
"url": "https://i.ytimg.com/vi/gTCU9I6QqCE/default.jpg",
"width": 120,
"height": 90
},
"medium": {
"url": "https://i.ytimg.com/vi/gTCU9I6QqCE/mqdefault.jpg",
"width": 320,
"height": 180
},
"high": {
"url": "https://i.ytimg.com/vi/gTCU9I6QqCE/hqdefault.jpg",
"width": 480,
"height": 360
},
"standard": {
"url": "https://i.ytimg.com/vi/gTCU9I6QqCE/sddefault.jpg",
"width": 640,
"height": 480
},
"maxres": {
"url": "https://i.ytimg.com/vi/gTCU9I6QqCE/maxresdefault.jpg",
"width": 1280,
"height": 720
}
},
"channelTitle": "LangChain",
"playlistId": "PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x",
"position": 14,
"resourceId": {
"kind": "youtube#video",
"videoId": "gTCU9I6QqCE"
},
"videoOwnerChannelTitle": "LangChain",
"videoOwnerChannelId": "UCC-lyoTfSrcJzA1ab3APAgw"
}
},
{
"kind": "youtube#playlistItem",
"etag": "W1eg_-mHu-3_NdzgKSmv4xItQBI",
"id": "UExmYUlERkVYdWFlMkxYYk8xX1BLeVZKaVEyM1p6dEEweC41QTY1Q0UxMTVCODczNThE",
"snippet": {
"publishedAt": "2024-03-28T23:57:42Z",
"channelId": "UCC-lyoTfSrcJzA1ab3APAgw",
"title": "RAG From Scratch: Part 13 (RAPTOR)",
"description": "Our RAG From Scratch video series walks through impt RAG concepts in short / focused videos w/ code. \n \nProblem: \nRAG systems need to handle \"lower-level\" questions that reference specific facts found in a single document or \"higher-level\" questions that distill ideas that span many documents. Handling both types of questions can be a challenge with typical kNN retrieval where only a finite number of doc chunks are retrieved.\n\nIdea: \nRAPTOR (@parthsarthi03 et al) is a paper that addresses this by creating document summaries that capture higher-level concepts. It embeds and clusters documents, and then summarizes each cluster. It does this recursively, producing a tree of summaries with increasingly high-level concepts. The summaries and starting docs are indexed together, giving coverage across user questions. \n\nCode:\nhttps://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_12_to_14.ipynb\nhttps://github.com/langchain-ai/langchain/blob/master/cookbook/RAPTOR.ipynb\n\nReferences:\n1/ Paper: https://arxiv.org/pdf/2401.18059.pdf\n2/ Longer deep dive: https://www.youtube.com/watch?v=jbGchdTL7d0",
"thumbnails": {
"default": {
"url": "https://i.ytimg.com/vi/z_6EeA2LDSw/default.jpg",
"width": 120,
"height": 90
},
"medium": {
"url": "https://i.ytimg.com/vi/z_6EeA2LDSw/mqdefault.jpg",
"width": 320,
"height": 180
},
"high": {
"url": "https://i.ytimg.com/vi/z_6EeA2LDSw/hqdefault.jpg",
"width": 480,
"height": 360
},
"standard": {
"url": "https://i.ytimg.com/vi/z_6EeA2LDSw/sddefault.jpg",
"width": 640,
"height": 480
},
"maxres": {
"url": "https://i.ytimg.com/vi/z_6EeA2LDSw/maxresdefault.jpg",
"width": 1280,
"height": 720
}
},
"channelTitle": "LangChain",
"playlistId": "PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x",
"position": 15,
"resourceId": {
"kind": "youtube#video",
"videoId": "z_6EeA2LDSw"
},
"videoOwnerChannelTitle": "LangChain",
"videoOwnerChannelId": "UCC-lyoTfSrcJzA1ab3APAgw"
}
},
{
"kind": "youtube#playlistItem",
"etag": "UMgauGMmYJbHeIOaoBOjZFabMbM",
"id": "UExmYUlERkVYdWFlMkxYYk8xX1BLeVZKaVEyM1p6dEEweC4yMUQyQTQzMjRDNzMyQTMy",
"snippet": {
"publishedAt": "2024-03-29T00:07:44Z",
"channelId": "UCC-lyoTfSrcJzA1ab3APAgw",
"title": "RAG From Scratch: Part 14 (ColBERT)",
"description": "Our RAG From Scratch video series walks through impt RAG concepts in short / focused videos w/ code. This is the 14th video in our series and focuses on indexing with ColBERT for fine-grained similarity search.\n\nProblem: Embedding models compress text into fixed-length (vector) representations that capture the semantic content of the document. This compression is very useful for efficient search / retrieval, but puts a heavy burden on that single vector representation to capture all the semantic nuance / detail of the doc. In some cases, irrelevant / redundant content can dilute the semantic usefulness of the embedding.\n\nIdea: ColBERT (@lateinteraction & @matei_zaharia) is a neat approach to address this with higher granularity embeddings: (1) produce a contextually influenced embedding for each token in the document and query. (2) score similarity between each query token and all document tokens. (3) take the max. (4) do this for all query tokens. (5) take the sum of the max scores (in step 3) for all query tokens to get the similarity score. \nThis results in a much more granular token-wise similarity assessment between document and query, and has shown strong performance. \n\nCode:\nhttps://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_12_to_14.ipynb\n\nReferences:\n1/ Paper:\nhttps://arxiv.org/abs/2004.12832\n\n2/ Nice review from @DataStax: \nhttps://hackernoon.com/how-colbert-helps-developers-overcome-the-limits-of-rag\n\n3/ Nice post from @simonw:\nhttps://til.simonwillison.net/llms/colbert-ragatouille\n\n4/ColBERT repo:\nhttps://github.com/stanford-futuredata/ColBERT\n\n5/ RAGatouille to support RAG w/ ColBERT:\nhttps://github.com/bclavie/RAGatouille",
"thumbnails": {
"default": {
"url": "https://i.ytimg.com/vi/cN6S0Ehm7_8/default.jpg",
"width": 120,
"height": 90
},
"medium": {
"url": "https://i.ytimg.com/vi/cN6S0Ehm7_8/mqdefault.jpg",
"width": 320,
"height": 180
},
"high": {
"url": "https://i.ytimg.com/vi/cN6S0Ehm7_8/hqdefault.jpg",
"width": 480,
"height": 360
},
"standard": {
"url": "https://i.ytimg.com/vi/cN6S0Ehm7_8/sddefault.jpg",
"width": 640,
"height": 480
},
"maxres": {
"url": "https://i.ytimg.com/vi/cN6S0Ehm7_8/maxresdefault.jpg",
"width": 1280,
"height": 720
}
},
"channelTitle": "LangChain",
"playlistId": "PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x",
"position": 16,
"resourceId": {
"kind": "youtube#video",
"videoId": "cN6S0Ehm7_8"
},
"videoOwnerChannelTitle": "LangChain",
"videoOwnerChannelId": "UCC-lyoTfSrcJzA1ab3APAgw"
}
}
],
"pageInfo": {
"totalResults": 17,
"resultsPerPage": 20
}
}