2023-12-12 Preparing a workshop on a topic I don't know much about: I have the task of analysing how competing orgs have approached decentralization. What they promised, what they did, what they're looking to do.
Looking at zkSync at the moment. Planning to pull articles from the internet, figure out vector DBs/indexes in LangChain, and use that to query the LLM on its newly learned information.
The keyword for this LLM work is retrieval.
Probably use this to augment the LLM's knowledge base:
- Langchain tutorial: https://python.langchain.com/docs/use_cases/question_answering/
- Cloudflare RAG tutorial: https://developers.cloudflare.com/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai/
- OpenAI Cookbook, Evaluate RAG with LlamaIndex: https://cookbook.openai.com/examples/evaluation/evaluate_rag_with_llamaindex ❤️
In RAG, your data is loaded and prepared for queries, or “indexed”. User queries act on the index, which filters your data down to the most relevant context. This context and your query then go to the LLM along with a prompt, and the LLM provides a response.
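The flow above (index → retrieve → prompt → LLM) can be sketched in plain Python. This is a toy illustration, not LangChain's API: keyword overlap stands in for real embedding similarity, and the prompt string is just one I made up.

```python
# Toy RAG flow: index docs, retrieve relevant context, build the LLM prompt.

def score(query: str, doc: str) -> float:
    """Fraction of query words that appear in the document
    (a stand-in for real embedding similarity)."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Filter the indexed data down to the k most relevant chunks."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved context and the user query into one LLM prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

docs = [
    "zkSync is a ZK rollup scaling Ethereum.",
    "Decentralization roadmaps vary between orgs.",
    "Bananas are rich in potassium.",
]
query = "How does zkSync scale Ethereum?"
context = retrieve(query, docs)
prompt = build_prompt(query, context)  # this string would go to the LLM
```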
While it's beneficial to examine individual queries and responses at the start, this approach may become impractical as the volume of edge cases and failures increases. Instead, it may be more effective to establish a suite of summary metrics or automated evaluations. These tools can provide insights into overall system performance and indicate specific areas that may require closer scrutiny.
In a RAG system, evaluation focuses on two critical aspects:
- Retrieval Evaluation: assesses the accuracy and relevance of the information retrieved by the system.
- Response Evaluation: measures the quality and appropriateness of the responses generated by the system based on the retrieved information.
See https://docs.llamaindex.ai/en/stable/module_guides/evaluating/root.html
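For the retrieval side, two common summary metrics are hit rate (did the expected source show up in the top-k results?) and MRR (how high did it rank?). A hand-rolled sketch of both, not LlamaIndex's evaluator API:

```python
# Toy retrieval-evaluation metrics over a batch of queries.

def hit_rate(results: list[list[str]], expected: list[str]) -> float:
    """Fraction of queries whose expected doc appears in the retrieved list."""
    hits = sum(exp in got for got, exp in zip(results, expected))
    return hits / len(expected)

def mrr(results: list[list[str]], expected: list[str]) -> float:
    """Mean reciprocal rank: 1/rank of the expected doc, 0 if missing."""
    total = 0.0
    for got, exp in zip(results, expected):
        if exp in got:
            total += 1 / (got.index(exp) + 1)
    return total / len(expected)

# Two queries: the right doc came back at rank 1, then at rank 2.
retrieved = [["doc_a", "doc_b"], ["doc_c", "doc_d"]]
expected = ["doc_a", "doc_d"]
```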
- OpenAI Assistants API: https://python.langchain.com/docs/modules/agents/agent_types/openai_assistants
- OpenAI Assistants, Knowledge Retrieval: https://platform.openai.com/docs/assistants/tools/knowledge-retrieval
- OpenAI Embeddings tutorial: https://platform.openai.com/docs/tutorials/web-qa-embeddings
- Visualizing Embeddings in 2d: https://platform.openai.com/docs/guides/embeddings/use-cases ❤️❤️
- t-SNE used to reduce dimensionality: https://www.datacamp.com/tutorial/introduction-t-sne
- Which distance function should I use? The OpenAI docs recommend cosine similarity; the choice of distance function typically doesn't matter much.
- How can I retrieve the K nearest embedding vectors quickly? For searching over many vectors, the OpenAI docs recommend a vector database (examples of working with vector databases and the OpenAI API are in their Cookbook on GitHub). Options include Chroma, an open-source embeddings store.
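The two FAQ answers above can be sketched in plain Python: cosine similarity as the distance function, plus a brute-force top-K scan. This is exactly the linear scan that a vector DB like Chroma replaces with an approximate index at scale; it's a hand-rolled illustration, not any library's API.

```python
import heapq
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the norms. OpenAI embeddings
    are normalized to length 1, so for them this reduces to a dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], vectors: list[list[float]], k: int = 2) -> list[int]:
    """Brute-force K-nearest scan: fine for small collections,
    too slow over many vectors (hence vector databases)."""
    sims = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    return [i for _, i in heapq.nlargest(k, sims)]

vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
nearest = top_k([1.0, 0.0], vectors, k=2)  # indices 0 and 1
```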
- Embedding: https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html
- List of supported embeddings here
- Local Embeddings with Hugging face: https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html
- Indexing: https://docs.llamaindex.ai/en/stable/understanding/indexing/indexing.html
- OpenAI prompting strategies: https://platform.openai.com/docs/guides/prompt-engineering/six-strategies-for-getting-better-results
- https://cookbook.openai.com/ (vector databases, clustering, powerpoint presentations, ... very cool stuff)
- https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/putting_it_all_together.html
- URL Document loader (unstructured, selenium, playwright) https://python.langchain.com/docs/integrations/document_loaders/url
- RAG (retrieval-augmented generation) ❤️https://python.langchain.com/docs/use_cases/question_answering/
- OpenAI/Embeddings docs: https://platform.openai.com/docs/guides/embeddings
- $0 Embeddings (OpenAI vs. free & open source) https://www.youtube.com/watch?v=QdDoFfkVkcw
- RAG With The Right Embedding: https://mlnotes.substack.com/p/rag-with-the-right-embedding
- Langchain Hub: https://smith.langchain.com/