Arun Mj MFDev-Arun

MFDev-Arun / ai-strategies.md
Last active April 28, 2025 16:03

Here are key strategies to improve RAG systems when dealing with large document collections:

1. Optimize Document Chunking

Proper document chunking is crucial for effective retrieval:

  • Use semantic chunking instead of fixed-size chunking to preserve context

    We use fixed-size chunking, but it won't cut through the middle of a sentence. It uses NLP to identify sentence boundaries.

  • Experiment with overlap between chunks (typically 10-20%)