@alessandrofuda
Last active April 7, 2026 09:40
A short presentation of my latest AI RAG project

LongTerMemory: Technical Overview

LongTerMemory is an AI-powered SaaS platform for exam preparation and long-term knowledge retention. It combines Retrieval-Augmented Generation (RAG) with spaced repetition scheduling to help users study smarter from their own materials.


What It Does

  1. Upload study materials: PDFs, Word documents, or web links
  2. Auto-generate Q&A pairs: the LLM extracts key concepts and formulates questions from the content
  3. Spaced repetition scheduling: reviews are scheduled with the SM-2 algorithm to maximize long-term retention
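The scheduling step above names SM-2, so here is a minimal sketch of one SM-2 review step. The algorithm itself is public; the function name and state representation are my own, not the project's actual code.

```python
def sm2_review(quality: int, reps: int, interval: int, ef: float):
    """One SM-2 step. `quality` is the recall grade (0-5).
    Returns the next (repetitions, interval_days, easiness_factor)."""
    if quality < 3:
        # Failed recall: restart the repetition streak, review again tomorrow
        return 0, 1, ef
    # Update the easiness factor, clamped at the SM-2 floor of 1.3
    ef = max(1.3, ef + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    if reps == 0:
        interval = 1        # first successful review: 1 day
    elif reps == 1:
        interval = 6        # second: 6 days
    else:
        interval = round(interval * ef)  # then grow geometrically
    return reps + 1, interval, ef
```

Each review feeds the returned state back in, so intervals stretch as recall stays strong and collapse back to one day after a lapse.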

Architecture

React SPA (Vite 7)
       │
       ▼
 Laravel 12 REST API  ───────────────────────────────────┐
       │                                                 │
       ├── MySQL (users, projects, Q&A pairs, schedules) │
       ├── Redis (cache, sessions, Celery broker)        │
       └── MinIO (S3-compatible document storage)        │
                                                         │
 Python FastAPI + Celery ◄───────────────────────────────┘
       │
       ├── LlamaIndex (RAG orchestration)
       ├── OpenAI API (embeddings + LLM)
       └── Qdrant (vector database)

Service communication: Laravel triggers async Celery tasks for document processing. The RAG service notifies Laravel via a push callback (POST /api/job-finished) when Q&A generation completes, so no polling is needed.
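The push-callback step can be sketched as follows. The endpoint path comes from the text above; the payload fields and the internal hostname are illustrative assumptions, not the project's actual code.

```python
import json
from urllib import request

LARAVEL_URL = "http://laravel-api:8080"  # assumed internal service hostname

def build_callback_request(project_id: int, status: str, qa_count: int) -> request.Request:
    """Build the POST /api/job-finished call that tells Laravel the job is done."""
    payload = json.dumps({
        "project_id": project_id,
        "status": status,      # e.g. "completed" or "failed" (assumed values)
        "qa_count": qa_count,
    }).encode()
    return request.Request(
        f"{LARAVEL_URL}/api/job-finished",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def notify_job_finished(project_id: int, status: str, qa_count: int) -> None:
    """Called at the end of the Celery task: push the result to Laravel
    instead of having Laravel poll for it."""
    request.urlopen(build_callback_request(project_id, status, qa_count), timeout=10)
```

The inversion is the point: the slow side (the RAG worker) announces completion, so the fast side (Laravel) never burns requests asking "done yet?".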


Stack

Layer            Technology
Frontend         React 19, TypeScript, Vite 7, Tailwind CSS 4, shadcn/ui
Backend API      Laravel 12 (PHP 8.2), Sanctum auth, Cashier (Stripe)
RAG Service      Python 3.11, FastAPI, Celery 5, LlamaIndex
Vector DB        Qdrant
Relational DB    MySQL 8
Object Storage   MinIO (S3-compatible)
Cache / Broker   Redis
Email            Resend
Payments         Stripe
Infrastructure   Docker Compose

Key Technical Decisions

Semantic Chunking with Dynamic Sizing

Documents are split using LlamaIndex's semantic splitter with parameters that adapt to content length:

  • Short content (<10k tokens): smaller chunks (1024 tokens) for precision
  • Long content (≥10k tokens): larger chunks (2048 tokens) to reduce noise and cost (~75% fewer chunks for book-length material)
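The size selection above reduces to a threshold rule; a minimal sketch (function name and threshold default are mine, with the threshold taken from the bullet list):

```python
def pick_chunk_size(token_count: int, threshold: int = 10_000) -> int:
    """Finer chunks for short content (precision), coarser chunks for long
    content (less noise, fewer embedding calls). The returned size would be
    handed to the semantic splitter as its target chunk size."""
    return 1024 if token_count < threshold else 2048
```

Doubling the chunk size roughly halves the chunk count before semantic merging even starts, which is where the cost savings on book-length material come from.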

Async Job Pipeline

Document processing runs entirely in the background via Celery workers. A Redis index (project_job:{project_id}) prevents duplicate jobs per project. The RAG service returns HTTP 409 if a job is already in progress.
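The duplicate-job guard described above can be sketched with an atomic SET NX on the per-project key. `client` is a redis-py-style client (e.g. `redis.Redis()`), passed in explicitly; the TTL value is an assumption.

```python
def try_start_job(client, project_id: int, job_id: str, ttl_s: int = 3600) -> bool:
    """Atomically claim the project_job:{project_id} slot.

    SET with nx=True only succeeds when the key does not yet exist, so two
    concurrent requests cannot both start a job; the loser gets False, which
    the API maps to HTTP 409. The TTL (assumed here) keeps a crashed worker
    from blocking the project forever."""
    return bool(client.set(f"project_job:{project_id}", job_id, nx=True, ex=ttl_s))

def finish_job(client, project_id: int) -> None:
    """Release the slot when the Celery task completes or fails."""
    client.delete(f"project_job:{project_id}")
```

Doing the check-and-set in a single Redis command is what makes this race-free; a separate GET-then-SET would let two requests slip through.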

Authentication

Magic link flow: user enters email → receives a one-time code → exchanges it for a Sanctum API token stored in localStorage. No passwords.

Spaced Repetition Notifications

An hourly Artisan command checks for users in the 8 AM timezone window, queries due/new study items in bulk (2 queries, no N+1), and dispatches queued email notifications with rate-limiting delays (Resend free tier: 2 req/s).
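The timezone-window check and the rate-limiting delay above can be sketched as two small functions (function names and the user-tuple shape are mine):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def users_in_window(users: list[tuple[int, str]],
                    now_utc: datetime, hour: int = 8) -> list[int]:
    """Given (user_id, IANA timezone) pairs, keep the users whose local
    wall-clock hour equals `hour` right now (the 8 AM window above)."""
    return [uid for uid, tz in users
            if now_utc.astimezone(ZoneInfo(tz)).hour == hour]

def send_delay_seconds(index: int, rate_per_s: int = 2) -> int:
    """Queue delay for the i-th email so sends stay under the provider
    rate limit (the text above cites Resend's free tier at 2 req/s)."""
    return index // rate_per_s
```

Running the command hourly means each timezone is swept exactly once per day as its local clock passes 8 AM.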

Commercial Plans & Trials

Three-tier plan system (Free/Basic/Pro) with a 14-day trial. Limits are enforced at the API layer via middleware and request validation: max projects, document sizes, and upload counts per plan.
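A middleware-style limit check might look like the sketch below. The limit numbers are purely illustrative assumptions; the real values live in the Laravel middleware.

```python
# Hypothetical plan table: the tier names come from the text above,
# the numeric limits are invented for illustration.
PLAN_LIMITS = {
    "free":  {"max_projects": 1,  "max_doc_mb": 5,   "max_uploads": 3},
    "basic": {"max_projects": 5,  "max_doc_mb": 25,  "max_uploads": 25},
    "pro":   {"max_projects": 25, "max_doc_mb": 100, "max_uploads": 100},
}

def can_create_project(plan: str, current_projects: int) -> bool:
    """Reject at the API layer, before any document processing starts."""
    return current_projects < PLAN_LIMITS[plan]["max_projects"]

def can_upload(plan: str, doc_size_mb: float, uploads_so_far: int) -> bool:
    """Check both the per-document size cap and the per-plan upload count."""
    limits = PLAN_LIMITS[plan]
    return doc_size_mb <= limits["max_doc_mb"] and uploads_so_far < limits["max_uploads"]
```

Enforcing limits before dispatching work keeps over-quota requests from ever reaching the (expensive) RAG pipeline.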


Test Coverage

  • Laravel: 51 PHPUnit tests (auth, document upload, study plans, notifications, commands)
  • RAG service: 340 pytest tests (document processing, embeddings, Celery tasks, job storage, API endpoints)

Infrastructure Notes

  • Development: all services exposed locally via Docker Compose
  • Production: only the Laravel API (port 8080) and Flower monitoring UI (port 5555, basic auth) are externally exposed; all other services are internal

Stack: Laravel · Python · React · OpenAI · Qdrant · Redis · MinIO · Stripe · Docker
