@mail2mhossain
mail2mhossain / Comparison Table: DINOv2 vs CLIP vs BLIP vs SigLIP.md
Created September 4, 2025 14:35
Comparison Table: DINOv2 vs CLIP vs BLIP vs SigLIP

| Model | Type | Strengths | Weaknesses | Best Use Case |
|---|---|---|---|---|
| DINOv2 (Meta, 2023) | Vision-only self-supervised | - Excellent pure visual embeddings (objects, scenes)<br>- Strong on fine-grained features<br>- Great for clustering & recognition | - No text alignment (needs external captioner + text embedder)<br>- Very heavy (ViT-g/14 = 1B+ params) | Use when you need the best visual understanding (object-level indexing) and can afford GPU compute. |
| CLIP (OpenAI, 2021) | Vision–Language joint model | - Strong cross-modal embeddings (text ↔ image)<br>- Efficient for retrieval<br>- Smaller versions available (ViT-B/32, ViT-L/14)<br>- Widely adopted, easy to use | - Slightly weaker than DINOv2 in pure visual features<br>- Does not generate captions itself (unlike BLIP) | Best all-rounder for retrieval. Works well when you want efficient, deployable text + image search. |
| BLIP / BLIP-2 (Salesforce, 2022) | Captioning + Vision–Language | - Excellent caption generation<br>- Good at VQA and multimodal tasks<br>- Captions can | | |

mail2mhossain / Summary Comparison Table.md
Created July 4, 2025 14:54
Summary Comparison Table

| Feature | Query Transformation | Multi-Query Generation | Query Decomposition |
|---|---|---|---|
| What it does | Rewrites a single query | Creates multiple query variants | Splits query into sub-questions |
| Goal | Improve phrasing for better match | Broaden retrieval coverage | Handle complexity, improve accuracy |
| Input | One original query | One original query | One multi-part or complex query |
| Output | One transformed query | Several alternative queries | Several focused sub-questions |
| When to use | Ambiguous or informal queries | Low recall or vocabulary mismatch | Multi-part or reasoning-heavy queries |

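
The three strategies differ mainly in the shape of their input and output. A minimal sketch, assuming a hypothetical `llm` callable that takes a prompt string and returns text (the prompts and function names here are illustrative, not from any particular library):

```python
from typing import Callable, List

# Hypothetical stand-in for any chat/completion client: prompt in, text out.
LLM = Callable[[str], str]


def _non_empty_lines(text: str) -> List[str]:
    """Split model output into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def transform_query(llm: LLM, query: str) -> str:
    """Query transformation: one query in, one rewritten query out."""
    prompt = f"Rewrite this search query so it is clear and specific:\n{query}"
    return llm(prompt).strip()


def generate_multi_queries(llm: LLM, query: str, n: int = 3) -> List[str]:
    """Multi-query generation: one query in, up to n variants out."""
    prompt = f"Write {n} alternative phrasings of this search query, one per line:\n{query}"
    return _non_empty_lines(llm(prompt))[:n]


def decompose_query(llm: LLM, query: str) -> List[str]:
    """Query decomposition: one complex query in, focused sub-questions out."""
    prompt = f"Split this question into focused sub-questions, one per line:\n{query}"
    return _non_empty_lines(llm(prompt))
```

With multi-query generation and decomposition, each returned string is typically sent to the retriever separately and the results are merged.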

| Pitfall | Issue | Solution |
|---|---|---|
| Embedding chunks too large | Silent truncation reduces accuracy. | Limit chunks to the embedding model's maximum length. |
| Zero overlap | Sentences split across chunks. | Use recommended overlaps (10–15%). |
| Too many parent chunks per query | Introduces noise, reducing LLM accuracy. | Keep chunks-per-query (k) around 4–8. |
| No reserved context space for the LLM’s response | Risk of truncation or rejected prompts. | Always reserve ~30% of context space. |

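
These four pitfalls can be checked mechanically before indexing. A rough sketch, assuming all sizes are measured in tokens; `ChunkingConfig` and `check_pitfalls` are hypothetical names introduced only for illustration:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkingConfig:
    context_window: int   # LLM context window, in tokens
    embedding_limit: int  # embedding model's maximum input length, in tokens
    child_chunk: int      # size of the chunks that get embedded
    child_overlap: int    # overlap between consecutive child chunks
    parent_chunk: int     # size of the chunks sent to the LLM
    k: int                # parent chunks per query


def check_pitfalls(cfg: ChunkingConfig) -> List[str]:
    """Return one warning per pitfall from the table that the config hits."""
    warnings = []
    if cfg.child_chunk > cfg.embedding_limit:
        warnings.append("embedded chunk exceeds the embedding limit (silent truncation)")
    if cfg.child_overlap == 0:
        warnings.append("zero overlap (sentences may be split across chunks)")
    if not 4 <= cfg.k <= 8:
        warnings.append("k outside the 4-8 range (too much noise or too little context)")
    # Reserve ~30% of the window, so retrieved chunks may use at most 70%.
    if cfg.k * cfg.parent_chunk > int(cfg.context_window * 0.70):
        warnings.append("retrieved chunks leave less than ~30% of the context free")
    return warnings


cfg = ChunkingConfig(context_window=4096, embedding_limit=512,
                     child_chunk=400, child_overlap=60, parent_chunk=475, k=6)
print(check_pitfalls(cfg))  # prints [] because this config passes all four checks
```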

| Child | Token range (start → end) | Effective length |
|---|---|---|
| 1 | 0 → 399 | 400 |
| 2 | 339 → 474 | 136 |

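
The ranges in this table are what a sliding window produces. A small sketch that reproduces them, assuming a 475-token parent, 400-token children, and a 61-token overlap (values inferred from the ranges in the table, not stated in the source); `child_ranges` is an illustrative name:

```python
from typing import List, Tuple


def child_ranges(parent_len: int, child_size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) token ranges for overlapping child chunks."""
    if not 0 <= overlap < child_size:
        raise ValueError("overlap must be >= 0 and smaller than child_size")
    stride = child_size - overlap  # how far each new child starts after the previous one
    ranges = []
    start = 0
    while start < parent_len:
        end = min(start + child_size, parent_len) - 1
        ranges.append((start, end))
        if end == parent_len - 1:  # reached the end of the parent
            break
        start += stride
    return ranges


print(child_ranges(475, 400, 61))  # [(0, 399), (339, 474)]
```

The last child is shorter than the others because it only covers what remains of the parent.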

| Level | Recommended Overlap | Quick Formula |
|---|---|---|
| Parent | 8–12% | `round(parent_chunk × 0.10)` |
| Child | 12–18% | `round(child_chunk × 0.15)` |

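
As a worked example of the quick formulas, assuming a 1,500-token parent chunk and a 400-token child chunk (sizes picked from the typical ranges, not prescribed values):

```python
# Midpoint ratios from the table: 10% for parents, 15% for children.
OVERLAP_RATIO = {"parent": 0.10, "child": 0.15}


def recommended_overlap(chunk_tokens: int, level: str) -> int:
    """Overlap in tokens for a chunk of the given size and level."""
    return round(chunk_tokens * OVERLAP_RATIO[level])


print(recommended_overlap(1500, "parent"))  # 150
print(recommended_overlap(400, "child"))    # 60
```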

| Term | How to pick it | Why |
|---|---|---|
| usable_window | `context_window × 0.70` (reserve 30% for system prompt, user question, and the model’s answer) | Prevents context overflow and truncation. |
| k | Your chosen chunks-per-query (e.g. 6) | Spreads the available space evenly. |

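
Putting the two terms together gives a per-chunk token budget. A sketch, assuming the budget is simply `usable_window` divided evenly by `k` (the division is an interpretation of "spreads the available space evenly"), with a 4,096-token window and k = 6 as example inputs:

```python
from typing import Tuple


def parent_chunk_budget(context_window: int, k: int,
                        usable_fraction: float = 0.70) -> Tuple[int, int]:
    """Return (usable_window, max tokens per parent chunk)."""
    usable_window = int(context_window * usable_fraction)  # keep ~30% in reserve
    return usable_window, usable_window // k


usable, per_chunk = parent_chunk_budget(context_window=4096, k=6)
print(usable, per_chunk)  # 2867 477
```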

| Parameter | Example Value | Source of Information |
|---|---|---|
| LLM context window | phi-3-mini-4k (4,096 tokens) | Model configuration (`max_position_embeddings`) |
| Embedding-model limit | BAAI/bge-large-en-v1.5 (512 tokens) | Model card (`max_seq_length`) |
| Chunks-per-query (k) | Typically 6 parent chunks per user query | Pipeline design (commonly between 4–8) |


| Level | Purpose in the Pipeline | Typical Size | Storage Location |
|---|---|---|---|
| Parent chunk | The passage sent directly to the LLM prompt. | ≈ 1,000–2,000 tokens | Stored in NoSQL (MongoDB) |
| Child chunk | Smaller slices used for embedding and search. | ≈ 300–500 tokens | Stored in vector DB |

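
At query time the two levels work together: search hits on child chunks are mapped back to their parents, and only the parents go into the prompt. A rough sketch, with plain dictionaries standing in for the vector DB results and the MongoDB collection (all names here are hypothetical):

```python
from typing import Dict, List, Tuple


def retrieve_parents(child_hits: List[Tuple[str, float]],
                     child_to_parent: Dict[str, str],
                     parent_store: Dict[str, str],
                     k: int = 6) -> List[str]:
    """Map ranked child hits to at most k distinct parent passages.

    child_hits: (child_id, score) pairs sorted best-first, as a vector
    search would return them.
    """
    parent_ids: List[str] = []
    for child_id, _score in child_hits:
        parent_id = child_to_parent[child_id]
        if parent_id not in parent_ids:  # several children can share one parent
            parent_ids.append(parent_id)
        if len(parent_ids) == k:
            break
    return [parent_store[pid] for pid in parent_ids]


hits = [("c1", 0.91), ("c2", 0.88), ("c3", 0.75)]
mapping = {"c1": "p1", "c2": "p1", "c3": "p2"}
parents = {"p1": "Parent passage one", "p2": "Parent passage two"}
print(retrieve_parents(hits, mapping, parents, k=2))
# ['Parent passage one', 'Parent passage two']
```

Deduplicating by parent keeps two hits from the same passage from using up two of the k slots.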

| Tool / Library | What It Is | How It Helps in This Architecture |
|---|---|---|
| MediatR | Lightweight in-process mediator (implements the Mediator pattern). | Dispatches Commands, Queries, and Domain Events without tight coupling. Keeps the Application layer free of direct service references, which suits CQRS and Clean Architecture. |
| MassTransit | | |

mail2mhossain / Deposit Funds Example.md
Created June 19, 2025 16:57
Deposit Funds Example

| Architectural Part | Where It Appears | Key Insight |
|---|---|---|
| DDD | `Account.Deposit` + `FundsDeposited` | Invariant & business language live in the aggregate. |
| Clean Architecture | `DepositFundsHandler` | Application layer orchestrates use-case; no framework coupling. |
| Event-Driven (EDA) | `PublishAsync` + projector | Domain event travels via MediatR/outbox → Kafka → projector. |
| CQRS | Write side = `Account` aggregate; read side = `BalanceProjectionDb` | Fast, denormalised reads with eventual consistency. |

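
The names in this table suggest a .NET codebase, which is not shown here. The following Python sketch only illustrates how the four parts could fit together. The positive-amount invariant, the in-memory stores, and the direct `publish` callback (standing in for MediatR/outbox → Kafka) are all assumptions made for illustration:

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FundsDeposited:
    """Domain event raised by the aggregate."""
    account_id: str
    amount: Decimal


@dataclass
class Account:
    """DDD aggregate: the invariant lives next to the state it protects."""
    account_id: str
    balance: Decimal = Decimal("0")
    pending_events: List[FundsDeposited] = field(default_factory=list)

    def deposit(self, amount: Decimal) -> None:
        if amount <= 0:  # assumed invariant
            raise ValueError("Deposit amount must be positive")
        self.balance += amount
        self.pending_events.append(FundsDeposited(self.account_id, amount))


class DepositFundsHandler:
    """Application layer: orchestrates the use case, holds no business rules."""

    def __init__(self, accounts: Dict[str, Account],
                 publish: Callable[[FundsDeposited], None]) -> None:
        self._accounts = accounts  # stand-in for a repository
        self._publish = publish    # stand-in for the event bus

    def handle(self, account_id: str, amount: Decimal) -> None:
        account = self._accounts[account_id]
        account.deposit(amount)
        while account.pending_events:
            self._publish(account.pending_events.pop(0))


class BalanceProjector:
    """Read side: keeps a denormalised balance per account."""

    def __init__(self) -> None:
        self.balances: Dict[str, Decimal] = {}

    def on_funds_deposited(self, event: FundsDeposited) -> None:
        current = self.balances.get(event.account_id, Decimal("0"))
        self.balances[event.account_id] = current + event.amount
```

In this sketch the projector is called synchronously, so the read model is always up to date. With a real broker between the handler and the projector, the read side lags briefly, which is the eventual consistency the CQRS row refers to.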