LeTechLead

Qwen 3.6 27B Q4 vs Q6 Benchmark Results

Variants: Qwen3.6-27B-UD-Q4_K_XL.gguf and Qwen3.6-27B-UD-Q6_K_XL.gguf
Tool: llama-benchy 0.3.5
Q4 run date: 2026-04-24 03:07:48
Q6 run date: 2026-04-24 03:20:34
Mode: latency / api

At a Glance

Qwen 3.6 27B AWQ INT4 Benchmark Results

Model: cyankiwi-qwen3.6-27B-AWQ-BF16-INT4
Tool: llama-benchy 0.3.5
Run date: 2026-04-22 23:52:59
Mode: latency / generation

At a Glance

  • Prefill throughput (pp*) trends down as context depth increases (d4096 -> d16384 -> d32768).
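
One way to check a trend like this against your own results is sketched below. It assumes test labels of the form `pp2048 @ d4096` (the label format and the helper names `depth_of` and `trends_down` are assumptions made for illustration, not part of llama-benchy), and the numbers in the example are placeholders, not measurements from these runs.

```python
import re


def depth_of(label: str) -> int:
    """Extract the context depth from a label such as 'pp2048 @ d4096' (assumed format)."""
    match = re.search(r"\bd(\d+)\b", label)
    if match is None:
        raise ValueError(f"no depth suffix in {label!r}")
    return int(match.group(1))


def trends_down(throughput_by_label: dict[str, float]) -> bool:
    """Return True if throughput strictly decreases as context depth increases."""
    ordered = sorted(throughput_by_label.items(), key=lambda kv: depth_of(kv[0]))
    values = [tokens_per_second for _, tokens_per_second in ordered]
    return all(a > b for a, b in zip(values, values[1:]))


# Placeholder values for illustration only; NOT results from the runs above.
example = {
    "pp2048 @ d16384": 800.0,
    "pp2048 @ d4096": 1000.0,
    "pp2048 @ d32768": 600.0,
}
print(trends_down(example))
```

The check sorts by depth rather than by insertion order, so rows can be pasted in any order.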