Qwen3 Coder Next + DFlash on DGX Spark: 108 tok/s with a two-line vLLM patch
I got DFlash speculative decoding working with Qwen3 Coder Next on my DGX Spark (GB10, 128 GB unified memory). Result: 88-108 tok/s depending on task complexity, up from 62 tok/s without DFlash.
Tool calling works too (--enable-auto-tool-choice --tool-call-parser qwen3_coder), tested at 89 tok/s with DFlash active. Useful if you're running coding agents that need function calls.
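For context, a launch command of roughly this shape is what such a setup looks like. The two tool-calling flags are the ones mentioned above; everything else is a sketch that assumes DFlash is wired into vLLM's generic --speculative-config mechanism, so the method name, model identifiers, and token count below are illustrative assumptions, not my exact configuration:

```shell
# Hedged sketch of a vLLM launch with tool calling and speculative decoding.
# The two tool-calling flags are confirmed by the text; the model name,
# speculative method, draft model, and token count are illustrative only.
vllm serve Qwen/Qwen3-Coder-Next \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --speculative-config '{"method": "dflash", "model": "<draft-model>", "num_speculative_tokens": 4}'
```

With a config like this, OpenAI-compatible clients can send tools in their chat requests and the server parses the model's function calls, while the draft model proposes tokens that the target model verifies in parallel.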
The fix turned out to be surprisingly simple: two lines of Python.