DeepSeek V4 Flash optimized framework and model variants for DGX Spark
Mixed NVFP4 serving of DeepSeek V4 Flash on DGX Spark (GB10): a fork of antirez/ds4 with REAP expert pruning, NVFP4 quantization, FP8-packed KV cache, and managed-memory serving. Repository: sleepyeldrazi/ds4-...