Two Qwen3 models on one DGX Spark: the residency math
The residency math, the gpu_memory_utilization trap, and what to verify first. Notes from my experiments with local LLMs.
Read the original article