DGX Spark UMA can trick you
bartusiak.ai · 1d

TL;DR

If you’re working with large language models (LLMs) on systems like the DGX Spark and hitting “out of memory” errors despite seemingly ample RAM (e.g., 128GB for a 7B-parameter model), the culprit may be your operating system’s caching. The fix is often as simple as dropping the system caches.

  • DGX Spark uses UMA (Unified Memory Architecture): CPU and GPU share the same memory.
  • OS Caching: The OS aggressively uses memory for caches, which might not be visible to GPU tools.
  • CUDA vs. Actual Usage: DGX Dashboard’s memory usage (via CUDA API) might show high usage even without a model loaded due to OS caches.
  • The Fix: Clear system caches with `sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'`.
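Before dropping anything, it helps to confirm that the page cache really is what’s eating the memory. A minimal diagnostic sketch, assuming a Linux system (DGX OS is Ubuntu-based):

```shell
# Check whether reclaimable caches account for the "used" memory.
# "buff/cache" in free(1) and "Cached" in /proc/meminfo are memory the
# kernel gives back under pressure, but on a UMA system GPU-side tools
# querying memory through the CUDA API may still report it as occupied.
free -h
grep -E '^(MemTotal|MemFree|MemAvailable|Cached):' /proc/meminfo
```

If `Cached` is large while the DGX Dashboard reports high GPU memory usage on an otherwise idle system, running the `drop_caches` one-liner above should bring the two views back in line. Note that `echo 1` drops only the page cache, `echo 2` drops dentries and inodes, and `echo 3` drops all of them; the preceding `sync` is needed because only clean (already written-back) pages can be dropped.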
