RT by @awnihannun:
I've been working on bringing some exciting new beta features to LM Studio's MLX engine. Vision models now have automatic prefix caching, prompt cache disk-offloading, and continuous batching. If these sound interesting to you, try it out and lmk what you think.