The DEVONtechnologies Blog
November 11, 2025 — Jim Neumann

If you want to use AI in DEVONthink, you can choose between server-based models (the big ones) and local models. Whether it’s for privacy reasons or just the novelty, running AI locally on your Mac may be something you’re considering. Here are some thoughts on choosing a local model.
First, you’ll need an AI application to host and run the model. The two directly supported in DEVONthink are LM Studio and Ollama. On either app’s website, look for a Models link where you can browse a curated list of models. For instructions on downloading a model, see each app’s documentation.
Running local AI effectively depends largely on how much memory (RAM) your machine has. The more memory available, the larger the model you can use. As a non-technical rule of thumb: take 50 to 70 percent of your total RAM in gigabytes and use that number as the model’s parameter count in billions. For example, on a Mac with 16 GB of RAM, 50% is 8 and 70% is 11.2, so look for models with 8B (billion) up to 12B in the name. These will not be as speedy or as accurate as commercial models, since that’s actually not many parameters compared to the large models, but using them also shouldn’t lock up your machine from a lack of resources.
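If you’d like to see that rule of thumb spelled out, here is a minimal sketch in Python. It simply applies the 50 to 70 percent calculation described above; the percentages and the 16 GB example are the rule of thumb from this post, not hard limits.

    def suggested_parameter_range(total_ram_gb: float) -> tuple[float, float]:
        """Rule of thumb: 50-70% of RAM (in GB) as the model size in billions of parameters."""
        return total_ram_gb * 0.5, total_ram_gb * 0.7

    low, high = suggested_parameter_range(16)  # a Mac with 16 GB of RAM
    print(f"Look for models roughly between {low:.0f}B and {high:.1f}B parameters.")
    # Look for models roughly between 8B and 11.2B parameters.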
Also, look at the information provided about the model you’re considering. Models are built with certain functions in mind. You may see vision, meaning the model can examine images; tooling or tool use, meaning it may be able to perform certain actions, not just answer questions; or thinking/reasoning, meaning the model can reflect on its responses and revise them before giving you a reply. Some models support multiple functions.
If Ollama is running, DEVONthink can chat with your local model through it. In LM Studio, click the Power User button at the bottom of the window, then the green prompt icon, and flip the switch to start the server; it should then be available to chat with too. As a bonus tip: choose one app or the other, but don’t try to run both simultaneously.
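If you want to confirm the local server is actually reachable before pointing DEVONthink at it, a quick sketch like the one below can help. It assumes the apps’ default ports (11434 for Ollama, 1234 for LM Studio’s server); if you’ve changed those settings, adjust the URLs accordingly.

    from urllib.request import urlopen
    from urllib.error import URLError

    # Default local endpoints; adjust if you changed the port in either app.
    endpoints = {
        "Ollama": "http://localhost:11434/api/tags",     # lists installed models
        "LM Studio": "http://localhost:1234/v1/models",  # OpenAI-compatible model list
    }

    for name, url in endpoints.items():
        try:
            with urlopen(url, timeout=3) as response:
                print(f"{name} is running (HTTP {response.status})")
        except URLError:
            print(f"{name} does not appear to be running at {url}")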
We’ve also written an introductory post on local AI, which you may want to take a look at.