FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation
Device-side Large Language Models (LLMs) have grown explosively, offering stronger privacy and higher availability than their cloud-side counterparts. During LLM inference, both the model weights and the user data are valuable, and attackers may compromise the OS kernel to steal them. ARM TrustZone is the de facto hardware-based isolation technology on mobile devices, used to protect sensitive applications from a compromised OS. However, prote...