CloakLM: Obfuscating GPU Memory Layout to Mitigate Model Exfiltration for Serving
Large foundation models deployed on third-party and shared accelerator infrastructure face a practical risk of model exfiltration that existing defenses do not fully address. In common serving deployments, model providers control the VM or bare-metal serving stack but not the surrounding hardware substrate. The host-to-GPU interconnect, accelerator fabric, and neighboring infrastructure components remain outside the tenant's trust boundary and h...