RISE: Relay Inference and Online Scheduling for Efficient Edge-Device Collaborative Diffusion Model Services
Text-to-image diffusion models are increasingly deployed at the network edge to serve heterogeneous workloads with diverse quality and latency requirements. However, existing deployment strategies choose either large edge-side models with high fidelity but high latency or lightweight device-side models that offer speed at the cost of semantic coherence. Moreover, these approaches rarely split the denoising workload between models of different si...