DPIFrame: A Dual-Level Parallelism Acceleration Framework for CTR Model Inference
Deep learning has improved the ability of click-through rate (CTR) prediction models to learn features and has raised their prediction accuracy. However, deploying CTR models on GPUs and running inference efficiently remains challenging, because there is a large mismatch between the serial computational pattern and the parallel model structure. In this paper, we propose DPIFrame, the first dual parallelizable framework to accelerate CTR m...