Anukari on the CPU (part 2: CPU optimization)
anukari.com·3h·

Captain’s Log: Stardate 79317.7

In part 1 of this series, I mentioned that I was surprised to find that naively running my GPU code on the CPU was only 5x slower, when I had expected it to be 100x slower. In this post I will explain how I ended up making the CPU implementation much faster than the GPU one.

First approach: spot-vectorization

As mentioned in part 1, I got the original GPU code compiled for the CPU, and then wrote a simple driver to call into this code and run the simulation (in lieu of the code that set up and invoked the GPU kernel).
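To make the idea of a "simple driver" concrete, here is a minimal sketch of what such a driver could look like. It is not Anukari's actual code: `SimState`, `kernel_step`, and `run_on_cpu` are hypothetical names, and the one-line physics update is only a placeholder for the real kernel body. The assumption is that the kernel, once compiled for the CPU, takes its work-item index as an explicit argument instead of asking the GPU runtime for it, so a plain loop can stand in for the GPU dispatch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simulation state: one position and one velocity per work item.
struct SimState {
    std::vector<float> position;
    std::vector<float> velocity;
};

// Placeholder for the kernel body compiled for the CPU. On the GPU the
// work-item index would come from the runtime; here it is passed in.
inline void kernel_step(SimState& s, std::size_t item, float dt) {
    s.position[item] += s.velocity[item] * dt;
}

// Hypothetical driver: replaces "set up and invoke the GPU kernel" with a
// serial loop over time steps and work items.
inline void run_on_cpu(SimState& s, std::size_t steps, float dt) {
    assert(s.position.size() == s.velocity.size());
    const std::size_t n = s.position.size();
    for (std::size_t step = 0; step < steps; ++step) {
        for (std::size_t item = 0; item < n; ++item) {
            kernel_step(s, item, dt);
        }
    }
}
```

Calling `run_on_cpu(state, steps, dt)` advances every item by `steps` time steps. The inner loop over `item` is the part that a GPU would execute in parallel, which also makes it the natural place to look for vectorization opportunities on the CPU.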

As you might imagine, Anukari, being a 3D physics simulation, does a lot of ari…
