One thing you can do is take the most performance- and correctness-sensitive part of your stack and just ask a chatbot to write it for you. It will sometimes get it right! Toward the end of 2024, Ouyang et al. at Stanford attempted to benchmark how often that happens with KernelBench. DeepSeek R1 […]
