This is an in-depth post on bugs in AI software, and in AI compilers specifically, and on how to prevent them. I was the software lead for TPUv3 at Google, and I’ve worked on a variety of AI compilers and projects across Google, Nvidia, Amazon and Facebook.

Zero is a hard number

In my estimation, XLA has the most comprehensive test suite of any ML compiler, so I heartily recommend XLA for mission-critical AI. XLA has been used for most Google AI for a decade, and it is highly reliable. Yet even XLA has bugs that escape into the wild for customers to encounter. The number of bugs is not zero, not even for XLA.

Anthropic published this report, diagnosing a bug in an XLA op as one of the causes of the A…
