DeepSeek V4 - almost on the frontier, a fraction of the price

Covered by indiehacker.news and Second Thoughts. Discussed on Hacker News.

Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both are Mixture of Experts models with a 1 million token context window. Pro is 1.6T total parameters, 49B active. Flash is 284B total, 13B active. Both use the standard MIT license. I think this makes DeepSeek-V4-Pro the new largest open weights model. It's larger than Kimi ...

Covered in 2 articles

indiehacker.news · #048 - Anthropic buys OpenAI's SDK shop for $300M, Musk's $134B suit dies in 2 hours

Second Thoughts · From Compute Overhang to Compute Crunch
