Deedy Das’ Post
Partner at Menlo Ventures | Investing in AI startups!
2d
Both Cursor's and Cognition's (Windsurf) new models today are speculated to be built on Chinese base models! – Cognition's SWE-1.5 seems to be a customized (fine-tuned / RL'd) version of Zhipu's GLM 4.6, running on Cerebras. – Cursor's Composer shows Chinese reasoning traces. Interesting to see mature AI products move toward owning the model!
Could also be corrupted Unicode. I've occasionally seen models print to terminals that don't accept UTF-8, and the output gets rendered as Chinese characters.
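The "corrupted Unicode" theory is easy to demonstrate: classic mojibake, where UTF-8 bytes are mis-decoded with a legacy Chinese codec such as GBK, really does surface as Chinese characters. A minimal sketch (the specific string and codec are illustrative choices, not anything confirmed about Cursor's setup):

```python
# Mojibake sketch: non-ASCII text encoded as UTF-8, then read back
# by something that assumes the GBK encoding instead.
text = "café"                  # plain text with one non-ASCII character
raw = text.encode("utf-8")     # "é" becomes the two bytes 0xC3 0xA9
garbled = raw.decode("gbk")    # GBK treats those two bytes as a single
                               # two-byte CJK character
print(garbled)                 # the "é" now renders as a Chinese character
```

So Chinese characters in a trace are not, by themselves, proof of a Chinese base model; they can also be an encoding mismatch somewhere in the pipeline.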
We’re seeing the same trend in our conversations with prospects using the PipesHub platform. Many of them prefer running open-source base models on their own infrastructure instead of relying on black-box APIs. Just yesterday, in a call with an early Databricks employee, we talked about how enterprises are increasingly leaning toward open-source platforms, whether for LLMs, enterprise search, workflow automation, or agent builders.
Maybe it has something to do with the fact that Chinese characters are more information-dense than English tokens.
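There is a kernel of truth to the density point, at least at the character level (actual token counts depend on the tokenizer, which this rough sketch does not measure):

```python
# Rough illustration of character density: the same concept often needs
# far fewer Chinese characters than English letters. Not a tokenizer-level
# measurement; token counts would require a specific tokenizer.
pairs = [
    ("artificial intelligence", "人工智能"),
    ("computer", "电脑"),
]
for en, zh in pairs:
    print(f"{en!r}: {len(en)} chars vs {zh!r}: {len(zh)} chars")
```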
Zhipu’s GLM 4.6 demonstrates exceptional performance in agentic tasks. In fact, we are leveraging a fine-tuned variant of it within our platform. Currently we are building an AI-native course creation system designed for modern LMS platforms that enables organizations to generate tailor-made explanatory videos and micro-lessons with just a few contextual prompts. Our mission is to help organizations deliver personalized training and upskilling experiences for their employees in this rapidly evolving era of AI.
Interesting choice to use zai
Deedy Das, thanks for the comment about the actual models used. The internal research team commenting on Hacker News was very evasive about the base model, which was another data point for me that it was very likely a Chinese model. Strangely enough, if they'd come right out and said it was GLM, I would have assumed their new model was worth my time, because I know just how strong GLM 4.6 is, rather than just sticking with Codex/Claude Code. The Qwen series in particular is a great set of models to start fine-tuning on, because they are very strong and come in a diversity of sizes and modalities.
In a way this is a bit hilarious, since I remember reading somewhere months ago that this kind of scenario was inevitable. Any high-performing open-source model (most of which are from China nowadays) is going to be increasingly adopted and customized by frontier AI companies. That in itself becomes a soft moat, due to growing reliance on these open-source models and their updates. OpenAI and Anthropic ship closed, proprietary models and will continue to do so; their business revolves around that. Meta was the only major company providing a great open-source contender with the resources to run its own training + compute farms. I hoped Meta would continue down that path, but that is looking lackluster after the Llama 4 debacle. Maybe Nvidia can be the one to push Meta to release Llama 5. The Super and Ultra Nemotron models are amazing in terms of performance; Nvidia took Llama 3.3 and turbocharged it via additional fine-tuning and other tweaks.
What are the other options? Fewer effective American models are open source, and it is costly to train a model from scratch. If China provides high-quality open-source models, is it right to praise them, or to cockily declare that Chinese models have merely been fine-tuned? The rest of the world is acting capitalistic with their LLMs, which is not good for global hegemony. All thanks to the Chinese open-source developer community. Hats off.
It’s so simple. A great AI product experience is because of the model training.
Same here! We'll soon completely decommission Anthropic, especially Claude 4.5: expensive and unreliable at best. I used to recommend them, but now I don't want to use them anymore. My bootstrapped company has better SLAs than Claude, and what used to cost $5 a few months ago is now $25.