The state of the art in LLM streaming is surprisingly bad.

Say you go to Google’s Gemini UI, start a conversation, and then switch to a new chat. When you come back to your original chat, the stream will be completely dead until you refresh the page. Claude does a little better, but not by much: if you start a conversation and then refresh mid-stream, it will send you back to the home page! With both providers, the only way to continue your conversation is to wait until you guess the stream is done, then refresh the page.

We wanted to do better at Stardrift. Actually, we had to. Our tasks run for minutes at a time: our AI travel agent makes dozens of tool calls, runs searches, and then does more data enrichment mid-stream as it answers user queries. We need to show the user …
