SOTA Models

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

 🧠LLMs  Content type: Academic
arxiv.org·

Finetuning masking challenges narrow-task evaluation of cell foundation models

 🧠LLMs  Content type: Academic
biorxiv.org·

The Inference Alpha: Maximizing Frontier Models on AMD

 Inference  Content type: Blog
digitalocean.com·

The AI models finding 10,000 vulnerabilities are the same ones China is trying to copy. That is the problem.

 🌐Open Source AI  Content type: News
thenextweb.com·

Less-relevant results

Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

 ✍️Prompt Engineering
lesswrong.com·

Task-Seeded Synthetic Q&A Generation for Nemotron Pretraining

 🌐Open Source AI  Content type: Blog
huggingface.co·

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

 Inference  Content type: News  Content type: Blog
developer.nvidia.com·

A generalist biomedical vision-language model via multi-CLIP knowledge distillation

 🧠LLMs  Content type: Academic
nature.com·

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

 🌐Open Source AI  Content type: Discussion

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

 Inference  Content type: Code
github.com · r/LocalLLaMA

If Claude Fable stops helping you, you’ll never know

 🎛️Fine-tuning
simonwillison.net · Hacker News

SAR updates its first homegrown AI model - Azərtac

 🌐Open Source AI
azertag.az·

On-device AI agents hit a hard memory limit. Apple's new architecture routes around it.

 🤖AI Agents
venturebeat.com·

BRAND ANALYSIS: I IS FOR INSTAGRAM

 Inference  Content type: Blog

Tracing Eval-Awareness Emergence Through Training of OLMo 3

 🎛️Fine-tuning
lesswrong.com·

To discover new physics, AI may need to 'unlearn' the old one

 🤖AI Agents
phys.org·

Nvidia Nemotron 3 Ultra

 🎛️Fine-tuning

Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining

 🧠LLMs  Content type: Academic
arxiv.org·

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

 🎛️Fine-tuning
aermia.com · Hacker News

Sorry, not sorry (Ideogram jailbroken in 1 easy step)

 📐Context Engineering
