🏆 SOTA Models - foglerek · Scour

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

🧠LLMs Academic

Finetuning masking challenges narrow-task evaluation of cell foundation models

🧠LLMs Academic

The Inference Alpha: Maximizing Frontier Models on AMD

⚡Inference Blog

digitalocean.com

The AI models finding 10,000 vulnerabilities are the same ones China is trying to copy. That is the problem.

🌐Open Source AI News

thenextweb.com

Less-relevant results

Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

✍️Prompt Engineering

lesswrong.com

Task-Seeded Synthetic Q&A Generation for Nemotron Pretraining

🌐Open Source AI Blog

huggingface.co

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

⚡Inference News Blog

developer.nvidia.com

A generalist biomedical vision-language model via multi-CLIP knowledge distillation

🧠LLMs Academic

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

🌐Open Source AI Discussion

news.ycombinator.com · Hacker News

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

⚡Inference Code

github.com · r/LocalLLaMA

If Claude Fable stops helping you, you’ll never know

🎛️Fine-tuning

simonwillison.net · Hacker News

SAR updates its first homegrown AI model - Azərtac

🌐Open Source AI

On-device AI agents hit a hard memory limit. Apple's new architecture routes around it.

venturebeat.com

BRAND ANALYSIS: I IS FOR INSTAGRAM

⚡Inference Blog

nemesisglobal.substack.com

Tracing Eval-Awareness Emergence Through Training of OLMo 3

🎛️Fine-tuning

lesswrong.com

To discover new physics, AI may need to 'unlearn' the old one

Nvidia Nemotron 3 Ultra

🎛️Fine-tuning

research.nvidia.com · Hacker News

Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining

🧠LLMs Academic

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

🎛️Fine-tuning

aermia.com · Hacker News

Sorry, not sorry (Ideogram jailbroken in 1 easy step)

📐Context Engineering

gist.github.com · r/StableDiffusion