🤖 Large Language Model - zhi.li · Scour

What Are Tokens in LLMs?

🤖AI Blog

bearisland.dev · Hacker News

Why Your LLM Gets Dumber With More Context

📊Dataset Curation

siliconopera.com

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

🤖AI Blog

adambien.blog

From Chatbot Hallucinations to Deterministic Agents: Forcing Local LLMs to Run Production-Grade…

🤖AI Blog

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

👁️Computer Vision

uccl-project.github.io · Hacker News

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

👁️Computer Vision

aermia.com · Hacker News

The Tech Download: Mistral's Arthur Mensch on agentic AI, chips and enterprise adoption

👁️Computer Vision News

Does ChatGPT need a psychiatrist? Similarities between human psychopathology and errors in large language models

📊Dataset Curation Academic

nature.com · Hacker News

France’s Mistral in Funding Talks at About €20 Billion Valuation

🤖AI News

You don't need Copilot for code completion, try this instead

mistral.ai · r/GithubCopilot

Ask HN: Any Local LLM can I run without GPU for Local Agentic workflow AI?

🤖AI Discussion

news.ycombinator.com · Hacker News

massimo92/spark: CLI tool for serving LLMs with vLLM on NVIDIA DGX Spark. One file, zero friction.

🤖AI Code

github.com · Hacker News

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🤖AI Blog

adambien.blog

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

🤖AI Blog

blogs.nvidia.com

DiffusionGemma: Discrete diffusion in a large language model

👁️Computer Vision

idlemachines.co.uk · Hacker News

I ran local LLMs on my phone for a month, and now my desktop setup feels like overkill

xda-developers.com

Report: GKE Inference Gateway delivers up to 92% faster AI responses

🤖AI Blog

cloud.google.com · Hacker News

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

zozo123.github.io · Hacker News

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

🤖AI Academic

Mi50 32GB / GFX906 - vLLM Qwen 3.5 Configuration for Qwen 3.5:9B AWQ-4bit

huggingface.co · r/LocalLLaMA
