🧠 LLM Inference - emschwartz · Scour

How we cut Vertex AI latency by 35% with GKE Inference Gateway

cloud.google.com·4d

🧠Inference Serving

Bulk RRAM Could Be AI’s Memory Wall Solution

spectrum.ieee.org·2d·

Discuss: r/hardware

🧠Memory Hierarchy Design

Unlocking Knowledge with AI

zappable.com·2d

🛡️AI Security

jordivillar.com·3d

💾Persistence Strategies

Using Chisanbop with Memory Palaces

forum.artofmemory.com·2d

🏊Memory Pools

Lekh AI v2.0 is out – Big offline AI update, Better memory and llama GGUF models support. Mac app coming next week.

apps.apple.com·2d·

Discuss: r/LocalLLaMA

How can computing for AI and other demands be more energy efficient?

techxplore.com·3d

Tutorial on Agentic Engine

pori.vanangamudi.org·2d·

Discuss: r/LocalLLaMA

🛡️Open Policy Agent

From Chunks to Connections: The Case for Graph RAG

pub.towardsai.net·2d

🔄LLM RAG Pipelines

The nature of LLM algorithmic progress

lesswrong.com·5d

🏆LLM Benchmarking

Reinforcement Inference: Leveraging Uncertainty for Self-Correcting Language Model Reasoning

arxiv.org·1d

🏗️LLM Infrastructure

LLMs are Getting a Lot Better and Faster at Finding and Exploiting Zero-Days

schneier.com·2d

🕳LLM Vulnerabilities

Adaptive Retrieval helps Reasoning in LLMs -- but mostly if it's not used

arxiv.org·1d

🔄LLM RAG Pipelines

State of AI: Bi-Annual Snapshot

iconiqcapital.com·1d

World Models and the Data Problem in Robotics

joeljang.github.io·1d·

Discuss: Hacker News

Nexus AI – A Chrome extension that understands and summarizes the page

nexusbrowserai.com·1d·

Discuss: Hacker News

I Let AI Agents Train Their Own Models. Here's What Actually Happened.

hamzamostafa.com·2d·

Discuss: Hacker News

A Language For Agents

lucumr.pocoo.org·2d·

Discuss: Lobsters, Hacker News

💻Programming languages

AI is a High Pass Filter for Software Delivery

bryanfinster.substack.com·1d·

Discuss: Substack

🔍AI Interpretability

Mathematical Resolution of P vs NP through Informational Noise Subtraction and Linear O(n) Mapping

zenodo.org·4d·

Discuss: Hacker News

🧮SMT Solvers
