💬 LLMs - astropanhong · Scour

Don't dethrone consciousness

🤖Machine Learning News

theintrinsicperspective.com··Hacker News

Agentic AI for Insurance Underwriting: Beyond Chatbots and Prompts

🤖AI Blog

blog.nashtechglobal.com·

A wild idea: Abstract reality using ontology

🔗Obsidian Discussion

news.ycombinator.com··Hacker News

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🤖Machine Learning

deemwar-products.github.io··Hacker News

AIs like ChatGPT fall apart in classic 'Stroop' psychological test — and that could stand in the way of achieving artificial general intelligence

AI Agents Running Businesses: Andon Labs on Project Vend

🤖Machine Learning

startuphub.ai·

Can News Predict the Market? Limits of Zero-Shot Financial NLP and the Role of Explainable AI

🤖Machine Learning Academic

I used ChatGPT and Gemini side-by-side for a month on Android, and only one behaved like a senior AI tool

androidpolice.com·

I finally built the central AI hub I've been wanting, and Open WebUI made it stupidly simple

xda-developers.com·

bigattichouse/packed-twin-inference: PTI achieves ~2× throughput using a single quantized model (Q5_K_M or better) by running 4 generation streams in one batched decode call. The GPU loads model weights once per step and produces 4 predictions simultaneously. KV cache overhead is ~0.8 GiB total for all 4 streams. No draft model. No quality loss

🔥PyTorch Code

github.com··r/LocalLLaMA

Model Evaluations: Prove Your Routing Policy Actually Works

🤖Machine Learning Blog

digitalocean.com·

Show HN: Axiomax – Cryptographic proof of AI inference carbon footprint

axiomaxllc.com··Hacker News

GPU Servers for Best Performance

🤖Machine Learning

leaseweb.com··DEV

GraphInfer-Bench: Benchmarking LLM's Inference Capability on Graphs

🔥PyTorch Academic

Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results

xda-developers.com·

Appraising Artworks with Joins and LLMs (Ultorg Database UI)

ultorg.com··Hacker News

Why agentic AI needs an open inference stack

defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

🤖AI Code

github.com··Hacker News

Claude vs GPT-4: Which AI API Is Better for Developers? (2026)

kalyna.pro··DEV

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

🤖Machine Learning

local-llm.utop.workers.dev··Hacker News
