🧠 LLMs - abhik

Neo-X7/Neo-AI: A fully offline AI assistant powered by Ollama. Stores and retrieves conversations using SQLite + LanceDB vector search. No cloud. No API keys. Runs entirely on your machine.

🤖AI Engineering Code

github.com··DEV

How J.A.R.V.I.S. Became the Smartest Mind on Earth — What is an LLM?

🤖AI Engineering Blog

medium.com·

Large companies can add a local LLM filter layer to considerably reduce their AI costs

🤖AI Engineering

umrashrf.github.io··Hacker News

The Shibboleth Effect: Auditing the Cross-Lingual Distributional Skew of Large Language Models

🤖AI Engineering Academic

arxiv.org·

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🤖AI Engineering

deemwar-products.github.io··Hacker News

Why Shrinking an AI Model Often Makes It More Useful

🤖AI Engineering

siliconopera.com·

Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results

🤖AI Engineering

xda-developers.com·

2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash, and MTP

🚀Inference Blog

dnhkng.github.io·

MLPerf and the rise of latency-aware LLM benchmarking

🤖AI Engineering

edn.com·

LLM AI Chatbots are letting me down every single day

🤖AI Engineering

umrashrf.github.io··Hacker News

Deep Learning Weekly: Issue 458

🤖AI Engineering

deeplearningweekly.com·

Alignment Defends LLMs from Property Inference Attacks

🤖AI Engineering Academic

arxiv.org·

I built an open-source persistent memory layer for AI coding agents

🤖AI Engineering Code

github.com··r/GithubCopilot

LLM Research Papers: The 2026 List (January to May)

🤖AI Engineering News

magazine.sebastianraschka.com··Hacker News

MechLens: Late Crystallization of Factual Knowledge Explains Intervention Effectiveness in Language Models

🤖AI Engineering Academic

arxiv.org·

techjarves/Portable-AI-USB: A 100% offline, fully portable, zero-trace AI (Ollama + Llama 3 + AnythingLLM) that runs natively from a USB drive on Windows and Mac.

🤖AI Engineering Code

github.com·

How we fight GPU scarcity without compromise

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

Using local LLMs for agentic coding
