Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
⚡High Performance Computing
What I learned building Python notebooks to run any AI model (LLM, vision, audio) across CPU, GPU, and NPU
⚡High Performance Computing
Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries
arxiv.org·23h
⚡High Performance Computing
AI Function Calling: Composing and Decomposing Functions for Complex Tasks
💻Programming
Topographical sparse mapping: A training framework for deep learning models
⚡High Performance Computing
I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp
💻Programming
Show HN: ReadMyMRI DICOM native preprocessor with multi model consensus/ML pipes
⚡High Performance Computing
I Taught an AI to Dream
🏛️Software Architecture Patterns
Show HN: Oodle – Unified Debugging with OpenSearch and Grafana
🏛️Software Architecture Patterns
Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving
arxiv.org·23h
💻Programming