Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
⚡High Performance Computing
What I learned building Python notebooks to run any AI model (LLM, Vision, Audio) — across CPU, GPU, and NPU
⚡High Performance Computing
AI Function Calling: Composing and Decomposing Functions for Complex Tasks
💻Programming
Topographical sparse mapping: A training framework for deep learning models
⚡High Performance Computing
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models
arxiv.org·1d
💻Programming
I'm the author of LocalAI (the local OpenAI-compatible API). We just released v3.7.0 with full Agentic Support (tool use!), Qwen 3 VL, and the latest llama.cpp
💻Programming
I Taught an AI to Dream
🏛️Software Architecture Patterns
Show HN: Oodle – Unified Debugging with OpenSearch and Grafana
🏛️Software Architecture Patterns
Ranking LLMs based on 180k French votes (French government's AI arena)
🏛️Software Architecture Patterns
How We Built a Custom Vision LLM to Improve Document Processing at Grab
⚡High Performance Computing