Run LLMs Locally
ikangai.com·1d·
Discuss: Hacker News
⚙️Query Compilers
Beyond Pinecone: A Developer's Deep Dive into the Top 10 Vector Databases for GenAI in 2024
dev.to·4h·
Discuss: DEV
DataFusion
The Production Generative AI Stack: Architecture and Components
thenewstack.io·1d
📊Data Lineage
Co-Optimizing GPU Architecture And SW To Enhance Edge Inference Performance (NVIDIA)
semiengineering.com·1d
📈Performance Profiling
Understanding multi GPU Parallelism paradigms
datta0.github.io·1d·
Discuss: Hacker News
🔢NumPy
Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·3d·
Discuss: DEV
💾Cache Optimization
Inside Pinecone: Slab Architecture
pinecone.io·2d·
Discuss: Hacker News
🧮Apache Calcite
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·2d·
Discuss: Hacker News
💾Cache Optimization
Why Multimodal AI Broke the Data Pipeline — And How Daft Is Beating Ray and Spark to Fix It
hackernoon.com·4d
🔢NumPy
Unlock 2x better price-performance with Axion-based N4A VMs, now in preview
cloud.google.com·1d
🏛️Lakehouse Architecture
Physics informed machine learning based predictive control for intelligent operation of edge datacenters
sciencedirect.com·5d
🎮Reinforcement Learning
Optimizing Datalog for the GPU
dl.acm.org·1d·
Discuss: Lobsters
DataFusion
Enabling Trillion-Parameter Models on AWS EFA
research.perplexity.ai·2d·
Discuss: Hacker News
🌊Apache Flink
Query Compilation Isn't as Hard as You Think
databasearchitects.blogspot.com·1d
⚙️Query Compilers
Which Chip Is Best?
blog.confident.security·21h·
Discuss: Hacker News
📈Performance Profiling
Hydra: Dual Exponentiated Memory for Multivariate Time Series Analysis
arxiv.org·3d
🧊Iceberg Tables
Supercharging the ML and AI Development Experience at Netflix
netflixtechblog.com·2d
🌊Apache Flink
The state of SIMD in Rust in 2025
shnatsel.medium.com·2d
SIMD Optimization
Why Code Execution is Eating Tool Registries
levelup.gitconnected.com·16h·
Discuss: r/programming
DataFusion
We found embedding indexing bottleneck in the least expected place: JSON parsing
nixiesearch.substack.com·3d·
Discuss: Substack
📋Tokei