Scour
📱 Edge AI
Specific: Model Quantization, ONNX Runtime, Embedded Inference, TinyML
Scoured 171,006 posts in 13.8 ms
Paper: From Edge AI to Adaptive Edge AI · 🏗️ AI Infrastructure · arxiv.org · 5d
Google Released Gemma 4 with a Focus On Local-First, On-Device AI Inference · 🏠 Self-hosted AI · infoq.com · 1d
The Rabbit Hole that is Model Quantization · 💻 Local LLMs · medium.com · 5h
Ultra-efficient on-device AI, now even faster - MiniCPM · 🎚️ Voice AI Systems · producthunt.com · 2d
Fast Isn’t Fast Enough: Redefining Metrics for Edge AI · ⚡ Hardware Acceleration · semiengineering.com · 5d
Strengthening enterprise governance for rising edge AI workloads · 🧠 AI · artificialintelligence-news.com · 1d
Your developers are already running AI locally: Why on-device inference is the CISO’s new blind spot · 🏠 Self-hosted AI · venturebeat.com · 2d
Radar Reference Platform Improves Identification in Edge AI · 🏗️ AI Infrastructure · embedded.com · 4d
F&S M.2 AI Accelerator Uses NXP Ara-240 for Edge Inference Workloads · ⚡ Hardware Acceleration · linuxgizmos.com · 5d
AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators · 🏗️ AI Infrastructure · arxiv.org · 1d
Hardware Utilization and Inference Performance of Edge Object Detection Under Fault Injection · 🏗️ AI Infrastructure · arxiv.org · 1d
A-IO: Adaptive Inference Orchestration for Memory-Bound NPUs · 🏠 Self-hosted AI · arxiv.org · 1d
Networking-Aware Energy Efficiency in Agentic AI Inference: A Survey · 🏗️ AI Infrastructure · arxiv.org · 5d
Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference · 🏗️ AI Infrastructure · arxiv.org · 2d
Multi-Turn Reasoning LLMs for Task Offloading in Mobile Edge Computing · 💻 Local LLMs · arxiv.org · 6d
LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication · 🤝 Federated Learning · arxiv.org · 1d
EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices · ⚙️ LLVM · arxiv.org · 2d
From LLM to Silicon: RL-Driven ASIC Architecture Exploration for On-Device AI Inference · ⚡ Hardware Acceleration · arxiv.org · 5d
Reliable Online Resource Allocation for Multi-User Semantic Communications: A Constraint Bayesian Optimization Approach · 🌐 Distributed Systems · arxiv.org · 1d
SHIELD: A Segmented Hierarchical Memory Architecture for Energy-Efficient LLM Inference on Edge NPUs · 💻 Local LLMs · arxiv.org · 5d