✨ Model optimizations in LLMs - pleto · Scour

STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control

🔧Systems-level optimizations for LLM serving Academic

Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation

🔢Quantization of LLMs Academic

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

🧠Large Language Models (LLMs) Academic

BenDi: An Energy-Efficient Quasi-Stochastic Systolic Architecture for Edge Bioelectronics

📊AI Performance Profiling Academic

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

🧠Large Language Models (LLMs) Academic

SNN-MLIR: An MLIR Dialect for Compiling Neuromorphic SNNs from NIR to Bare-Metal C

📊AI Performance Profiling Academic

Semantic Grading of Written Answers in Low-Resource Language Bangla Using a Fine-Tuned Lightweight Language Model

🔢Quantization of LLMs Academic

ColBERTSaR: Sparsified ColBERT Index via Product Quantization

🔍Retrieval-augmented generation Academic

SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving

🧠Large Language Models (LLMs) Academic

MOTOR: Learning ID-free Item Representation with Token Crossing for Embedding-based Multimodal Recommendation

🧠Large Language Models (LLMs) Academic

Automated IEP Generation from Traditional Chinese Parent-Teacher Interviews via Corpus-Grounded Feature Diffusion

🧠Large Language Models (LLMs) Academic

Beyond Generative Decoding: Discriminative Hidden-State Readout from a Native Omni-Modal LLM for Multimodal Sentiment Analysis

🔢Quantization of LLMs Academic

Benchmarking Neural Speech Compression from a Rate-Distortion Perspective

🧠Large Language Models (LLMs) Academic

EditSSC: Toward Editable Semantic Occupancy Scenes with Unconditional Diffusion Models

🔍Retrieval-augmented generation Academic

Large-scale empirical tuning and comparison of default optimizers for variational inference

🧠Large Language Models (LLMs) Academic

RedditPersona: A Modular Framework for Community-Conditioned LLM Adaptation from Reddit

🔢Quantization of LLMs Academic

Next-Token Prediction Learns Generalisable Representations of Sleep Physiology

🧠Large Language Models (LLMs) Academic

EgoPressDiff: Multimodal Video Diffusion for Egocentric UV-Domain Hand-Pressure Estimation

⚡Real-time AI Systems Academic

BioVid: Autoregressive Video Generation with Biological Behavior Semantic Comprehension

🧠Large Language Models (LLMs) Academic

Learned Subspace Compression for Communication-Efficient Pipeline Parallelism

🌐Distributed LLM Systems Academic
