Model Quantization, Inference Optimization, GGUF Format, Privacy-preserving AI

MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines
arxiv.org·1d
🧮Z3 Applications
Just How Resilient Are Large Language Models?
rdrocket.com·49m·Discuss: Hacker News
💾Persistence Strategies
Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs
arxiv.org·11h
Performance Mythology
Friday 24 October - 11am
informatics.ed.ac.uk·2h
💻Programming languages
Markov chains are the original language models
dev.to·2h·Discuss: DEV
🔗Monadic Parsing
LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures
arxiviq.substack.com·2d·Discuss: Substack
🧠Learned Codecs
Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
arxiv.org·1d
🌊Streaming Algorithms
Research Round Up: On Anonymization - Creating Data That Enables Generalization Without Memorization
hackernoon.com·2d
🛡️Differential Privacy
LLM Features That Ship: Extraction, Generation, and Classification
alex-jacobs.com·1d·Discuss: Hacker News
🌀Brotli Internals
Scaling Speculative Decoding with Lookahead Reasoning
hao-ai-lab.github.io·1d
🔧Reed-Solomon Decoders
CCQA: Generating Question from Solution Can Improve Inference-Time Reasoning in SLMs
arxiv.org·11h
🌳Context free grammars
EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs
arxiv.org·1d
🧮Vector Embeddings
Local-deepthink – perform ultra long thinking using a society of agents (QNN)
github.com·1d·Discuss: Hacker News
Incremental Computation
Preserving Node-level Privacy in Graph Neural Networks
arxiv.org·1d
🛡️Differential Privacy
Secure Confidential Business Information When Sharing Machine Learning Models
arxiv.org·1d
🔒Privacy Preserving
Generative AI Myths, Busted: An Engineer’s Quick Guide
towardsdatascience.com·21h
Proof Automation
Evaluating the Effectiveness and Scalability of LLM-Based Data Augmentation for Retrieval
arxiv.org·1d
🔍Information Retrieval
Catching the Details: Self-Distilled RoI Predictors for Fine-Grained MLLM Perception
arxiv.org·1d
📊Learned Metrics