๐Ÿฟ๏ธ ScourBrowse
🧠 LLM Inference

Quantization, Attention Mechanisms, Batch Processing, KV Caching
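As a generic orientation to one of the tags above, KV caching, here is a minimal single-head sketch in plain Python/NumPy. It is not drawn from any item linked below; every name in it (CachedAttention, d_model, step) is invented for the illustration.

# Minimal sketch of KV caching for single-head autoregressive attention.
# Generic illustration only; all names here are made up for the example.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class CachedAttention:
    """Appends each new token's key/value to a cache, so decode step t
    attends over the full prefix while projecting only the newest token."""
    def __init__(self, d_model, rng):
        self.Wq = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.Wk = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.Wv = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.k_cache = []  # one (d_model,) key vector per past token
        self.v_cache = []  # one (d_model,) value vector per past token

    def step(self, x):
        # x: (d_model,) embedding of the newest token only.
        q, k, v = x @ self.Wq, x @ self.Wk, x @ self.Wv
        self.k_cache.append(k)
        self.v_cache.append(v)
        K = np.stack(self.k_cache)            # (t, d_model)
        V = np.stack(self.v_cache)            # (t, d_model)
        scores = K @ q / np.sqrt(len(q))      # (t,) scaled dot-product scores
        return softmax(scores) @ V            # attention output for this step

rng = np.random.default_rng(0)
attn = CachedAttention(d_model=16, rng=rng)
for token_embedding in rng.standard_normal((5, 16)):  # 5 decode steps
    out = attn.step(token_embedding)
print(out.shape)  # (16,)

The cache is what keeps per-step cost linear in context length: each step reuses the previously computed key/value rows instead of re-encoding the whole prefix.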

When LLM Meets Time Series: Can LLMs Perform Multi-Step Time Series Reasoning and Inference
arxiv.org·22h
🏆 LLM Benchmarking
Understanding Transformers Using a Minimal Example
rti.github.io·10h·
Discuss: Hacker News, r/programming
📊 Embeddings
Yet Unnoticed in LSTM: Binary Tree Based Input Reordering, Weight Regularization, and Gate Nonlinearization
arxiv.org·22h
📉 Embeddings Optimization
MoE Inference Economics from First Principles
tensoreconomics.com·23h·
Discuss: Hacker News
📊 Model Serving Economics
Solving Deepfakes with Traces, Frequency, and Attention!
pub.towardsai.net·13h
🔢 BitNet
Natural Latents: Latent Variables Stable Across Ontologies
lesswrong.com·1h
🧩 Types
SCOUT: Toward Sub-Quadratic Attention via Segment Compression for Optimized Utility in Transformers
arxiv.org·22h
🗜️ Vector Compression
Convolutional Denoising Autoencoders for Diagnostic Images
haydenramm.bearblog.dev·14h
🔢 BitNet
Variational Uncertainty Decomposition for In-Context Learning
arxiv.org·22h
🔍 Information Retrieval
Brain-wide representations of prior information in mouse decision-making
nature.com·10h
🔍 AI Interpretability
Show HN: Higher-order transform streams: 10x faster AI with recursive prompts
timetler.com·10h·
Discuss: Hacker News, r/programming
💾 Prompt Caching
LLM Encoder vs. Decoder: Robust Detection of Chinese AI-Generated Text with LoRA
arxiv.org·22h
📝 Text Compression
Knowledge-integrated AutoEncoder Model
arxiv.org·22h
📊 Embeddings
Multitask Battery Management with Flexible Pretraining
arxiv.org·22h
💾 Prompt Caching
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic
arxiv.org·22h
🧮 SMT Solvers
An Efficient GNNs-to-KANs Distillation via Self-Attention Dynamic Sampling with Potential for Consumer Electronics Edge Deployment
arxiv.org·22h
🗜️ Zstd
How to Vibe Code Effectively
ibrahimahmed.ca·5h·
Discuss: Hacker News
🪄 Prompt Engineering
Fantastic Pretraining Optimizers and Where to Find Them
arxiviq.substack.com·14h·
Discuss: Substack
๐Ÿ†LLM Benchmarking
Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMs
arxiv.org·22h
🏆 LLM Benchmarking
The Term "Non-Deterministic" and LLMs
vishalbakshi.github.io·1h·
Discuss: Hacker News
🕳️ LLM Vulnerabilities