🤖 Transformers - jyunzhang · Scour

Flash Attention: what it does and why it matters

🧠Deep Learning Blog

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖LLMs Code

github.com · Hacker News

InA-Probe: Instruction-Aware Active Probing for Time Series Forecasting with LLMs

🤖Machine Learning Academic

Real-Time AttentionBender: Granular Interactive Network Bending of Video Diffusion Transformers

🎭Anthropic Claude Academic

mingusb/transformer-golf: The Fully Unrolled Transformer: An experimental repository for architecture simplification and compilation. [2026]

🤖Machine Learning Code

github.com · Hacker News

Headroom: Cut Your LLM Token Usage by Up to 95% Without Changing Your Answers

🤖LLMs Blog

RAPID: Layer-Wise Redundancy-Aware Pruning and Importance-Driven Token Merging for Efficient ViT

🔗Parser Combinators Academic

LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write?

🎭Anthropic Claude Blog

tenurehq/precisionMemBench: Precision-aware retrieval benchmark for LLM memory systems.

📝NLP Code

github.com · Hacker News

Gated Bidirectional Linear Attention for Generative Retrieval

🔍RAG Academic

Prompt Caching

💬Prompt Engineering

pub.towardsai.net

AttentionCap: Transformer Based Capacitance Matrix Learning Toward Full-Chip Extraction

🤖Machine Learning Academic

From Soundwaves to Stress Levels: Building an Affective Computing Pipeline with Wav2Vec 2.0

⚡FastAPI Blog

NGram-MoSE: Efficient Remote Sensing Super-Resolution via N-Gram Context and Mixture-of-Experts

🧠Deep Learning Academic

SLUUG Talk: Demystifying Large Language Models on Linux

🤖Machine Learning Code

github.com · DEV

What Actually Happens When You Send a Prompt to Claude: A Full Breakdown

💬Prompt Engineering

pub.towardsai.net

A Universal Dense Football Event Representation Based on TabTransformer

🧠Deep Learning Academic

Run Gemma-4 12B on WSL2 with llama.cpp

📝NLP Blog

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

📈Optimization Academic

The AI Cost Crisis: How Startups Can Survive the Tokenpocalypse

🤖Machine Learning Blog
