Transformers

Flash Attention: what it does and why it matters

 🧠Deep Learning  Content type: Blog
dev.to··DEV

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

 🤖LLMs  Content type: Code
github.com··Hacker News

InA-Probe: Instruction-Aware Active Probing for Time Series Forecasting with LLMs

 🤖Machine Learning  Content type: Academic
arxiv.org·

Real-Time AttentionBender: Granular Interactive Network Bending of Video Diffusion Transformers

 🎭Anthropic Claude  Content type: Academic
arxiv.org·

mingusb/transformer-golf: The Fully Unrolled Transformer: An experimental repository for architecture simplification and compilation. [2026]

 🤖Machine Learning  Content type: Code
github.com··Hacker News

Headroom: Cut Your LLM Token Usage by Up to 95% Without Changing Your Answers

 🤖LLMs  Content type: Blog
dev.to··DEV

RAPID: Layer-Wise Redundancy-Aware Pruning and Importance-Driven Token Merging for Efficient ViT

 🔗Parser Combinators  Content type: Academic
arxiv.org·

LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write?

 🎭Anthropic Claude  Content type: Blog
dev.to··DEV

tenurehq/precisionMemBench: Precision-aware retrieval benchmark for LLM memory systems.

 📝NLP  Content type: Code
github.com··Hacker News

Gated Bidirectional Linear Attention for Generative Retrieval

 🔍RAG  Content type: Academic
arxiv.org·

Prompt Caching

 💬Prompt Engineering
pub.towardsai.net·

AttentionCap: Transformer Based Capacitance Matrix Learning Toward Full-Chip Extraction

 🤖Machine Learning  Content type: Academic
arxiv.org·

From Soundwaves to Stress Levels: Building an Affective Computing Pipeline with Wav2Vec 2.0

 FastAPI  Content type: Blog
dev.to··DEV

NGram-MoSE: Efficient Remote Sensing Super-Resolution via N-Gram Context and Mixture-of-Experts

 🧠Deep Learning  Content type: Academic
arxiv.org·

SLUUG Talk: Demystifying Large Language Models on Linux

 🤖Machine Learning  Content type: Code
github.com··DEV

What Actually Happens When You Send a Prompt to Claude: A Full Breakdown

 💬Prompt Engineering
pub.towardsai.net·

A Universal Dense Football Event Representation Based on TabTransformer

 🧠Deep Learning  Content type: Academic
arxiv.org·

Run Gemma-4 12B on WSL2 with llama.cpp

 📝NLP  Content type: Blog
dev.to··DEV

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

 📈Optimization  Content type: Academic
arxiv.org·

The AI Cost Crisis: How Startups Can Survive the Tokenpocalypse

 🤖Machine Learning  Content type: Blog
dev.to··DEV
