A Two-Stage GPU Kernel Tuner Combining Semantic Refactoring and Search-Based Optimization
arxiv.org·1d
Using Local LLMs to Discover High-Performance Algorithms
towardsdatascience.com·2d
meta-pytorch/segment-anything-fast: A batched offline inference oriented version of segment-anything
github.com·57m
Making a Language
thunderseethe.dev·11h
Streamlining CUB with a Single-Call API
developer.nvidia.com·12h
Playing with GPT-3, LangChain, and the OpenAI Embeddings API
shruggingface.com·1d
Hippocampus model implementing a Turing machine
pub.towardsai.net·5h
Why AI Needs GPUs and TPUs: The Hardware Behind LLMs
blog.bytebytego.com·2d
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
machinelearning.apple.com·1d
Co-optimization Approaches For Reliable and Efficient AI Acceleration (Peking University et al.)
semiengineering.com·16h
TypeScript levels up with type stripping
infoworld.com·51m