Transformers, CNNs, Model Design, Deep Learning
Optimizing LLMs for Performance and Accuracy with Post-Training Quantization
developer.nvidia.com·3d
LLM Inference: Core Bottlenecks Imposed By Memory, Compute Capacity, Synchronization Overheads (NVIDIA)
semiengineering.com·1d
Good Learners Think Their Thinking: Generative PRM Makes Large Reasoning Model More Efficient Math Learner
arxiv.org·2d
Mamba-based Efficient Spatio-Frequency Motion Perception for Video Camouflaged Object Detection
arxiv.org·2d