Our newest model: Chandra (OCR)
🏎️TensorRT
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
arxiv.org·4d
🧩Attention Kernels
To grow, we must forget… but now AI remembers everything
doc.cc·19h
🧩Attention Kernels
After distractions, rotating brain waves may help thought circle back to the task
medicalxpress.com·1d
⚡Flash Attention
Abstract: Traumatic brain injury (TBI) significantly increases the long-term risk of Alzheimer’s disease (AD). Early identification of biomarkers predict...
freederia.com·2d
🧠BF16
A unified threshold-constrained optimization framework for consistent and interpretable cross-machine condition monitoring
sciencedirect.com·13h
⏱️Benchmarking
Sparse Adaptive Attention “MoE”: How I Solved OpenAI’s $650B Problem With a £700 GPU
⚡Flash Attention
🧠 Soft Architecture (Part B): Emotional Timers and the Code of Care (Part 5 of the SaijinOS series)
🤖AI Coding Tools
Metis-SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start
arxiv.org·2d
🏎️TensorRT
Decoding non-invasive brain activity with novel deep-learning approaches
arxiv.org·3d
🧩Attention Kernels
Deep Learning — 7: Optimize your Neural Networks through Dropout & Regularization
pub.towardsai.net·2d
📊Gradient Accumulation
Specialized structure of neural population codes in parietal cortex outputs
nature.com·1d
🧩Attention Kernels