Tokenization
ABLE: Representing and Mapping LLMs via Attribution-Based Large-model Embedding
🤖LLM Content type: Academic
lbj96347/nemotron-3.5-asr-ios: On-device, offline speech recognition for iPhone/iPad using NVIDIA's Nemotron-3.5-ASR Streaming 0.6B (multilingual) via CoreML. SwiftUI app with mic capture + audio file import, RNN-T decoding, and live benchmark metrics (latency, RTF, memory).
🤖Data science Content type: Code
SIDInspector: A Mapping-First Diagnostic Resource for Semantic-ID Tokenizers
🪟Context Windows Content type: Academic
UniDexTok: A Unified Dexterous Hand Tokenizer from Real Data
💬Natural Language Processing Content type: Academic
F3-Tokenizer: Taming Audio Autoencoder Latents for Understanding and Generation
🤖LLM Content type: Academic
HybridCodec: Fast Dual-Stream, Semantically Enhanced Neural Audio Codec
🔢Embeddings Content type: Academic
DREAM: Dynamic Refinement of Early Assignment Mappings
💬Natural Language Processing Content type: Academic
Reversible Foundations: Training a 120B Sparse MoE through State-Preserving Scaling
🤖LLM Content type: Academic