Building Language Tech for Meghalaya: Lessons from Tokenizing Khasi and Garo with Modern LLMs
dev.to·2d
Tokenizer Benchmarks
Building a 60,000 RPS Time-Series Data Ingestion Pipeline in Go
tsharma.bearblog.dev·2d
📮Persistent Queues
Parallelism Strategies in Deep Learning
afmck.in·8h
🔀SIMD Programming
Building Search for this Site – Search on a static site
alexleighton.com·1d
📋Tablegen
Benchmarking Document Parsing (and What Actually Matters)
unstructured.io·23h
🧠Semantic Parsing
Start using YesChat and it will save you a lot of time and money.
dev.to·3h
🔄Coroutines
[D] Mixture of Attention?
reddit.com·14h
🔄Subinterpreters
GitHub - vshakitskiy/how-to-otp: Learn how to work with OTP in Gleam!
github.com·1d
Gleam
PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting
arxiv.org·19h
💬Interactive REPLs
Concurrent Linguistic Error Detection (CLED): a New Methodology for Error Detection in Large Language Models
arxiv.org·5d
🧪Parser Testing
ML-based profiling of data skew and bottlenecks on Databricks
dev.to·13h
🔮Speculative Execution
Token Models as Statistical Simulations: A Different Take
medium.com·23h
🔍Tokenizers
LLM-Deflate: Extracting LLMs into Datasets
scalarlm.com·2d
🪜Recursive Descent
Containerization Without the Cloud: Running Docker Locally for Fun and Speed
dev.to·4h
🔗Redis Protocols
Achieving TB-Level Aggregate Bandwidth: How JuiceFS Optimized Distributed Cache Network
dev.to·13h
🌍HTTP Servers
Web Developer Travis McCracken on Why I Use Rust for Stateless Microservices
dev.to·10h
🔧API Design