Tokenization

Time Series as Language: A Universal Tokenizer for General-Purpose Time Series Foundation Models

 🪟Context Windows  Content type: Academic
arxiv.org·

OpenCV 5.0 Computer Vision Library Released with Rewritten DNN Engine

 🐍Python
linuxiac.com·

TilelliLab/atome-lm: A ternary, zero-heap tiny language model that runs inside a $2 microcontroller — bit-exact Python <-> C99 <-> Cortex-M3 (QEMU) parity. Apache-2.0.

 📶ESP32  Content type: Code
github.com··r/LLM

SIDInspector: A Mapping-First Diagnostic Resource for Semantic-ID Tokenizers

 🪟Context Windows  Content type: Academic
arxiv.org·

Benchmarking dots.tts on Strix Halo

 🔬Deep Learning
sleepingrobots.com·

From 2D Grids to 1D Tokens: Reforming Shared Representations for Multimodal Image Fusion

 🔥PyTorch  Content type: Academic
arxiv.org·

OpenCV Introduces New DNN Inference Engine

 🤖Machine Learning
i-programmer.info·

Run an Apache Airflow DAG with Docker Compose and PostgreSQL

 🔍Information Retrieval
pyimagesearch.com·

UniDexTok: A Unified Dexterous Hand Tokenizer from Real Data

 💬Natural Language Processing  Content type: Academic
arxiv.org·

inflightsec/agent-vault-proxy: Just-in-time API keys for AI agents - and any other process you route through it: the caller only ever sees a placeholder.

 🏠Self-hosting  Content type: Code
github.com··Hacker News

Part 3

 💬Natural Language Processing  Content type: Blog
modular.com·

Show HN: SupXML, modern memory-safe XML parser replacement for libxml2

 📡RSS
supso.org··Hacker News

ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

 🎮Reinforcement Learning  Content type: Academic
arxiv.org·

Open source building blocks for computational design. Est. 2006

 🕸️Knowledge Graphs
thi.ng··Hacker News

SniperRavan/Trace: Ambient AI usage tracker extension for ChatGPT, Claude, Gemini, Grok, and Perplexity.

 🪟Context Windows  Content type: Code

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

 🤖LLM  Content type: Academic
arxiv.org·
Less-relevant results

Usage drill-down instead of loading the whole tool surface

 🪨Obsidian  Content type: Blog
medium.com·

ajayr4j/pgtoken: PostgreSQL extension for compact, codebook-compressed LLM token ID storage. Encode once, retrieve forever. O(1) token count without decoding.

 💬Natural Language Processing  Content type: Code
github.com··r/PostgreSQL

Operationalizing Linguistic Methods through Prompt-Engineering Skills: An Automatic Chinese Web Neologism Detection Pipeline

 🤖LLM  Content type: Academic
arxiv.org·

Five labs, five minds: building a multi-model finance drama on small models

 🤖Data science  Content type: Blog
huggingface.co·