✂️ Tokenization
Text Splitting, Word Boundaries, NLP Pipeline, Lexical Analysis
Faster Superword Tokenization
🌱 Stemming · arxiv.org · 1d
Tokenizer That Outperform Tiktoken with O200k_base
🌱 Stemming · o200k-tokenizer-70fe25.gitlab.io · 5d · Hacker News
Why Modern LLMs Prefer Subword Tokenization
🌱 Stemming · ai.plainenglish.io · 2d
Tsinghua's Multi-Agent AI Classroom, Anthropic's Context Engineering Playbook, and a 54 LLM-Architecture Gallery - 📚 The Tokenizer Edition #22
💬 Prompt Engineering · newsletter.artofsaience.com · 6d
Recursive Language Models - A Systematic Approach to Large-Scale Document Analysis – Part I.
💬 Natural Language Processing · constitutionaldiscourse.com · 2d
How to Generate Text in One Step
🔢 Kolmogorov Complexity · one-step-lm.github.io · 1d
Building GPT from Scratch
🤖 Transformer Architecture · medium.com · 1d
Launching on Product Hunt tomorrow: pay-per-use text AI with USDC micropayments (no subscription, 100 free credits)
🌱 Stemming · producthunt.com · 2d · DEV
MARS: Enabling Autoregressive Models Multi-Token Generation
🔗 RAG · arxiv.org · 5h
Geometric Properties of the Voronoi Tessellation in Latent Semantic Manifolds of Large Language Models
🔢 Kolmogorov Complexity · arxiv.org · 5h
Token-Efficient Multimodal Reasoning via Image Prompt Packaging
💬 Prompt Engineering · arxiv.org · 3d
Trivial Vocabulary Bans Improve LLM Reasoning More Than Deep Linguistic Constraints
🧠 LLM Reasoning · arxiv.org · 3d
How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models
🧠 LLM Reasoning · arxiv.org · 2d
Rethinking Token Prediction: Tree-Structured Diffusion Language Model
🧠 LLM Reasoning · arxiv.org · 2d
Short Data, Long Context: Distilling Positional Knowledge in Transformers
🤖 Transformer Architecture · arxiv.org · 1d
Beneath the Surface: Investigating LLMs' Capabilities for Communicating with Subtext
🧠 LLM Reasoning · arxiv.org · 1d
Multilingual Language Models Encode Script Over Linguistic Structure
🔢 Kolmogorov Complexity · arxiv.org · 1d
Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling
🔗 RAG · arxiv.org · 1d
One Model for All: Multi-Objective Controllable Language Models
🤖 Transformer Architecture · arxiv.org · 2d
The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models
🔬 Scientific Modeling · arxiv.org · 2d