Text Mining

Document Analysis, Pattern Discovery, Information Extraction, Natural Language Processing

MeshTok: Efficient Multi-Scale Tokenization for Scalable PDE Transformers

 🔤Character Classification  Content type: Academic
arxiv.org

Why REAL Finance sees infrastructure as the next phase of tokenized finance

 🔤Character Classification  Content type: News
thenextweb.com

Command injection in NLTK collocations via eval()

 🛡️CLI Security

The New Capital Stack: How Governments Are Reshaping Critical Minerals Finance

 🔤Character Classification
hackernoon.com

How LLMs Actually Work: A Friendly Map for Humans • oreoro

 📐Linear Algebra

A Taxonomy of Real-World Asset Tokenization for Blockchain-Based Financial Infrastructure

 🔤Character Classification  Content type: Academic
arxiv.org

DREAM: Dynamic Refinement of Early Assignment Mappings

 🔤Character Classification  Content type: Academic
arxiv.org

LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling

 🔤Character Classification  Content type: Academic
arxiv.org

Neural Field Tokenizations with Hierarchy and Spatial Locality Priors

 🧠Machine Learning  Content type: Academic
arxiv.org

CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding

 🔤Character Classification  Content type: Academic
arxiv.org

AdaTok: Self-Budgeting Image Tokenization with Quality-Preserving Dynamic Tokens

 🔤Character Classification  Content type: Academic
arxiv.org

ChannelTok: Efficient Flexible-Length Vision Tokenization

 🔤Character Classification  Content type: Academic
arxiv.org

Balancing Image Compression and Generation with Bootstrapped Tokenization

 🔤Character Classification  Content type: Academic
arxiv.org

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

 📄Document AI  Content type: Academic
arxiv.org

SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction

 📄Document AI  Content type: Academic
arxiv.org

Stress-testing medical large language models reveals latent safety pathology beyond benchmark accuracy

 📄Document AI  Content type: Academic
arxiv.org

LimiX-2M: Mitigating Low-Rank Collapse and Attention Bottlenecks in Tabular Foundation Models

 🔤Character Classification  Content type: Academic
arxiv.org

MeshWeaver: Sparse-Voxel-Guided Surface Weaving for Autoregressive Mesh Generation

 🔤Character Classification  Content type: Academic
arxiv.org

GOTabPFN: From Feature Ordering to Compact Tokenization for Tabular Foundation Models on High-Dimensional Data

 🗜️Graph Compression  Content type: Academic
arxiv.org

Sustainability by Design in Decentralized Autonomous Organizations: An Empirical Review of Governance, Innovation, and Institutional Design

 📚Document Clustering  Content type: Academic
arxiv.org
