Scour
Compression Benchmarking

Algorithm Comparison, Speed vs Ratio, Corpus Testing, Performance Metrics

Machine Learning Fundamentals: accuracy with python
dev.to · 1d
Discuss: DEV
Observatory Systems
Ask HN: Feedback on "QSS" – A Quantized Vector Search Engine in C
news.ycombinator.com · 1d
Discuss: Hacker News
Vector Databases
SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
arxiv.org · 1d
Modern Compression
The one-more-re-nightmare compiler (2021)
applied-langua.ge · 1d
Discuss: Lobsters, r/programming
RegEx Engines
The Bitter Lesson is coming for Tokenization
lucalp.dev · 1d
Discuss: Lobsters, Hacker News, r/programming
Monadic Parsing
Portable Network Graphics (PNG) Specification (Third Edition)
w3.org · 21h
Discuss: Hacker News
WebP Analysis
Re-Evaluating Code LLM Benchmarks Under Semantic Mutation
arxiv.org · 1d
Code Metrics
New: Improve Apache Iceberg query performance in Amazon S3 with sort and z-order compaction
aws.amazon.com · 21h
Burrows-Wheeler
Automattic/harper: Offline, privacy-first grammar checker. Fast, open-source, Rust-powered
github.com · 1d
Concrete Syntax
Contextualizing SUTRA: Advancements in Multilingual & Efficient LLMs
hackernoon.com · 2h
Local LLMs
Data Compression with Relative Entropy Coding
arxiv.org · 2d
Compression Mathematics
Greedy Is Good. Less Greedy May Be Better
gojiberries.io · 17h
Discuss: Hacker News
Kolmogorov Complexity
Using an LLM for query planning in RAG –> 40% better answer relevance
techcommunity.microsoft.com · 22h
Discuss: Hacker News
Information Retrieval
The collective waste caused by poor documentation
shanrauf.com · 16h
Discuss: Hacker News
Deflate
The more LLMs think, the worse they translate
nuenki.app · 6h
Discuss: Hacker News
Local LLMs
ByteSpan: Information-Driven Subword Tokenisation
arxiv.org · 1d
Binary Linguistics
Floating-Point Data Transformation for Lossless Compression
arxiv.org · 1d
Streaming Compression
[R] [ClsToken, AvgPool] can be a poor choice for transformer embedding models
reddit.com · 2d
Discuss: r/MachineLearning
Streaming Compression
MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Hate Speech Multi-hop Explanation
arxiv.org · 14h
Document Grammar
Why Your Next LLM Might Not Have A Tokenizer
towardsdatascience.com · 22h
Grammar Induction