Erdwig's Feed · Scour

ZSTD --auto: it picks the optimal compression level for you

zstd with --auto: picks the optimal compression level for you. No more guessing between 1-22. Tested on 320 files across 8 types. 0 corruption, 98% beat default. Drop-in for facebook/zstd. Uses fra...

Covers 2 stories including GCC GNU website is down

Discussed on Hacker News

🗂️Columnar Storage Corrode Rust Consulting Blog·

ClickHouse

There’s a particular kind of pressure that comes with maintaining software at the very bottom of someone else’s stack. ClickHouse lives in exactly that spot: roughly 1.5 million lines of mostly C++ and tens of millions of tests every single day. So what happens when you start introducing Rust into a codebase like that? Not as a rewrite, but linked into a C++ server with a CMake build process that has to be reproducible and FIPS compliant? In today’s episode, we get into the messy, interesting...

Covers 5 stories including Fast Open-Source OLAP DBMS

Discussed on Hacker News

🧮Hindley-Milner discuss.ocaml.org·

OCaml 5.5.0

We have the pleasure of celebrating the birthday of Blaise Pascal by announcing the release of OCaml version 5.5.0. Some of the highlights in OCaml 5.5.0 are: Module-dependent Functions Modules can now be used as function arguments in a form of lightweight functors. For instance, we can define a function for printing a map generated by the Map.Make functor: let pp_map (module M: Map.S) pp_key pp_v ppf set = if M.is_empty set then Format.fprintf ppf "ø" else let pp_sep ppf () = Fo...

Discussed on Hacker News and Lobsters

💻Programming Languages abm-77.dev·

Adventures in Open-Source: the lit profraw race

How a filename collision in lit's --per-test-coverage produced malformed profraws that crashed llvm-profdata, and the two upstream PRs that fix it.

🔗CRDT Composition arxiv.org·

A Composable CRDT Layer for Byzantine-Resilient Deterministic Reconstruction

Conflict-free Replicated Data Types (CRDTs) ensure Strong Eventual Consistency without coordination, but typically assume benign participants and rely on validation or exclusion to handle Byzantine behavior. We address this problem through deterministic state reconstruction: rather than deciding which updates are admissible, all accepted updates are incorporated, while only a subset contributes to the reconstructed state. We instantiate this app...

🧮Algorithms leetcode.com

First exposure to Dynamic programming

I tried solving LeetCode problem #10 (Regular Expression Matching) for fun and ended up spending hours on it. I took a naive greedy string-construction approach to check whether the string s matched the pattern p. That worked for simple cases such as p = "a*b", s = "aaab" (output: True), but failed for cases such as p = "ab*a*c*a", s = "aaa", where the output should be True. Why greedy string construction failed: a* is not a single choice; it can match "", "a", "aa", "aaa", and so on. To determine ...
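
The teaser above stops before the author's own solution, so the following is only a minimal sketch of the standard top-down dynamic programming approach to this problem, not necessarily what the post does. The names `is_match` and `match_from` are illustrative, and the handling of `.` as a single-character wildcard comes from the LeetCode problem statement rather than from the text above. Instead of committing to one length for `a*`, each state tries both "use zero copies" and "consume one more character", and memoization keeps the number of distinct states at most (len(s) + 1) × (len(p) + 1).

```python
from functools import lru_cache


def is_match(s: str, p: str) -> bool:
    """Return True if pattern p matches the whole of s ('.' and '*' supported)."""

    @lru_cache(maxsize=None)
    def match_from(i: int, j: int) -> bool:
        # Pattern exhausted: succeed only if the string is exhausted too.
        if j == len(p):
            return i == len(s)

        # Does the current pattern character match the current string character?
        first = i < len(s) and p[j] in (s[i], ".")

        if j + 1 < len(p) and p[j + 1] == "*":
            # Either skip "x*" entirely (zero copies), or consume one
            # character and stay on the same "x*" (one more copy).
            return match_from(i, j + 2) or (first and match_from(i + 1, j))

        # Plain character or '.': both sides advance by one.
        return first and match_from(i + 1, j + 1)

    return match_from(0, 0)
```

With this sketch, `is_match("aaa", "ab*a*c*a")` explores the branch where `b*` and `c*` match nothing and `a*` matches a single "a", which a greedy single pass would miss.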

Discussed on DEV

🏷️Named Entity Recognition medium.com

The AI Model That Hijacks the Computer That Loads It

Two models on Hugging Face looked ordinary — but the instant you loaded one, it opened a backdoor to your machine. The trick: a file…

🏗Datastructures donraab.medium.com·

Donald Raab: More Features, Less Waste

The lesser known tagline of Eclipse Collections. Eclipse Collections is an open source collections library for Java that has been in development since 2004. There are two things Eclipse Collections works hard not to waste: memory and time. Much of the feature development in Eclipse Collections focuses on the optimization of these two things. Memory: there are two kinds of memory that Eclipse Collections optimizes for, data structure memory and algorithm memory. If you look closely at the image above, you will see some old code examples in...

🌳B-Trees abderahmanetoumi.medium.com·

Building a B+Tree in Rust: What I Got Wrong First Part

I’m building a B+Tree as part of a database project, style but in Rust, inspired by QuillSQL and the Bustub by CMU. This post was a set of…

🎲Probabilistic Inference arxiv.org·

Stop the Sampler! Classifier-Based Adaptive Stopping for Sampling Kernels

Sampling from complex, unnormalized probability densities is a fundamental challenge in Bayesian inference and probabilistic modeling. While Markov chain Monte Carlo (MCMC) methods provide asymptotic guarantees, they often suffer from slow mixing and high computational costs due to fixed or manually tuned trajectory lengths. In this work, we propose a novel framework that treats trajectory termination as a learnable component of the sampling dyn...

🗄Databases samtsql.com·

Try AI Operators on PostgreSQL

Connect PostgreSQL and run SQL with built-in AI operators through samtSQL.

Discussed on Hacker News

⚡Quantization GitHub·

ultralytics/ultralytics v8.4.74

#Python 🌟 Summary Ultralytics v8.4.74 focuses on more reliable model export and quantization 🔧—especially fixing INT8 export stability on affected GPU setups and preventing flaky OpenVINO export failures on NMS-enabled models. 📊 Key Changes 🚨 INT8 calibration now always runs on CPU during ModelOpt export In the most important change from PR unconditionally on the CPU execution provider. This replaces the earlier GPU/RTX detection logic, which was found to be unreliable in real-world use. The ...

🧩Constraint Programming John D. Cook·

All pieces on a 6 by 5 board

I’ve written a couple posts lately on getting an LLM to generate code to solve chess problems. The first used Claude to generate Prolog and the second used ChatGPT to generate Prolog. This post will use Claude to generate Z3/Python code. The puzzle is one I’ve written about before: Place all the pieces—king, queen, two […]

🗂️Hash Tables Andrey Listopadov·

Better slopes in AABB collision systems

Earlier this year I wrote about how to implement slopes in AABB collision resolution using bump.lua. The resulting system worked, but was a bit hard to use in-game and had a few issues, so I wouldn’t consider it a viable solution. The main issue with the old approach was the use of the cross collision response. It allowed the object to move through the slope as if it wasn’t there at all, and then the update loop corrected the y position after the fact. This alone makes writing code for the game a lot more...

⚡Effect Systems SoraNews24 —Japan News—·

Japanese ninja certification exam attracts 131 candidates from Japan and abroad

Shadow warrior test includes a written exam, with marks given for shuriken throwing and ninja attire. On 14 June, 131 aspiring modern-day ninjas descended on Koka City in Shiga Prefecture to test their ninja prowess by taking a special ninja certification test. Known as the Koka-ryu Ninja Certification, with “Koka-ryu” meaning “Koka School”, the exam […]

🌳Trie arxiv.org·

AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models

Reinforcement learning (RL)-based post-training for large language models (LLMs) is computationally expensive, as it generates many rollout sequences that frequently share long token prefixes. Existing RL frameworks usually process these sequences independently during policy training, i.e., repeatedly recomputing identical prefixes in both the forward and backward passes of policy gradient computation, leading to substantial inefficiencies i...

🗜️Compression Algorithms mattmahoney.net·

Data Compression Explained

Matt Mahoney

Covers 2 stories including TIL that Occam’s razor isn’t about the “simplest” idea, but about choosing the explanation that adds the fewest new assumptions or speculative entities.

Covered by ClickHouse Blog

Discussed on Hacker News

🏗Datastructures medium.com

Understanding Data Structures by Building a Contact Book in Python

Data structures sound scary. They are not. Let this simple project show you exactly what they are and why they matter.

🧮Algorithms tiki.li·

What makes blqsort faster than almost any other Quicksort around – with C and C++ interfaces

This blog post supplements and corrects this post in a few respects: Fast Branchless Quicksort.

Discussed on r/programming

🔀CRDTs GitHub·

Show HN: Use math to track distributed progress without a central coordinator

Calculate distributed progress without a central leader. You use this pure mathematical primitive to merge partially ordered timestamps across your network. Its lattice algebra guarantees your work...

Discussed on Hacker News