This is Part 1 of a two-part series covering the key features in PostgreSQL 18. In this post, we focus on performance enhancements: skip scan optimization for multicolumn indexes, enhanced EXPLAIN output, automatic removal of unnecessary self-joins, and several vacuum and autovacuum improvements that help keep your database running efficiently. Read more ›
We argue here that traditional network models, which are overwhelmingly based on the mathematical construct of a simple graph, are fundamentally insufficient for capturing the complexity of modern distributed systems. Such systems are characterized by heterogeneous agents with diverse capabilities, high-dimensional and multi-modal data streams, and intricate, context-dependent relationships that cannot be adequately described by a simple connect... Read more ›
How can cloud providers efficiently supply durable virtual block devices? Remote Direct Memory Access (RDMA) provides a way for servers in a cluster to share chunks of memory, but there still needs to be a protocol that operates on top of RDMA to provide the guarantees expected of a block device. The kernel's RDMA transport library (RTRS) provides a way to send messages via RDMA. I : Reliable Multicast over RTRS (RMR) and Block device over RMR (BRMR). These modules, which I am working on with... Read more ›
Managing data retention policies is one of the most common operational tasks in MySQL. Applications continuously generate transactional, audit, logging, telemetry, and event data. Over time, these tables can grow to billions of rows, causing: Larger backups Longer recovery times Reduced buffer pool efficiency Slower index maintenance Increased storage costs Degraded query performance To address … The post appeared first on <a href=" Read more ›
In this post, we show you how to use EXPLAIN plans to diagnose and improve query performance in Amazon Aurora DSQL. We introduce a three-layer filter model as a practical framework for understanding where your predicates are evaluated, and walk through the architecture differences that make Aurora DSQL plans unique, the anatomy of an EXPLAIN output, access method selection, and a step-by-step query improvement workflow. Read more ›
Generative recommendation is an emerging paradigm that has shown promise in industrial recommendation systems, aiming to predict users' next interactions from their historical behaviors. At the core of generative recommendation lies item tokenization, which bridges item semantics and recommendation models. However, existing methods often struggle to effectively organize and inject complex user-behavioral and item-semantic contexts into recommend... Read more ›
Vector databases typically manage metadata as flat scalar attributes, which limits their ability to express hierarchical directory semantics commonly used to organize code repositories, enterprise documents, and agent memories. As a result, directory-scoped retrieval and structural updates are often implemented as application-layer workarounds, making recursive scope resolution expensive and directory maintenance difficult to keep consistent. Th... Read more ›
Add fuzzy string matching to MySQL with VillageSQL. Learn to use trigrams for typos, Levenshtein distance for spell correction, and phonetic matching. Read more ›
Remote and disaggregated memory tiers expand the effective memory capacity of analytical database engines, but they also reshape the cost structure of out-of-memory query processing. When an operator spills beyond local DRAM, moving pages to remote memory incurs both data-transfer time and a fixed round-trip latency per transfer. Classical operator analyses and buffer-allocation heuristics primarily target disk spilling by minimizing total I/O v... Read more ›
Log-structured merge (LSM) trees attach an approximate-membership filter to every run and must split a fixed memory budget across them. The static optimum is known (Monkey); a large systems literature then makes the allocation adaptive, tracking shifting hotness online. We ask a prior question: when is that adaptivity worth its machinery? We give three analytical answers and validate them on synthetic sweeps, real Twitter production cache traces... Read more ›
In Part 1 of this series, we explored the performance enhancements in PostgreSQL 18, including skip scan optimization, enhanced EXPLAIN output, automatic self-join removal, and vacuum/autovacuum improvements. In this second part, we focus on security, monitoring, developer productivity, and logical replication enhancements that improve operational efficiency and the overall developer experience. Read more ›
In this post, we show you how to use Amazon CloudWatch Database Insights for lock analysis in Amazon Aurora PostgreSQL. You learn how to enable the feature, interpret lock tree visualizations, resolve common lock-related issues, and maintain optimal database performance. This lock tree analysis feature also applies to Amazon RDS for PostgreSQL. Read more ›
Security updates have been issued by AlmaLinux (hplip, kernel, kernel-rt, libpng12, libpng15, libxml2, libxslt, mysql:8.0, mysql:8.4, opencryptoki, openssl, postfix, postgresql:15, rsync, and webkit2gtk3), Debian (asterisk, atril, gsasl, and libreoffice), Fedora (ack, bird, chromium, firefox, ldns, librabbitmq, nextcloud, nss, openslide, perl-Protocol-HTTP2, tig, vorbis-tools, and xen), Mageia (coturn, log4cxx, and python-tornado), SUSE (389-ds, buildah, container-suseconnect, distribution, e... Read more ›
Database configuration tuning is critical for workload performance, but practical tuning on real deployments remains difficult. Existing automatic tuners mostly formulate tuning as iterative search over DBMS knob values. This formulation leads to high execution cost, prematurely narrows the configuration space, and leaves practical requirements insufficiently addressed: diagnosing runtime bottlenecks from system feedback, exploring OS-level reco... Read more ›
Security updates have been issued by AlmaLinux (.NET 9.0), Debian (apache2, chromium, jpeg-xl, librabbitmq, and openssl), Fedora (apptainer, bind9-next, chezmoi, chromium, collectd, composer, dnsdist, gh, python-django5, python-python-multipart, varnish, varnish-modules, vmod-querystring, vmod-uuid, weasyprint, and xorg-x11-server-Xwayland), Mageia (cups, expat, libpng, libssh, memcached, nghttp2, openimageio, packages, proftpd, and radare2), Oracle (.NET 10.0, .NET 8.0, .NET 9.0, and firefox... Read more ›
PLRTune: Importance Pre-Sampling and LLM-Guided Reinforcement Learning for Automatic Database Tuning
Configuration tuning is critical to database performance, yet automatic database tuning remains challenging due to high-dimensional knob spaces, substantial online tuning cost, unreliable textual hints derived from Large Language Models (LLMs) or community documents, and the difficulty of exploiting the remaining optimization room after initialization. Hence, we propose PLRTune, a staged database tuning system that leverages workload-specific do... Read more ›
Group commit amortizes the fixed cost of a durable log flush across many committing transactions; the release rule - a timer, a batch size, or an adaptive policy - is a classic tuning knob. The textbook theory is open-loop: for Poisson arrivals the optimal timer is the EOQ square-root rule, and the wait-or-flush decision is ski-rental 2-competitive. We ask when that tuning is worth its machinery, and show that in closed-loop OLTP it usually is n... Read more ›
Cardinality-estimation (CE) research ranks estimators by q-error, yet it is well known that q-error is an imperfect proxy for query-plan quality. We give a measurement-driven account of when it is a good proxy and when it is not, and why. Modeling plan selection as an argmin over a piecewise-linear cost landscape, we find that plan regret (the cost of the chosen plan relative to the optimal, under true cardinalities) is governed by plan-cost geo... Read more ›