Zilliz has produced a Milvus Vector Lakebase FAQ to help position its vector database and vector lak ... Read more ›
Everything is going great. Your application launches. You have 10,000 users. Then 100,000 users. Then 1,000,000 users. Life is good. Until one day your database becomes the bottleneck. Queries slow down. CPU usage spikes. Storage fills up. And your single database server starts crying for help. At this point, many engineers discover a concept called database sharding, a technique used by some of the largest systems on the internet. Index: The Day One Database Stops Scaling; What Is Database Sha... Read more ›
A short learning path from a weekend project: I indexed my personal markdown notes (~800 chunks), tried a few local embedding models, stored the same vectors in four different backends, and wired up simple RAG. Not a production guide; just the basics, with honest results from a corpus small enough to reason about. The idea, without the jargon pile: keyword search looks for shared words. Vector search converts text into a list of numbers (an embedding), treats that list as a point in space, an... Read more ›
Consistent Hashing Explained: The Smart Way to Scale Distributed Systems. Most system design interviews eventually reach a point where the interviewer asks: "What happens when a new server is added?" At first, scaling sounds easy: just add another machine. But in distributed systems, adding or removing servers can cause massive data movement, cache misses, and performance degradation if data distribution isn't handled correctly. This is exactly the problem Consistent Hashing solves. It is one o... Read more ›
I switched between Cursor and GitHub Copilot as my only AI coding assistant for 30 days each. Here's the honest comparison. My setup: Stack: TypeScript, React, Node.js, PostgreSQL. Project: SaaS app with ~15K lines of code. Measured: completions accepted, time saved, errors introduced. Bottom line first: use Cursor if you want the most powerful AI coding experience and can pay $20/mo; use Copilot if you're on JetBrains or need enterprise features at $10/mo. Round 1: Autocomplete Quality. Both use fro... Read more ›
Agents are only as intelligent as the context they can reason over. Today, that context is scattered across data lakes, data warehouses, lakehouses, databases, and streams, and in institutional knowledge that has never been written down. You want to trust the decisions made by your AI agents, but that can't happen until agents have context. Imagine what becomes possible when we give agents a safe way to access the context they need to deliver trusted decisions. This is why at the AWS Summit N... Read more ›
Time-series data is one of the most common types of data generated by modern applications. Every log entry, API request, metric, transaction, sensor reading, or user interaction is recorded with a timestamp, making time the primary dimension for analysis. As organizations collect billions of these records, efficiently storing and querying them becomes increasingly challenging. This is where ClickHouse® excels. Although ClickHouse is not a dedicated time-series database, its columnar storage a... Read more ›
Every git user eventually has that moment. The terminal returns. The working directory looks wrong. You type git log and the last two hours of work are simply not there. Usually, this is not data loss. It is a solvable problem, and git reflog is what solves it. What reflog actually is: git reflog records every position HEAD has pointed to on your machine. Every commit, checkout, merge, rebase, and reset leaves an entry. Those entries stick around for about 90 days (30 for unreachable commits).... Read more ›
When you create an AWS Lambda function, you choose the runtime that Lambda will use to run your code. This includes the base language version and supporting libraries. Lambda runtimes follow a published deprecation schedule. This means that you must periodically upgrade your function’s runtime. Running on a deprecated runtime means potential security exposure, loss […] Read more ›
DOVER, DELAWARE, USA - June 17, 2026 — The Rust Foundation, the nonprofit steward of the Rust programming language, today announced that OpenAI, a leading AI research and deployment company, has joined the organization as a Platinum Member and will contribute a total of $600,000 through the Rust Foundation, including… Read more ›
If your team is using Dagster Cloud's Solo or Starter tiers, the May 2026 pricing update probably gave you a bit of a shock. By removing the monthly credit allowance and charging for each step-executed asset materialization, workspaces that used to cost $10–$100 a month suddenly spiked into the hundreds or even over a thousand dollars. At Narev, we love the developer experience of Dagster, but our 24/7 schedule was driving up our bill to the point where self-hosting Airflow was starting to lo... Read more ›
In this article, we’ll build time-series machine learning models in Python using sktime and explore its core data structures for forecasting workflows. Read more ›
Local emulator for Google BigQuery. DuckDB-backed, SQLGlot-powered. Drop-in replacement for the real service in dev, CI, and offline replicas. - jjviscomi/bqemulator Read more ›
I created NewsGraphRAG as a personal deep-dive to explore the boundaries of hybrid local retrieval systems. Built entirely for free and running fully locally on your machine, this project demonstrates how graph databases unlock multi-hop reasoning that traditional flat vector databases simply can't handle. By combining spaCy and Ollama (llama3.2) for a two-stage NER pipeline and using Neo4j's built-in vector index, the system extracts interconnected entities from news articles and successfull... Read more ›
Going over an example rebasing a pull request stacked on another with git rebase --onto Read more ›
Secure Credential Management for Automated Deployment Pipelines. In modern DevOps environments, automated deployment pipelines are the backbone of continuous delivery. These pipelines often require access to sensitive credentials: API keys, database passwords, SSH keys, cloud provider tokens, and more. Mishandling these secrets can lead to catastrophic security breaches. This post explores a pragmatic approach to credential management using symmetric encryption with Python's cryptography.fernet... Read more ›
I tried solving LeetCode problem #10 (Regular Expression Matching) for fun and ended up spending hours on it. Problem link: I took a naive greedy string-construction approach to check whether the string s matched the pattern p. That worked for simple cases such as p = "a*b", s = "aaab" (output: True), but failed in cases such as p = "ab*a*c*a", s = "aaa", where the output should be True. Why greedy string construction failed: the flaw was that a* is not a single choice; it can match "", "a", "aa", "aaa", and so on. To determine ... Read more ›
Someone sends you a CSV. Then a folder of CSVs. Then a CSV that's actually tab-separated but named .csv, with a stray header row and a column that's a number on most rows and the string N/A on the rest. For years my answer to "can you pull a quick number out of this?" was a throwaway Python script. Read it in, fight pandas about dtypes, groupby, print, delete the script, forget everything, repeat next week. It worked. It was also slow and I never kept any of it. These days I just point SQL at... Read more ›
The MCP Gateway Registry is an open-source (Apache 2.0 License) project from the Agentic Community that provides a single, governed control plane for every AI asset in an organization: MCP servers, AI agents, skills, and custom assets. It is both a gateway (one entry point that routes client reques Read more ›
Connect PostgreSQL and run SQL with built-in AI operators through samtSQL. Read more ›