⚙️ Data Engineering
ETL, Data Pipelines, Data Warehousing, Data Processing
The Considerate Data Modeler
🗄️ Databases · oranlooney.com · 6 days ago · Hacker News
“Whoever builds the most joyous product wins”: The agent war begins
⚙️ Compilers · thenewstack.io · 4 days ago
How to Build Data Pipelines That Resist Partition Drift
🗄️ Database Engines · hackernoon.com · 1 day ago
DuckLake Spec, pg_background 2.0, and pgsql_tweaks 1.0.3 Enhance Database Ecosystem
🗄️ Database Design · Content type: Blog · dev.to · 1 day ago · DEV
benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
💛 JavaScript · Content type: Code · github.com · 6 days ago · Hacker News
Why Your Kafka Pipeline Looks Fine in Staging but Breaks in Production
📡 Event-Driven Architecture · hackernoon.com · 2 days ago
Building a Production-Inspired CSV to PostgreSQL ETL Pipeline with Python
🗄️ Database Design · pub.towardsai.net · 6 days ago
Extract data from Databases into DuckLake
🗄️ Database Design · Content type: Blog · dev.to · 2 days ago · DEV
Your RAG System Might Be Confidently Wrong
🤖 LLMs · hackernoon.com · 2 days ago
Snowflake Summit 2026 : What Actually Shipped, and What It Means for the People Who Build on It
📡 Event-Driven Architecture · pub.towardsai.net · 3 days ago
From Data Quality Checks to Analytics-Ready Parquet with Python
🗄️ Database Design · Content type: Blog · dev.to · 1 day ago · DEV
Turso libSQL vs Cloudflare D1 for an Astro monorepo: the practical difference
🗄️ Database Design · Content type: Blog · dev.to · 1 day ago · DEV
go-intake v0.1.0: A Small Go Library for Messy Data Intake
📡 Event-Driven Architecture · Content type: Blog · dev.to · 1 day ago · DEV
The State of Apache Iceberg Catalogs in June 2026
🔷 Microservices · Content type: Blog · dev.to · 2 days ago · DEV
Data Engineer vs. Data Scientist: What's the Difference? (2026 Guide for Beginners)
📉 Data Science · Content type: Blog · dev.to · 5 days ago · DEV
COSS Weekly: Supabase achieves $10B valuation, DeepSeek eyes $7B funding round, Martin Scorsese joins Black Forest Labs, and more
🤖 LLMs · Content type: Blog · dev.to · 2 days ago · DEV
ETL Pipeline: Fetching Real-Time News Data with Python and Postgres
📉 Data Science · Content type: Blog · dev.to · 3 days ago · DEV
Five overlooked packages running my AI directory stack
📘 TypeScript · Content type: Blog · dev.to · 1 day ago · DEV
Strimzi: Create a simple Mutual TLS (mTLS) authentication
📡 Event-Driven Architecture · Content type: Blog · dev.to · 5 days ago · DEV
From Hours to Seconds: An AI-Powered Metadata Catalog for Unstructured Data on FSx for ONTAP
🔍 Vector Databases · Content type: Blog · dev.to · 2 days ago · DEV