Scour
🔧 Data Engineering
data pipeline, ETL, data lakehouse, batch processing
Scoured 229 posts in 8.2 ms
New comment by mkolarek in "Ask HN: Who wants to be hired? (June 2026)"
🐘 PostgreSQL · Content type: PDF · markokolarek.com · 4 days ago · Hacker News
Introducing Flights: Agent-Native Ingest in MotherDuck
📊 OLAP · Content type: Blog · motherduck.com · 1 day ago
CSU Student of Distinction: Anthony Arthur
⚙️ Systems Programming · Content type: Academic · csuohio.edu · 22 hours ago
Central Bank strengthens data governance for AI solutions
⚡ Query Optimization · Content type: News · en.apa.az · 1 day ago
Embedding pipelines are the new ETL
📊 OLAP · Content type: Blog · infoworld.com · 6 days ago
Integration Patterns: How To Choose for Your Architecture
📊 OLAP · Content type: Blog · blog.n8n.io · 2 days ago
Databricks wants to kill the “email me a file” problem for AI agent skills
📊 OLAP · thenewstack.io · 19 hours ago
Redis Data Integration in Redis Cloud is now GA in AWS
📊 OLAP · Content type: Blog · redis.io · 6 days ago
Towards Post-Quantum Secure Pharmacovigilance with ML-KEM and ML-DSA
🌐 Distributed Systems · Content type: Academic · arxiv.org · 2 days ago
The Hidden Tax Killing Your ML Team’s Velocity – And the Architecture Decision That Fixes It
📊 OLAP · Content type: Blog · medium.com · 5 days ago
OpenGov and Snowflake build a knowledge graph to unify government data and AI
📊 OLAP · Content type: Video · siliconangle.com · 5 days ago
Microsoft just shared the frontier data engineering secrets
📊 OLAP · mail.bycloud.ai · 1 day ago
Connections, Roles, and Warehouses: Getting CoCo Desktop Production-Ready from Day One
📊 OLAP · Content type: Blog · towardsai.net · 3 days ago
Deploy ADX Business on DigitalOcean
📊 OLAP · digitalocean.com · 1 day ago
benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
🐘 PostgreSQL · Content type: Code · github.com · 6 days ago · Hacker News
Announcing general availability of Apache Spark 4.0 on Amazon EMR
📋 Data Formats · Content type: Blog · aws.amazon.com · 1 day ago
From Legacy Custom Logging to Native Structured Logging in Dataflow
⚙️ Systems Programming · Content type: Blog · medium.com · 7 hours ago
When Feature Importance Lies: Target Encoding at the Noise Floor
⚙️ Systems Programming · flyback.ai · 2 days ago · DEV
New comment by aldoakhanov in "Ask HN: Who wants to be hired? (June 2026)"
🐘 PostgreSQL · castlefootyai.com · 5 days ago · Hacker News
Streaming and Batch Data Architectures with Microsoft Fabric to Azure Databricks
📋 Data Formats · techcommunity.microsoft.com · 1 day ago