Scour
Data Engineering
data pipelines, ETL, data warehouse, dbt, Apache Spark
Scoured 261 posts in 7.8 ms

OpenGov and Snowflake build a knowledge graph to unify government data and AI
Data Pipelines · Video · siliconangle.com · 5 days ago

Microsoft just shared the frontier data engineering secrets
Data Pipelines · mail.bycloud.ai · 1 day ago

Central Bank strengthens data governance for AI solutions
Data Pipelines · News · en.apa.az · 1 day ago

Redis Data Integration in Redis Cloud is now GA in AWS
Data Pipelines · Blog · redis.io · 6 days ago

When Feature Importance Lies: Target Encoding at the Noise Floor
Data Pipelines · flyback.ai · 1 day ago · DEV

Celebal Technologies raises debt funding from BlackSoil
Data Pipelines · News · theheadandtale.com · 16 hours ago

benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
Python · Code · github.com · 6 days ago · Hacker News

Azerbaijani Central Bank set to adopt data Lakehouse system in 2026
Data Pipelines · trend.az · 1 day ago

The Hidden Tax Killing Your ML Team's Velocity - And the Architecture Decision That Fixes It
Data Pipelines · Blog · medium.com · 5 days ago

IPO-bound Databricks reportedly eyes $175B valuation after hitting $5.4B revenue run rate - TFN
Data Pipelines · techfundingnews.com · 1 day ago

Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
Apache Spark · aws.amazon.com · 1 day ago

Operationalizing Property-Based Testing for Data-Intensive Scalable Computing Systems
Data Pipelines · Academic · arxiv.org · 17 hours ago

AI Agents and the Fight for Customer Data
Data Pipelines · a16z.simplecast.com · 5 days ago

AI Security Best Practices for Regulated Industries
Data Pipelines · orca.security · 1 day ago

DuckDB Storage Engine for MariaDB. When the Sea Lion Learns to Quack.
Data Pipelines · mariadb.org · 1 day ago · Hacker News

The Considerate Data Modeler
Data Pipelines · oranlooney.com · 6 days ago · Hacker News

New comment by thaaff in "Ask HN: Who wants to be hired? (June 2026)"
Data Pipelines · Discussion · news.ycombinator.com · 6 days ago · Hacker News

Storage Insights datasets: Enabling org-wide operational discovery with activity insights
Data Pipelines · Blog · cloud.google.com · 1 day ago

New comment by aldoakhanov in "Ask HN: Who wants to be hired? (June 2026)"
Python · castlefootyai.com · 5 days ago · Hacker News

Scaling Zero Copy from 1 Trillion to 120 Trillion Rows with File Federation
Data Pipelines · engineering.salesforce.com · 2 days ago