Scour
Data Engineering
data pipeline, ETL, data warehouse, Apache Spark
Scoured 552 posts in 8.1 ms
benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
🐍 Python · Content type: Code · github.com · 5 days ago · Hacker News
Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
💬 LLMs · aws.amazon.com · 18 hours ago
FOCUS specification eyes AI token economics as AI billing complexity hits a new frontier
⚙️ MLOps · siliconangle.com · 1 day ago
New Airflow Previews a New Design Language for Chrysler
💬 Prompt Engineering · hagerty.com · 4 days ago
I built a NAS with enterprise SAS drives, and the hidden costs nearly matched new SATA drives
📈 Business News · xda-developers.com · 2 hours ago
Central Bank strengthens data governance for AI solutions
⚙️ MLOps · Content type: News · en.apa.az · 1 day ago
Can Snowflake users get Photon-like Spark performance with Quanton?
💬 LLMs · quanton.dev · 6 days ago · Hacker News
Real Estate Lifecycle Analysis with BigQuery SQL Graph: Graph Modeling Beyond LLM Record Linkage
⚙️ MLOps · Content type: Blog · medium.com · 6 days ago
When Feature Importance Lies: Target Encoding at the Noise Floor
💬 LLMs · flyback.ai · 1 day ago · DEV
This Is the Sub-$40,000 SUV That’s Supposed to Save Chrysler
💬 LLMs · thedrive.com · 5 days ago
New comment by unkownnomad110 in "Ask HN: Who wants to be hired? (June 2026)"
💬 Prompt Engineering · Content type: Discussion · news.ycombinator.com · 2 days ago · Hacker News
Spreadsheet native data platform
🐍 Python · getarkx.com · 23 hours ago · r/SideProject
New comment by wbazant in "Ask HN: Who wants to be hired? (June 2026)"
⚙️ MLOps · Content type: Discussion · news.ycombinator.com · 6 days ago · Hacker News
The Personal Cooling Device That Blows Cold Air Rather Than Just Moving It - Yanko Design
🎨 Hobbies · yankodesign.com · 1 day ago
Embedding pipelines are the new ETL
💬 LLMs · Content type: Blog · infoworld.com · 5 days ago
10 MCP servers to connect LLMs with databases
⚙️ MLOps · infoworld.com · 2 days ago
Announcing the winners of the 2026 Databricks Customer Awards
💬 LLMs · Content type: Blog · databricks.com · 20 hours ago
ClickHouse Agents: Claude-powered agentic analytics, now in public beta
⚙️ MLOps · Content type: Blog · clickhouse.com · 21 hours ago
JSignPdf
🐍 Python · flathub.org · 4 days ago
AI Security Best Practices for Regulated Industries
⚙️ MLOps · orca.security · 1 day ago