🔧 Data Engineering
data pipelines, ETL, data lakes, Apache Spark
Scoured 274 posts in 8.3 ms
FOCUS specification eyes AI token economics as AI billing complexity hits a new frontier
🔭 Observability · siliconangle.com · 1 day ago
New comment by mkolarek in "Ask HN: Who wants to be hired? (June 2026)"
📊 Query Optimization · Content type: PDF · markokolarek.com · 3 days ago · Hacker News
Central Bank strengthens data governance for AI solutions
🏎️ ClickHouse · Content type: News · en.apa.az · 1 day ago
Spreadsheet native data platform
🏎️ ClickHouse · getarkx.com · 22 hours ago · r/SideProject
What Went Wrong with Data Lakes? A 15-Year Reality Check from the Field
🏎️ ClickHouse · Content type: Academic · arxiv.org · 1 day ago
Embedding pipelines are the new ETL
🗃️ Vector Databases · Content type: Blog · infoworld.com · 5 days ago
Automating Real-time Data Pipelines: Deploying Pub/Sub to BigQuery with Dataflow Custom Template…
🏎️ ClickHouse · Content type: Blog · medium.com · 6 days ago
New comment by unkownnomad110 in "Ask HN: Who wants to be hired? (June 2026)"
📊 Query Optimization · Content type: Discussion · news.ycombinator.com · 2 days ago · Hacker News
Cloudian closes gap between enterprise AI ambitions and messy production deployments
🗃️ Vector Databases · Content type: News · blocksandfiles.com · 23 hours ago
15 years of Software Center – A Look in the Mirror and over the Front Windshield
🏛️ Software Architecture · Content type: Blog · metrics.blogg.gu.se · 5 hours ago
Integration Patterns: How To Choose for Your Architecture
🏛️ Software Architecture · Content type: Blog · blog.n8n.io · 1 day ago
Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
📊 Query Optimization · aws.amazon.com · 18 hours ago
New comment by wbazant in "Ask HN: Who wants to be hired? (June 2026)"
🗄️ SQLite · Content type: Discussion · news.ycombinator.com · 6 days ago · Hacker News
When Feature Importance Lies: Target Encoding at the Noise Floor
🔭 Observability · flyback.ai · 1 day ago · DEV
dltHub Named 2026 Snowflake Startup Program Product Partner of the Year
📊 Query Optimization · prnewswire.co.uk · 6 days ago
AI Security Best Practices for Regulated Industries
🔭 Observability · orca.security · 1 day ago
The Hidden Tax Killing Your ML Team’s Velocity – And the Architecture Decision That Fixes It
📈 Performance · Content type: Blog · medium.com · 4 days ago
Archiving Years of Dataverse Audit History
💾 Storage Systems · techcommunity.microsoft.com · 1 day ago
benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
🐧 Linux · Content type: Code · github.com · 5 days ago · Hacker News
Announcing Spark Connect on Amazon EMR Serverless: Interactive PySpark development, anywhere
🔬 CPU Architecture · Content type: Blog · aws.amazon.com · 20 hours ago