Scour
🔄 Data Pipelines
Airflow, orchestration, ingestion, batch processing, streaming
Scoured 345 posts in 11.1 ms
Apache Spark Real-Time Mode for Gaming: A Better Way to Do Real-Time Sessionization
🛠️ Data Engineering · Blog · databricks.com · 1 week ago
Central Bank strengthens data governance for AI solutions
🛠️ Data Engineering · News · en.apa.az · 1 day ago
Celebal Technologies raises debt funding from BlackSoil
🛠️ Data Engineering · News · theheadandtale.com · 15 hours ago
benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.
🐍 Python · Code · github.com · 6 days ago · Hacker News
Microservices Solved One Problem. Then They Created Another.
🛠️ Data Engineering · Blog · medium.com · 2 days ago
Piper: A Programmable Distributed Training System
🛠️ Data Engineering · Academic · arxiv.org · 16 hours ago
When Feature Importance Lies: Target Encoding at the Noise Floor
🛠️ Data Engineering · flyback.ai · 1 day ago · DEV
Scale. Speed. Trust: Three Imperatives for the AI Era
🛠️ Data Engineering · Blog · blogs.cisco.com · 3 hours ago
Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
⚡ Apache Spark · aws.amazon.com · 1 day ago
Icechunk Adopted by the National Weather Service: Earthmover Joins Booz Allen on NWS CIRRUS
🛠️ Data Engineering · Blog · earthmover.io · 6 days ago
AI Security Best Practices for Regulated Industries
🛠️ Data Engineering · orca.security · 1 day ago
Jedify raises $24M to help companies arm AI agents with context on their business
🛠️ Data Engineering · techcrunch.com · 6 hours ago
Iceberg Summit 2026: The Adoption Question Is Settled. Now What?
🛠️ Data Engineering · Blog · snowflake.com · 1 week ago
New comment by unkownnomad110 in "Ask HN: Who wants to be hired? (June 2026)"
🛠️ Data Engineering · Discussion · news.ycombinator.com · 2 days ago · Hacker News
AI Agents and the Fight for Customer Data
🛠️ Data Engineering · a16z.simplecast.com · 5 days ago
How Watershed’s AI Manages Emissions Data for Companies
🛠️ Data Engineering · News · aimagazine.com · 6 hours ago
New comment by aldoakhanov in "Ask HN: Who wants to be hired? (June 2026)"
🐍 Python · castlefootyai.com · 5 days ago · Hacker News
ocrmypdf 17.5.0 documentation
🔧 dbt · Reference · ocrmypdf.readthedocs.io · 14 hours ago
IPO-bound Databricks reportedly eyes $175B valuation after hitting $5.4B revenue run rate — TFN
🛠️ Data Engineering · techfundingnews.com · 1 day ago
Why Snowflake Matters Now More Than Ever
🛠️ Data Engineering · News · Blog · clouddb.substack.com · 2 days ago · Substack