Data Engineering

Apache Iceberg™ 1.11 Released: A Smarter REST Catalog, Production-Ready Encryption and the Road to v4

 📦Parquet  Content type: Blog
snowflake.com·

Apache Iceberg v4: The Current State, the Proposals, and Why They Matter

 🌐Open Source

Claude Code for Research: Preventing Hallucinations

 Query Engines  Content type: News  Content type: Blog

Databricks wants to kill the “email me a file” problem for AI agent skills

 📦Parquet
thenewstack.io·

New comment by mkolarek in "Ask HN: Who wants to be hired? (June 2026)"

 📦Parquet  Content type: PDF

LakeQA: An Exploratory QA Benchmark over a Million-Scale Data Lake

 Query Engines  Content type: Academic
arxiv.org·

Choosing the right workflow orchestration service for your use case: Amazon MWAA and AWS Step Functions

 📊OLAP  Content type: Blog
aws.amazon.com·

Azerbaijani Central Bank set to adopt data Lakehouse system in 2026

 📊OLAP
trend.az·

benseverndev-oss/goldenmatch: Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.

 Query Engines  Content type: Code
github.com··Hacker News

Real-time data replication to your data warehouse, self-serve

 📊OLAP

The Hidden Tax Killing Your ML Team’s Velocity – And the Architecture Decision That Fixes It

 🏹Apache Arrow  Content type: Blog
medium.com·

DuckDB Storage Engine for MariaDB. When the Sea Lion Learns to Quack.

 Query Engines
mariadb.org··Hacker News

Redis Data Integration in Redis Cloud is now GA in AWS

 Query Engines  Content type: Blog
redis.io·

Microsoft just shared the frontier data engineering secrets

 Query Engines
mail.bycloud.ai·

15 years of Software Center – A Look in the Mirror and over the Front Windshield

 🌐Open Source  Content type: Blog
metrics.blogg.gu.se·

The Considerate Data Modeler

 📊OLAP

Central Bank strengthens data governance for AI solutions

 📊OLAP  Content type: News
en.apa.az·

From Legacy Custom Logging to Native Structured Logging in Dataflow

 Query Engines  Content type: Blog
medium.com·

When Feature Importance Lies: Target Encoding at the Noise Floor

 Query Engines
flyback.ai··DEV

Gene dependency-informed inference of response to targeted cancer therapies

 Query Engines  Content type: Academic
nature.com·
