Data Science Weekly – Issue 626
datascienceweekly.substack.com·2d
🏗data engineering
UNSEEN: Enhancing Dataset Pruning from a Generalization Perspective
arxiv.org·5d
🧭Vector Databases
Building an AWS-Based RAG Pipeline
dev.to·1d
🔄Feed Aggregation
Benchmarking KDB-X vs. QuestDB, ClickHouse, TimescaleDB and InfluxDB
kx.com·5d
🏁Benchmark Frameworks
Generate RAG evaluation datasets from a single prompt (1K to docs)
alexjacobs08.github.io·5d
🏗data engineering
Discovering physical laws with parallel symbolic enumeration
nature.com·1d
🔢NumPy
Apache Iceberg vs. Databricks – benchmarked
olake.io·3d
🏗data engineering
How to Build an Over-Engineered Retrieval System
towardsdatascience.com·4d
🔄Feed Aggregation
Hachi: An Image Search Engine
eagledot.xyz·3d
📇Indexing Strategies
The 5 FREE Must-Read Books for Every Data Scientist
kdnuggets.com·4d
🐍Scientific Python
Edition 3 – Journalism from the command line, part 1
buttondown.com·5d
📋CSV Processing
Taming data chaos: building AI-ready data platforms for the enterprise
blocksandfiles.com·2d
🏛️Lakehouse Architecture
K-anonymity, the parent of all privacy definitions
desfontain.es·5d
🔐Privacy Engineering
Show HN: We built an AI tool for working with LLM chat log datasets
hyperparam.app·3d
📊Column Stores
Metadata: How data about your data is optimal for AI
datasciencecentral.com·3d
🗂️Metadata Management
Building a Database from Scratch
stym06.github.io·5d
⚙️Database Internals
Google is a Leader in the 2025 Gartner® Magic Quadrant for Cloud Database Management Systems
cloud.google.com·1d
🏛️Lakehouse Architecture
Enhanced Waste Stream Characterization via Multi-Modal Data Fusion and Predictive Analytics
dev.to·1d
🧭Navigation Algorithms
Event Sourcing in Go: From Zero to Production
skoredin.pro·4d
🌊Stream Processing
People test Nano Banana with PDF paper to whiteboard. I did the exact opposite
quickchat.ai·9h
📓Jupyter