Dataset Curation

Your ML Model Is Only as Good as the Dataset You Give It. Most Organisations Give It the Wrong One.

 🤖Large Language Model  Content type: Blog
medium.com

Phase prediction in high-entropy alloys through uncertainty sampling and symbolic classification-based parameter discovery

 🤖AI  Content type: Academic
nature.com

Optimize blueprint extraction accuracy in Amazon Bedrock Data Automation

 🤖Large Language Model  Content type: Blog
aws.amazon.com

Tumour evolution as ground truth for cancer whole-genome sequencing

 👁️Computer Vision  Content type: Academic
biorxiv.org

From Simulation to Real-World: An In-Field 6D Pose Dataset and Baseline for Robotic Strawberry Harvesting

 👁️Computer Vision  Content type: Academic
arxiv.org

I built an open-source platform for ML benchmarks and leaderboards

 🤖Large Language Model

aeriesec/orgforge: Synthetic corporate dataset generator for AI agent evaluation.

 🤖AI  Content type: Code
github.com · Hacker News

Ask HN: What has been the fate of code review?

 🤖AI  Content type: Discussion

Law Professors Prefer AI over Peer Answers

 🤖Large Language Model  Content type: Academic

AI Is Already Giving Medical Conclusions. Are They Any Good?

 🤖AI  Content type: Academic  Content type: Blog
blog.citp.princeton.edu

How LLMs are Actually Trained

 🤖Large Language Model  Content type: News  Content type: Blog
blog.algomaster.io

How AWS DevOps Agent evaluates telemetry tools for agentic readiness

 🤖AI  Content type: Blog
bronto.io · Hacker News

Sports and Spelling: The Benefits of Combining Physical Activity with English Learning

 🤖Large Language Model  Content type: Blog
write.as

A Human-Augmenting Agentic Workflow for Causal Inference

 🤖AI  Content type: Blog

Who corners AI profits? Samsung’s labour showdown sparks debate

 🤖AI  Content type: News
scroll.in

Show HN: JazzBench, an LLM reasoning benchmark using jazz improvisation

 🤖AI  Content type: Blog
flatnine.co · Hacker News

Pythia 1.4B reproduces 3.6% of training samples verbatim given 950-token prompts

 🤖Large Language Model  Content type: Blog
ret2libc.com · Hacker News

Less-relevant results

Agentic AI and the ad stack: who controls the buying layer now?

 🤖AI
ppc.land

How to Select Your POI Data Provider | Evaluation Framework for Quality & Coverage

 👁️Computer Vision  Content type: Blog
mapbox.com

Robust discovery of mutational signatures using power posteriors

 🩻Medical Image Analysis
journals.plos.org
