~hush's gemlog
tilde.town·16h
🔲Proof Irrelevance
Here's how to use AI at work to avoid hallucinations and mistakes
euronews.com·14h
🤖AI Curation
Show HN: Halo – Vision Headphones
💿Optical Forensics
Becoming Superhuman
⚡Proof Automation
OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning
arxiv.org·2d
🔍Information Retrieval
Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought
arxiv.org·1d
💻Programming languages
Scaling Image Geo-Localization to Continent Level
arxiv.org·10h
🕳️Persistent Homology
Non-Convex Over-the-Air Heterogeneous Federated Learning: A Bias-Variance Trade-off
arxiv.org·10h
🧠Machine Learning
Jailbreak Mimicry: Automated Discovery of Narrative-Based Jailbreaks for Large Language Models
arxiv.org·3d
🧪Binary Fuzzing
Budgeted Multiple-Expert Deferral
arxiv.org·10h
📐Linear Algebra
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
arxiv.org·3d
🌊CBOR Streaming
SentiMaithili: A Benchmark Dataset for Sentiment and Reason Generation for the Low-Resource Maithili Language
arxiv.org·3d
⚙️Compression Benchmarking
Does In-IDE Calibration of Large Language Models work at Scale?
arxiv.org·3d
📏Code Metrics
I Tested 25+ AI Writing Tools, and This One Writes Better Than Most Humans (With Results)
⚡Proof Automation
Fine-Tuned Language Models for Domain-Specific Summarization and Tagging
arxiv.org·1d
📋Document Grammar
VOGUE: A Multimodal Dataset for Conversational Recommendation in Fashion
arxiv.org·4d
🏛Digital humanities
Irish-BLiMP: A Linguistic Benchmark for Evaluating Human and Language Model Performance in a Low-Resource Setting
arxiv.org·4d
🌳Context free grammars