RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models
💬Prompt Engineering
Improving Diagnostic Performance on Small and Imbalanced Datasets Using Class-Based Input Image Composition
arxiv.org·8h
👁️Computer Vision
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks
arxiv.org·8h
🧮SMT Solvers
[Tool] RE-Architect: Automated binary analysis with multiple decompilers + AI explanations
🔍Reverse Engineering
Prog8
💾Retro Computing
Create a MCP server from scratch
🛡️Error Handling
Minimalistic CLAUDE.md for new projects: Follow SOLID, DRY, YAGNI, KISS
🔨Incremental Compilation
TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
arxiv.org·8h
🌳Tree-sitter
Are We Aligned? A Preliminary Investigation of the Alignment of Responsible AI Values between LLMs and Human Judgment
arxiv.org·8h
🎨Design Systems
Efficiency vs. Alignment: Investigating Safety and Fairness Risks in Parameter-Efficient Fine-Tuning of LLMs
arxiv.org·3d
📊Profile-Guided Optimization
The Complexity Cliff: Why Reasoning Models Work Right Up Until They Don't
💬Prompt Engineering