Sensitivity of Small Language Models to Fine-tuning Data Contamination
arxiv.org·3h
✅Pydantic
MedVoiceBias: A Controlled Study of Audio LLM Behavior in Clinical Decision-Making
arxiv.org·3h
📨Kafka
Reasoning Up the Instruction Ladder for Controllable Language Models
arxiv.org·1d
🎀Decorators
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models
🎀Decorators
What Prompt Engineering in 2025 Actually Looks Like (When You’re Trying to Build for Real)
🏗️Design Patterns
Tool-Driven Behavioral Directives: How to Scale LLM Agents Without Prompt Spaghetti
🎯Context Managers
Evaluating Implicit Biases in LLM Reasoning through Logic Grid Puzzles
arxiv.org·3h
🔁Python Itertools
Consistency Is Not Always Correct: Towards Understanding the Role of Exploration in Post-Training Reasoning
arxiv.org·3h
🔍Fuzzy Finders
Monitoring Autonomous Systems Telemetry: Building an HFT-Grade Network Analysis Pipeline for UDP-based Protocols
🔄Data Pipelines
EASE: Practical and Efficient Safety Alignment for Small Language Models
arxiv.org·3h
✅Pydantic
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
arxiv.org·3h
🔄Data Pipelines