🧮 Theorem Proving - matmat · Scour

Beyond Benchmarks: Testing Open-Source LLMs in Multi-Agent Workflows

blog.scottlogic.com·2d

⚡Performance Mythology

Formal or not formal? That is the question in AI for theorem proving.

xenaproject.wordpress.com·6d· Discuss: Lobsters, Hacker News

Modern Perfect Hashing

blog.sesse.net·1d· Discuss: Hacker News

🧪Binary Fuzzing

The New Calculus of AI-Based Coding

blog.joemag.dev·1d· Discuss: Hacker News

🔄Reproducible Builds

PPO for LLMs: A Guide for Normal People

cameronrwolfe.substack.com·1d· Discuss: Substack

🔗Constraint Handling

The Trojan Example: Jailbreaking LLMs through Template Filling and Unsafety Reasoning

arxiv.org·2d

🌐Network Protocols

Think before Recommendation: Autonomous Reasoning-enhanced Recommender

arxiv.org·1d

🎯Content Recommendation

Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction

arxiv.org·1d

⚔️Lean Tactics

The Shift to Synthetic Data Markets: How to Prepare Your C# Applications for 2026

dev.to·8h· Discuss: DEV

⚡Proof Automation

PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling

arxiv.org·4h

🔲Cellular Automata

Inferring Group Intent as a Cooperative Game. An NLP-based Framework for Trajectory Analysis using Graph Transformer Neural Network

arxiv.org·4h

🔲Cellular Automata

EthVault: A Secure and Resource-Conscious FPGA-Based Ethereum Cold Wallet

arxiv.org·4h

🌊Stream Ciphers

Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier

arxiv.org·1d

🎯Dependent Parsing

Recognizing internal states in AI: evidence from patterned preferences in large language models

arxiv.org·1d

🤖Automated Parsing

A Comparison of Conversational Models and Humans in Answering Technical Questions: the Firefox Case

arxiv.org·1d

💻Programming languages

MCP Gateway and Registry: Enterprise-Grade Tool Governance for AI Agents

github.com·7h· Discuss: Hacker News

🏠Homelab Orchestration

Reasoning Visual Language Model for Chest X-Ray Analysis

arxiv.org·4h

🏺Computational Archaeology

Automated Cluster Resource Orchestration via Predictive Load Balancing and Reinforcement Learning

dev.to·19h· Discuss: DEV

👁️Observatory Systems

Cross-Paradigm Graph Backdoor Attacks with Promptable Subgraph Triggers

arxiv.org·1d

🌐BGP Security

Code-enabled language models can outperform reasoning models on diverse tasks

arxiv.org·2d

🧠Intelligence Compression
