📊 LLM Evaluation - ibrahimsharaf · Scour

Presentation: Powering the Future: Building Your GenAI Infrastructure Stack 🤖AI Agents

·1d

Why every AI tooling decision needs a measurement 🤖AI Agents

noesisvision.substack.com·6d

Distributional Energy-Based Models for Uncertainty-Aware Structured LLM Reasoning 🚀LLM Deployment

Testing MiniMax M2.7 via API on three real ML and coding workflows 🚀LLM Deployment

andlukyane.com·3d·Hacker News

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation 💻Local AI

huggingface.co·2d

Transitivity Meets Cyclicity: Explicit Preference Decomposition for Dynamic Large Language Model Alignment 🧠LLMs

bhauman/clojure-mcp-light: Simple Clojure tooling for AI coding assistants 🎯LLM Finetuning

github.com·8h·r/functionalprogramming

How to A/B Test LLM Prompts Without Breaking Production 🚀LLM Deployment

benchwright.polsia.app·5d·DEV

HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top) 🎯LLM Finetuning

hwebench.com·5d·Hacker News

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self-Play 🤖AI Agents

Aether Mind – on-chain neural cognitive engine on a quantum-VQE L1 💻Local AI

huggingface.co·5d·Hacker News

An assessment of normalization and differential expression methods for miRNA-seq analysis using a realistic benchmark dataset 📊Retrieval Evaluation

biorxiv.org·6d

OpenMOSS/MOSS-Audio: MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenarios. 🔬Small LMs

MMoA: An AI-Agent framework with recurrence for Memoried Mixure-of-Agent 💻Local AI

Show HN: Pokémon SVG Generation LLM Benchmark 📐Vector Search

svg-bench.fenx.work·6d·Hacker News

pyKinaXe: a fast and robust turnkey kinase activity profiler with high resolution 📐Vector Search

biorxiv.org·4d

deadtrees.earth-aerial: A Multi-Resolution Aerial Image Dataset for Tree Cover and Mortality Detection 📐Vector Search

Amazon Bedrock introduces new advanced prompt optimization and migration tool 🚀LLM Deployment

aws.amazon.com·6d

Anthropic commits $200M with Gates Foundation to deploy AI in global health, education, and agriculture 🔓Open Source AI

thenextweb.com·5d

Lightweight CNN-Based DDoS Detection for Resource-Constrained Edge Networks 💻Local AI
