Alignment Research, Model Robustness, Adversarial Examples, Risk Assessment

Sprinto Unveils Powerful New AI Capabilities To Tackle Risk and Compliance
prnewswire.com·18h
🤖AI
This is a wild use case!
threadreaderapp.com·2d
🤖AI
🚀LLM Overthinking? DTS makes LLM think shorter and answer smarter
reddit.com·12h·Discuss: r/LLM
🤖AI
Reinforcement Learning-Driven Adaptive Exercise Recommendation via Dynamic Symptom-Medication Correlation
dev.to·2d·Discuss: DEV
🔗Systems Thinking
Google introduces Private AI Compute to protect user data during AI inference
the-decoder.com·12h
🤖AI
Trustworthiness Calibration Framework for Phishing Email Detection Using Large Language Models
arxiv.org·2d
🤖AI
Reasoning Is All You Need for Urban Planning AI
arxiv.org·2d
🤖AI
BIPPO: Budget-Aware Independent PPO for Energy-Efficient Federated Learning Services
arxiv.org·23h
🔗Microservices
Rethinking Explanation Evaluation under the Retraining Scheme
arxiv.org·23h
🤖AI
From Catastrophic to Concrete: Reframing AI Risk Communication for Public Mobilization
arxiv.org·1d
🔗Systems Thinking
When Bias Pretends to Be Truth: How Spurious Correlations Undermine Hallucination Detection in LLMs
arxiv.org·1d
🤖AI
Unlock Your Simulations: Automated Parameter Tuning for Complex Models by Arvind Sundararajan
dev.to·1d·Discuss: DEV
🔗Systems Thinking
Distributed Deep Learning for Medical Image Denoising with Data Obfuscation
arxiv.org·1d
🤖AI
DARN: Dynamic Adaptive Regularization Networks for Efficient and Robust Foundation Model Adaptation
arxiv.org·2d
🤖AI
The Role Of AI Companionship In My Life
reddit.com·2d·Discuss: r/ChatGPT
🤖AI
Task-Adaptive Low-Dose CT Reconstruction
arxiv.org·1d
🤖AI
Personality over Precision: Exploring the Influence of Human-Likeness on ChatGPT Use for Search
arxiv.org·1d
🧠Philosophy of Mind