Model Training

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

 📉Deep Learning  Content type: Academic
arxiv.org

If Claude Fable stops helping you, you’ll never know

 💬LLMs

RASFT: Rollout-Adaptive Supervised Fine-Tuning for Reasoning

 💬LLMs  Content type: Academic
arxiv.org

Compatibility-Aware Dynamic Fine-Tuning for Large Language Models

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

PriFT: Prior-Support Guided Supervised Fine-Tuning

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

Data-Constrained Language Model Pretraining: Improved Regularization and Scaling Laws

 💬LLMs  Content type: Academic
arxiv.org

Probabilistic Contrastive Pretraining for Multi-task ADME Property Prediction

 💬LLMs  Content type: Academic
arxiv.org

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

 💬LLMs  Content type: Academic
arxiv.org

Hubs or Fringes: Pretraining Data Selection via Web Graph Centrality

 💬LLMs  Content type: Academic
arxiv.org

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

 💬LLMs  Content type: Academic
arxiv.org

Dominant-Layer ZO: A Single Layer Dominates Zeroth-Order Fine-Tuning of LLMs

 🔍Interpretability  Content type: Academic
arxiv.org

Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining

 💬LLMs  Content type: Academic
arxiv.org

Multi-Hop Knowledge Composition is Bound by Pretraining Exposure

 💬LLMs  Content type: Academic
arxiv.org

Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

Predictable Scaling Laws of Optimal Hyperparameters for LLM Continued Pre-training

 💬LLMs  Content type: Academic
arxiv.org

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

Parameter-Efficient Adapter Tuning for Tabular-Image Multimodal Learning

 🧠AI Research  Content type: Academic
arxiv.org

Multilevel Stochastic Gradient Descent for Risk-Averse PDE-Constrained Optimization

 📉Deep Learning  Content type: Academic
arxiv.org

ActiveMimic: Egocentric Video Pretraining with Active Perception

 💬LLMs  Content type: Academic
arxiv.org

When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis

 💬LLMs  Content type: Academic
arxiv.org
