smalltalk, forth, pascal, c, c++, go, rust, scheme, lisp, ruby, erlang, prolog
Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach
arxiv.org·1d
Step-Opt: Boosting Optimization Modeling in LLMs through Iterative Data Synthesis and Structured Validation
arxiv.org·1d