Model Quantization, Inference Optimization, GGUF Format, Privacy-preserving AI
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
arxiv.org·2d
EQuARX: Efficient Quantized AllReduce in XLA for Distributed Machine Learning Acceleration
arxiv.org·1d
Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach
arxiv.org·1d
Less Data Less Tokens: Multilingual Unification Learning for Efficient Test-Time Reasoning in LLMs
arxiv.org·1d
RecLLM-R1: A Two-Stage Training Paradigm with Reinforcement Learning and Chain-of-Thought v1
arxiv.org·23h
Surgery-R1: Advancing Surgical-VQLA with Reasoning Multimodal Large Language Model via Reinforcement Learning
arxiv.org·23h
Reinforcement Learning from Human Feedback, Explained Simply
towardsdatascience.com·2d
LOGICPO: Efficient Translation of NL-based Logical Problems to FOL using LLMs and Preference Optimization
arxiv.org·1d
Leveraging Large Language Models for Information Verification -- an Engineering Approach
arxiv.org·1d