Optimization Theory
Learning Dynamics Reveal a Hierarchy of Weight-Induced Layerwise Gram Metrics (Optimization; Academic)
A prism hierarchy of learning regimes in large linear autoencoders (Optimization; Academic)
Adaptive Learning Rates with Surrogate Probability for Follow-the-Perturbed-Leader (Optimization; Academic)
PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training (Optimization; Academic)
VCIFBench: Evaluating Complex Instruction Following for Video Understanding (Deep Learning; Academic)