www.anthropic.com (sitemap) · Scour

https://www.anthropic.com/research/towards-understanding-sycophancy-in-language-models

anthropic.com·73w

https://www.anthropic.com/research/towards-monosemanticity-decomposing-language-models-with-dictionary-learning

anthropic.com·73w

https://www.anthropic.com/research/towards-measuring-the-representation-of-subjective-global-opinions-in-language-models

anthropic.com·73w

https://www.anthropic.com/research/the-capacity-for-moral-self-correction-in-large-language-models

anthropic.com·73w

Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet

anthropic.com·80w·Hacker News

https://www.anthropic.com/research/superposition-memorization-and-double-descent

anthropic.com·73w

https://www.anthropic.com/research/studying-large-language-model-generalization-with-influence-functions

anthropic.com·73w

https://www.anthropic.com/research/specific-versus-general-principles-for-constitutional-ai

anthropic.com·73w

https://www.anthropic.com/research/softmax-linear-units

anthropic.com·73w

https://www.anthropic.com/research/scaling-laws-and-interpretability-of-learning-from-repeated-data

anthropic.com·73w

https://www.anthropic.com/research/red-teaming-language-models-to-reduce-harms-methods-scaling-behaviors-and-lessons-learned

anthropic.com·73w

https://www.anthropic.com/research/question-decomposition-improves-the-faithfulness-of-model-generated-reasoning

anthropic.com·73w

https://www.anthropic.com/research/measuring-progress-on-scalable-oversight-for-large-language-models

anthropic.com·73w

https://www.anthropic.com/research/measuring-faithfulness-in-chain-of-thought-reasoning

anthropic.com·73w

https://www.anthropic.com/research/language-models-mostly-know-what-they-know

anthropic.com·73w

Many-shot jailbreaking

anthropic.com·110w

Mapping the Mind of a Large Language Model

anthropic.com·103w

https://www.anthropic.com/research/influence-functions

anthropic.com·73w

Evaluating feature steering: A case study in mitigating social biases

anthropic.com·81w·Hacker News

https://www.anthropic.com/research/distributed-representations-composition-superposition

anthropic.com·73w
