SKATE, a Scalable Tournament Eval: Weaker LLMs differentiate between stronger ones using verifiable challenges
arxiv.org·6h
Predicted impact of LLM use on developer ecosystems
shape-of-code.com·12h
Measuring intelligence and reverse-engineering goals
lesswrong.com·8h
Build your custom Phoenix phx.new generator
victorbjorklund.com·16h