Stanford AI Lab Papers and Talks at NeurIPS 2021

Stanford AI Lab Blog

Linderman, David Sussillo Contact: jsmith14@stanford.edu Links: Paper | Website Keywords: recurrent neural networks, switching linear dynamical systems, interpretability, fixed points Compositional Transformers for Scene Generation Authors: Drew A. Smith, Scott W. Mel, Ben Sorscher, Alex H. Williams, Surya Ganguli, Lisa M.

Stanford AI Lab Papers and Talks at ICLR 2022

Stanford AI Lab Blog

Manning, Jure Leskovec Contact: xikunz2@cs.stanford.edu Award nominations: Spotlight Links: Paper | Website Keywords: knowledge graph, question answering, language model, commonsense reasoning, graph neural networks, biomedical QA Fast Model Editing at Scale Authors: Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D.

Timaeus in 2024

The AI Alignment Forum

Published on February 20, 2025 11:54 PM GMT TLDR: We made substantial progress in 2024: We published a series of papers that verify key predictions of Singular Learning Theory (SLT) [1, 2, 3, 4, 5, 6]. The S4 correspondence in small language models. Alignment).

Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

In either of these settings, there's a chance that the LLMs will write messages that encode meaning beyond the natural language definitions of the words used. Externalizing reasoning: It could be safer to have much smaller language models which put more reasoning into natural language. and Mathew et al. Roger and Greenblatt).

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation. (00:28:56): Whereas in the crosscoder paper, language modeling doesn't seem like the kind of thing that is going to be very symmetric.
