Comparison, Learning Theory and Trend

Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

FEBRUARY 7, 2025

Encoded reasoning in CoT and inter-model communication: The recent trend of scaling up inference-time compute may offer a valuable advantage for catching models engaging in objectionable behaviors, since it allows us to essentially read their minds. ... VC theory) and the generalization performance we see in practice. ... Karvonen et al.

Research

Research Fund Open Technique

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

MARCH 28, 2025

Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation. (00:28:56): But maybe zooming out, the relevant comparison point here, I think, is not the number of parameters in the model.

Model

Model Network Train Training

Nonprofit Technology

