
Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

We'd also be keen to see comparisons with supervised finetuning, RLHF, and adversarial training where appropriate. More ambitiously, research like this could advance our understanding of learning mechanisms in general (cf. Karvonen et al.) and could be useful for testing this theory's predictions (this Manifold market).


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation. (00:28:56): But maybe zooming out, the relevant comparison point here I think is not the number of parameters in the model.
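As context for the "quadratics" remark: the usual quadratic picture is a second-order Taylor expansion of the loss around a trained minimum. A minimal sketch, with \theta^* for the minimum and H for the Hessian (notation introduced here for illustration, not from the transcript):

```latex
% Quadratic (second-order Taylor) approximation of the loss L around a minimum \theta^*.
% The gradient term vanishes at the minimum, so the local shape is governed by the Hessian H.
L(\theta) \approx L(\theta^*) + \tfrac{1}{2}\,(\theta - \theta^*)^{\top} H\,(\theta - \theta^*),
\qquad H = \nabla_{\theta}^{2} L(\theta)\big|_{\theta = \theta^*}
```

Singular learning theory's objection, which Filan alludes to, is that at many minima of neural network losses H is degenerate, so this quadratic approximation misses the local geometry that matters.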
