
The Theoretical Reward Learning Research Agenda: Introduction and Motivation

The AI Alignment Forum

However, if we want to design a chess-playing AI that can invent completely new strategies and entirely outclass human chess players, then we must use something analogous to reward maximisation (together with either a search algorithm or an RL algorithm, or some other alternative to these).
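A minimal sketch of the pattern the excerpt describes: pairing a reward function with a search algorithm, so the agent picks moves by maximising searched reward rather than imitating human play. The toy "game" below (integer states, two moves) and all function names are invented for illustration, not taken from the article.

```python
def search_best_move(state, moves, transition, reward, depth):
    """Return the move whose successor has the highest depth-limited value.

    This is plain single-agent reward-maximising search; a real
    chess engine would use adversarial (minimax-style) search instead.
    """
    def value(s, d):
        # At the search horizon, or in a terminal state, fall back
        # to the raw reward of the state.
        if d == 0 or not moves(s):
            return reward(s)
        return max(value(transition(s, m), d - 1) for m in moves(s))

    return max(moves(state), key=lambda m: value(transition(state, m), depth - 1))


# Toy example: states are integers; a move either adds 1 or doubles;
# reward is closeness to the target value 10.
moves = lambda s: ["+1", "*2"] if s < 10 else []
transition = lambda s, m: s + 1 if m == "+1" else s * 2
reward = lambda s: -abs(10 - s)

best = search_best_move(3, moves, transition, reward, depth=3)
# From 3, "+1" leads toward a path reaching exactly 10 within the horizon.
```

The point of the excerpt survives even in this toy: the agent's behaviour is derived from the reward function plus search, so it is not bounded by any human player's strategies.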


Guest Post: Community and Civic Engagement in Museum Programs

Museum 2.0

Deeper community relationships, built through focus groups or community advisory committees, can further help museums connect with issues relevant to their communities while also holding the museum accountable for its responses. This can be accomplished through a variety of feedback methods conducted both inside and outside the museum.




Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

Alternatives to adversarial training: Adversarial training (and the rest of today's best alignment techniques) has failed to create LLM agents that reliably avoid misaligned goals. Alternative approaches to mitigating AI risks: these research areas lie outside the scope of the clusters above. See Sheshadri et al. and Zeng et al.


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

And a technical note: it needs to be in some first-order system, or alternatively you need to measure proof-checking time as opposed to proof length. Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation.
