
Other Papers About the Theory of Reward Learning

The AI Alignment Forum

Published on February 28, 2025, 7:26 PM GMT. This is the seventh post in the theoretical reward learning sequence, which starts with this post. The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret. In this paper, we look at what happens when a learnt reward function is optimised.
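To make the failure mode concrete, here is a minimal toy sketch (my own illustration, not an example from the paper): a learned reward that matches the true reward exactly on every training state can still assign a wildly wrong value to an off-distribution state, so a policy that optimises the learned reward incurs maximal regret despite zero training error.

```python
import numpy as np

# Hypothetical 3-state bandit: the true reward and a learned reward
# agree on the training states {0, 1} but disagree badly on the
# off-distribution state 2.
true_reward = np.array([1.0, 0.5, 0.0])
learned_reward = np.array([1.0, 0.5, 10.0])  # fits the training data exactly

train_states = [0, 1]
train_error = np.mean(
    (true_reward[train_states] - learned_reward[train_states]) ** 2
)
print(f"training error: {train_error:.2f}")  # 0.00 -- zero training error

# Optimising the learned reward picks the off-distribution state...
greedy_state = int(np.argmax(learned_reward))
# ...whose true value is the worst available, so regret is maximal.
regret = true_reward.max() - true_reward[greedy_state]
print(f"regret under the true reward: {regret:.2f}")  # 1.00
```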


The Theoretical Reward Learning Research Agenda: Introduction and Motivation

The AI Alignment Forum

Some relevant criteria for evaluating a specification language include: how expressive is the language? The Rest of This Sequence: In the coming entries of this sequence, I will provide in-depth summaries of some of my papers, and explain their setup and results in more detail (though in less detail than the papers themselves).


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

Is that a fine, very brief summary of this? Jason Gross (00:27:59): Yeah, I think that's a pretty good summary of the theoretical approach. Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation.
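For readers who want the "quadratics" point spelled out: near a non-degenerate minimum the gradient vanishes, so a smooth loss is locally captured by its second-order Taylor expansion. A small sketch (my own illustration; the toy loss and finite-difference Hessian are placeholders, not anything from the episode):

```python
import numpy as np

# Near a non-degenerate minimum w*, the gradient is zero, so
#   L(w) ≈ L(w*) + 0.5 * H * (w - w*)^2
# where H is the second derivative (Hessian) at w*.
def loss(w):
    return w**4 + w**2  # smooth toy loss with its minimum at w* = 0

w_star, h = 0.0, 1e-4
# Finite-difference second derivative at the minimum.
H = (loss(w_star + h) - 2 * loss(w_star) + loss(w_star - h)) / h**2

for w in [0.01, 0.1, 0.5]:
    quad = loss(w_star) + 0.5 * H * (w - w_star) ** 2
    print(f"w={w}: loss={loss(w):.6f}, quadratic approx={quad:.6f}")
# Close for small w; the quartic term grows with w. Singular learning
# theory's objection concerns degenerate minima, where H vanishes and
# the quadratic picture breaks down entirely.
```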
