
The Theoretical Reward Learning Research Agenda: Introduction and Motivation

The AI Alignment Forum

Some relevant criteria for evaluating a specification language include: How expressive is the language? Are there things it cannot express? How intuitive is it for humans to work with? What is the right way to quantify the differences and similarities between different goal specifications in a given specification language?


Stanford AI Lab Papers and Talks at NeurIPS 2021

Stanford AI Lab Blog

Kochenderfer. Contact: philhc@stanford.edu. Links: Paper. Keywords: deep learning or neural networks, sparsity and feature selection, variational inference, (application) natural language and text processing. Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss. Authors: Jeff Z.



Other Papers About the Theory of Reward Learning

The AI Alignment Forum

Goodhart's Law in Reinforcement Learning. As you probably know, "Goodhart's Law" is an informal principle which says that "if a proxy is used as a target, it will cease to be a good proxy". For details, see the full paper; this paper is also discussed in more detail in this post (Paper 4).
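The principle can be made concrete with a toy example. The sketch below is a minimal, hypothetical illustration (the reward functions and numbers are invented, not taken from the paper): a proxy that tracks the true objective for small values of the optimized variable keeps rewarding further optimization long after the true objective has started to decline.

```python
# A minimal, hypothetical sketch of Goodhart's Law in a reward setting.
# Both reward functions below are invented for illustration only.

def true_reward(x):
    # True objective: rewards effort x up to a point, then penalizes
    # over-optimization (peaks at x = 1).
    return x - 0.5 * x ** 2

def proxy_reward(x):
    # Proxy target: correlated with the true reward for small x,
    # but keeps increasing without bound.
    return x

# Greedily increasing the proxy eventually hurts the true objective.
for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
    print(f"x={x:.1f}  proxy={proxy_reward(x):5.2f}  true={true_reward(x):5.2f}")
```

Past x = 1 the proxy score keeps climbing while the true reward falls, which is exactly the failure mode the informal statement describes.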


Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

In either of these settings, there's a chance that the LLMs will write messages that encode meaning beyond the natural-language definitions of the words used. Externalizing reasoning: it could be safer to have much smaller language models which put more reasoning into natural language (Mathew et al.; Roger and Greenblatt).


Moving from Red AI to Green AI, Part 1: How to Save the Environment and Reduce Your Hardware Costs

DataRobot

They are used for different applications, but they nonetheless suggest that developments in infrastructure (access to GPUs and TPUs for computing) and in deep learning theory have led to very large models. We believe in using efficiency metrics in machine learning software.
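One simple way to operationalize an efficiency metric is to report result quality per unit of compute rather than quality alone. The sketch below is a hypothetical example in that spirit (the model names, accuracies, and compute costs are invented, not DataRobot's method): it ranks models by accuracy per gigaFLOP of training compute.

```python
# Hypothetical "Green AI"-style efficiency metric: accuracy per
# gigaFLOP of training compute. All names and numbers are invented
# for illustration only.

models = {
    # name: (test accuracy, training cost in GFLOPs)
    "small_model": (0.88, 1.2e3),
    "large_model": (0.91, 4.5e5),
}

def efficiency(accuracy, gflops):
    # Higher is better: how much accuracy each unit of compute buys.
    return accuracy / gflops

for name, (acc, gflops) in models.items():
    print(f"{name}: accuracy={acc:.2f}, "
          f"efficiency={efficiency(acc, gflops):.2e} acc/GFLOP")
```

Under this metric the smaller model can win decisively even though the larger one has slightly higher raw accuracy, which is the trade-off an efficiency-first (Green AI) evaluation is meant to surface.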


Google at NeurIPS 2022

Google Research AI blog

Platt, Fernando Pereira, Dale Schuurmans. Keynote Speakers: The Data-Centric Era: How ML is Becoming an Experimental Science (Isabelle Guyon); The Forward-Forward Algorithm for Training Deep Neural Networks (Geoffrey Hinton). Outstanding Paper Award: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Chitwan Saharia, William Chan, ..)


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

And the way you said it just then, it sounded more like the first one: here's a new nice metric of how good your mechanistic explanation is. (00:26:47): And so what this gives us is an interaction metric where we can measure how bad this hypothesis is. But I don't know, it feels kind of surprising for that to be the explanation.
