
The Theoretical Reward Learning Research Agenda: Introduction and Motivation

The AI Alignment Forum

Some relevant criteria for evaluating a specification language include: How expressive is the language? Are there things it cannot express? How intuitive is it for humans to work with? What is the right way to quantify the differences and similarities between different goal specifications in a given specification language?


Stanford AI Lab Papers and Talks at NeurIPS 2021

Stanford AI Lab Blog

Kochenderfer. Contact: philhc@stanford.edu. Links: Paper. Keywords: deep learning or neural networks, sparsity and feature selection, variational inference, (application) natural language and text processing. Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss. Authors: Jeff Z.



Other Papers About the Theory of Reward Learning

The AI Alignment Forum

Goodhart's Law in Reinforcement Learning. As you probably know, "Goodhart's Law" is an informal principle which says that "if a proxy is used as a target, it will cease to be a good proxy". For details, see the full paper; this paper is also discussed in more detail in this post (Paper 4).
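The principle can be made concrete with a toy example. The sketch below is a minimal, hypothetical illustration (the reward functions and numbers are invented, not taken from the paper): a proxy that tracks the true objective for small values of the optimized variable keeps rewarding further optimization long after the true objective has started to decline.

```python
# A minimal, hypothetical sketch of Goodhart's Law in a reward setting.
# Both reward functions below are invented for illustration only.

def true_reward(x):
    # True objective: rewards effort x up to a point, then penalizes
    # over-optimization (peaks at x = 1).
    return x - 0.5 * x ** 2

def proxy_reward(x):
    # Proxy target: correlated with the true reward for small x,
    # but keeps increasing without bound.
    return x

# Greedily increasing the proxy eventually hurts the true objective.
for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
    print(f"x={x:.1f}  proxy={proxy_reward(x):5.2f}  true={true_reward(x):5.2f}")
```

Past x = 1 the proxy score keeps climbing while the true reward falls, which is exactly the failure mode the informal statement describes.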


Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

In either of these settings, there's a chance that the LLMs will write messages that encode meaning beyond the natural-language definitions of the words used. Externalizing reasoning: it could be safer to have much smaller language models which put more reasoning into natural language (Mathew et al.; Roger and Greenblatt).


Moving from Red AI to Green AI, Part 1: How to Save the Environment and Reduce Your Hardware Costs

DataRobot

They are used for different applications, but they nonetheless suggest that developments in infrastructure (access to GPUs and TPUs for computing) and in deep learning theory have led to very large models. We believe in using efficiency metrics in machine learning software.
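One simple way to operationalize an efficiency metric is to report result quality per unit of compute rather than quality alone. The sketch below is a hypothetical example in that spirit (the model names, accuracies, and compute costs are invented, not DataRobot's method): it ranks models by accuracy per gigaFLOP of training compute.

```python
# Hypothetical "Green AI"-style efficiency metric: accuracy per
# gigaFLOP of training compute. All names and numbers are invented
# for illustration only.

models = {
    # name: (test accuracy, training cost in GFLOPs)
    "small_model": (0.88, 1.2e3),
    "large_model": (0.91, 4.5e5),
}

def efficiency(accuracy, gflops):
    # Higher is better: how much accuracy each unit of compute buys.
    return accuracy / gflops

for name, (acc, gflops) in models.items():
    print(f"{name}: accuracy={acc:.2f}, "
          f"efficiency={efficiency(acc, gflops):.2e} acc/GFLOP")
```

Under this metric the smaller model can win decisively even though the larger one has slightly higher raw accuracy, which is the trade-off an efficiency-first (Green AI) evaluation is meant to surface.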


Google at NeurIPS 2022

Google Research AI blog

Platt, Fernando Pereira, Dale Schuurmans. Keynote Speakers: The Data-Centric Era: How ML is Becoming an Experimental Science (Isabelle Guyon); The Forward-Forward Algorithm for Training Deep Neural Networks (Geoffrey Hinton). Outstanding Paper Award: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Chitwan Saharia, William Chan, ..)


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

And the way you said it just then, it sounded more like the first one: here's a new nice metric of how good your mechanistic explanation is. (00:26:47): And so what this gives us is an interaction metric where we can measure how bad this hypothesis is. But I don't know, it feels kind of surprising for that to be the explanation.
