
Stanford AI Lab Papers and Talks at NeurIPS 2021

Stanford AI Lab Blog

Kochenderfer
Contact: philhc@stanford.edu
Links: Paper
Keywords: deep learning or neural networks, sparsity and feature selection, variational inference, (application) natural language and text processing

Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss
Authors: Jeff Z.


Research directions Open Phil wants to fund in technical AI safety

The AI Alignment Forum

We think this adversarial style of evaluation and iteration is necessary to ensure an AI system has a low probability of catastrophic failure. We'd like to support more such evaluations, especially on scalable oversight protocols like AI debate. … Which rules are LLM agents happy to break, and which are they more committed to?



Google at ICLR 2023

Google Research AI blog

If you’re registered for ICLR 2023, we hope you’ll visit the Google booth to learn more about the exciting work we’re doing across topics spanning representation and reinforcement learning, theory and optimization, social impact, safety and privacy, and applications from generative AI to speech and robotics.


The Theoretical Reward Learning Research Agenda: Introduction and Motivation

The AI Alignment Forum

Some relevant criteria for evaluating a specification language include: How expressive is the language? What about inverse reinforcement learning? If a given specification learning algorithm is guaranteed to converge to a good specification, can we say anything about its sample complexity?


Google at NeurIPS 2022

Google Research AI blog

Derrick Xin, Behrooz Ghorbani, Ankush Garg, Orhan Firat, Justin Gilmer

Associating Objects and Their Effects in Video Through Coordination Games
Erika Lu, Forrester Cole, Weidi Xie, Tali Dekel, William Freeman, Andrew Zisserman, Michael Rubinstein

Increasing Confidence in Adversarial Robustness Evaluations
Roland S.


AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

The AI Alignment Forum

This is extremely expensive, but you could do a sampling-based probabilistic version to make it cheaper.

Daniel Filan (00:28:50): If people remember my singular learning theory episodes, they'll get mad at you for saying that quadratics are all there is, but it's a decent approximation.

(00:28:56):

Daniel Filan (00:19:52): Okay.
