
Visual Blocks for ML: Accelerating machine learning prototyping with interactive tools

Google Research AI blog

It usually involves a cross-functional team of ML practitioners who fine-tune the models, evaluate robustness, characterize strengths and weaknesses, inspect performance in the end-use context, and develop the applications. Sign up to be notified when Visual Blocks for ML is publicly available.


Visual captions: Using large language models to augment video conferences with dynamic visuals

Google Research AI blog

We measured the performance of the fine-tuned model with the token accuracy metric, i.e., the percentage of tokens in a batch that were correctly predicted by the model. Performance: To evaluate the utility of the trained Visual Captions model, we invited 89 participants to perform 846 tasks.
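
The token accuracy metric described in this excerpt can be illustrated with a minimal sketch. The function name `token_accuracy`, the `pad_id` parameter, and the sample token IDs are hypothetical and do not come from the Visual Captions work; the excerpt does not say how padding was handled, so skipping padded positions is an assumption here.

```python
def token_accuracy(predicted, target, pad_id=None):
    """Return the fraction of target tokens in a batch that were predicted correctly.

    predicted, target: lists of equal-length token-ID sequences (one per example).
    pad_id: optional token ID to ignore (assumption: padding is excluded from the count).
    """
    correct = 0
    total = 0
    for pred_seq, tgt_seq in zip(predicted, target):
        for p, t in zip(pred_seq, tgt_seq):
            if pad_id is not None and t == pad_id:
                continue  # skip padded positions
            total += 1
            correct += int(p == t)
    # An empty batch has no tokens to score; return 0.0 rather than divide by zero.
    return correct / total if total else 0.0


# Hypothetical batch of two sequences, padded with ID 0.
batch_pred = [[5, 7, 2, 0], [3, 3, 9, 0]]
batch_tgt = [[5, 7, 4, 0], [3, 1, 9, 0]]

acc = token_accuracy(batch_pred, batch_tgt, pad_id=0)
print(f"token accuracy: {acc:.1%}")
```

Multiplying the returned fraction by 100 gives the percentage form used in the excerpt.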



Google at EMNLP 2022

Google Research AI blog

Martins Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing Linlu Qiu*, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova MasakhaNER 2.0: Zhao, Yi Luan, Keith B.


NpTech Summary: Advocacy 2.0, Sketchcastes, and NpTech in Different Languages

Beth's Blog: How Nonprofits Can Use Social Media

"We can't talk about transparency, accountability and honest evaluation without addressing the contentious topic of failure." Some beta work is happening on this with TechSmith's "Jing Project," an application that allows you to easily embed screencasts into conversations on both PC and Mac platforms.


Stanford AI Lab Papers and Talks at NeurIPS 2021

Stanford AI Lab Blog

Powers, Yianni Laloudakis, Sidhika Balachandar, Bowen Jing, Brandon Anderson, Stephan Eismann, Risi Kondor, Russ B. Townshend, Martin Vögele, Patricia Suriana, Alexander Derry, Alexander S. Altman, Ron O.


How to Improve User Experience (and Behavior): Three Papers from Stanford's Alexa Prize Team

Stanford AI Lab Blog

These models perform well when evaluated by crowdworkers in carefully controlled settings, typically written conversations with certain topical or length constraints. While there is a large body of prior work attempting to address this issue, most prior approaches use qualitative metrics based on surveys conducted in lab settings.


Google at NeurIPS 2022

Google Research AI blog

Derrick Xin, Behrooz Ghorbani, Ankush Garg, Orhan Firat, Justin Gilmer Associating Objects and Their Effects in Video Through Coordination Games Erika Lu, Forrester Cole, Weidi Xie, Tali Dekel, William Freeman, Andrew Zisserman, Michael Rubinstein Increasing Confidence in Adversarial Robustness Evaluations Roland S.
