Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages

Google Research AI blog

Posted by Yu Zhang, Research Scientist, and James Qin, Software Engineer, Google Research. Last November, we announced the 1,000 Languages Initiative, an ambitious commitment to build a machine learning (ML) model that would support the world’s one thousand most-spoken languages, bringing greater inclusion to billions of people around the globe.

Language 140

PaLM-E: An embodied multimodal language model

Google Research AI blog

Posted by Danny Driess, Student Researcher, and Pete Florence, Research Scientist, Robotics at Google. Recent years have seen tremendous advances across machine learning domains, from models that can explain jokes or answer visual questions in a variety of languages to those that can produce images based on text descriptions.

Language 124

Google Research, 2022 & Beyond: Language, Vision and Generative Models

Google Research AI blog

I will begin with a discussion of language, computer vision, multi-modal models, and generative machine learning models. We want to solve complex mathematical or scientific problems, transform modalities or translate the world’s information into any language, and diagnose complex diseases or understand the physical world.

Language 132

If you teach a chatbot how to read ASCII art, it will teach you how to make a bomb

TechSpot

University researchers have developed a way to "jailbreak" large language models like ChatGPT using old-school ASCII art. The technique, aptly named "ArtPrompt," involves crafting an ASCII art "mask" for a word and then using the mask to coax the chatbot into providing a response it shouldn't.

Arts 133

Get inside the minds of first-time travelers in this new documentary

Mashable Tech

Refik Anadol and his creative team transformed all this data into a beautiful art piece. “As the airline flying to more countries than any other, we are committed to connecting the world through the universal language of art and culture.” ... also a profound experience that transforms a person’s inner world.

Mind 98

A vision-language approach for foundational UI understanding

Google Research AI blog

In “Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus”, accepted for publication at ICLR 2023, we present a vision-only approach that aims to achieve general UI understanding completely from raw pixels. Spotlight drastically exceeded the state-of-the-art across four UI modeling tasks.

Language 122

Stable Diffusion: weird for visual arts, a boon for image compression algorithms?

TechSpot

Stable Diffusion is a machine learning algorithm capable of generating weirdly complex and (somewhat) believable images just from interpreting natural language descriptions. The text-to-image AI model is incredibly popular among users despite the fact that online art communities have started to reject AI-based images.

Arts 118