New AI text diffusion models break speed barriers by pulling words from noise

Ars Technica

On Thursday, Inception Labs released Mercury Coder, a new AI language model that uses diffusion techniques to generate text faster than conventional models. Traditional large language models build text from left to right, one token at a time, using a technique called "autoregression."
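The autoregressive loop described above can be sketched with a toy bigram model (the vocabulary and transition table below are illustrative, not taken from any real LLM):

```python
import random

# Toy autoregressive generator: each next token is sampled
# conditioned only on the tokens emitted so far (here just the
# previous token, via a hand-built bigram table).
BIGRAMS = {
    "<s>": ["the"],
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["ran"],
    "sat": ["</s>"],
    "ran": ["</s>"],
}

def generate(max_tokens=10, seed=0):
    random.seed(seed)
    tokens = ["<s>"]
    # Strictly left-to-right: this sequential dependency is why
    # autoregressive decoding cannot parallelize across positions.
    for _ in range(max_tokens):
        nxt = random.choice(BIGRAMS[tokens[-1]])
        if nxt == "</s>":
            break
        tokens.append(nxt)
    return tokens[1:]
```

Diffusion-based generators like Mercury Coder instead refine all positions of a noisy draft in parallel over several denoising steps, which is where the reported speed advantage comes from.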

Explaining Tokens — the Language and Currency of AI

NVIDIA AI Blog

Under the hood of every AI application are algorithms that churn through data in their own language, one based on a vocabulary of tokens. AI models process tokens to learn the relationships between them and unlock capabilities including prediction, generation and reasoning. The process of breaking data into these units is known as tokenization.
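As a rough illustration, here is a greedy longest-match subword tokenizer over a hand-picked toy vocabulary (real tokenizers such as BPE learn their vocabularies from data; everything below is an assumption for demonstration):

```python
# Minimal greedy longest-match subword tokenizer. The vocabulary
# is illustrative; production tokenizers derive theirs from a
# training corpus.
VOCAB = {
    "token", "ization", "un", "break", "able",
    "a", "b", "e", "i", "k", "l", "n", "o", "r", "t", "u", "z",
}

def tokenize(word):
    tokens, i = [], 0
    while i < len(word):
        # Take the longest vocabulary entry matching at position i;
        # single characters act as a fallback so nothing is lost.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens

print(tokenize("tokenization"))  # -> ['token', 'ization']
```

A model never sees the raw string, only the sequence of token IDs that entries like these map to.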

AI firms follow DeepSeek’s lead, create cheaper models with “distillation”

Ars Technica

Leading artificial intelligence firms including OpenAI, Microsoft, and Meta are turning to a process called distillation in the global race to create AI models that are cheaper for consumers and businesses to adopt.
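In broad strokes, distillation trains a small student model to match a large teacher's temperature-softened output distribution. A minimal sketch of the loss under that assumption, using plain logits rather than real models:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the
    # teacher's "dark knowledge" about near-miss classes.
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the student is trained to reproduce the teacher's soft targets,
    # which is cheaper than training from scratch on raw data.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the student's distribution matches the teacher's exactly and grows as they diverge, so minimizing it transfers the teacher's behavior into the smaller model.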

Microsoft’s new Phi-4 AI models pack big performance in small packages

VentureBeat

Microsoft's new Phi-4 AI models deliver breakthrough performance in a compact size, processing text, images, and speech simultaneously while requiring less computing power than competitors.

DeepMind's latest AI model can help robots fold origami and close Ziploc bags

Engadget

On Wednesday, the AI lab announced two new Gemini-based models it says will "lay the foundation for a new generation of helpful robots." The two-armed robot also understands all the instructions given to it in natural, everyday language. Since its debut at the end of last year, Gemini 2.0…

How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)

VentureBeat

A 1B small language model can beat a 405B large language model in reasoning tasks if provided with the right test-time scaling strategy. Read More
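One simple test-time scaling strategy is self-consistency: sample many answers from the small model and take a majority vote. A toy sketch, with a deliberately noisy stand-in for the model (all names and the task are illustrative assumptions):

```python
import random

# Self-consistency sampling: spend more compute at inference time
# by drawing N answers and keeping the majority vote.

def weak_model(question, rng):
    # A deliberately unreliable solver: correct 60% of the time,
    # otherwise off by one. Stands in for a small LM's sampling.
    return question + rng.choice([0, 0, 0, 1, -1])

def best_of_n(question, n, rng):
    samples = [weak_model(question, rng) for _ in range(n)]
    # Majority vote: as N grows, the mode of the answer
    # distribution converges on the correct answer even though
    # individual samples are noisy.
    return max(set(samples), key=samples.count)

rng = random.Random(0)
answer = best_of_n(7, 101, rng)  # vote over 101 noisy samples
```

This is the intuition behind test-time scaling: extra inference-time samples (or longer reasoning traces) substitute for parameters, letting a small model close the gap to a much larger one on reasoning tasks.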

Apple will use its street view Maps photos to train AI models

Engadget

Apple plans to start using images it collects for Maps to train its AI models. In a disclosure spotted by 9to5Mac, the company said starting this month it would use images it captures to provide its Look Around feature for the additional purpose of training some of its generative AI models.
