The latest research from Google

Evaluating speech synthesis in many languages with SQuId

Previously, we presented the 1,000 languages initiative and the Universal Speech Model, with the goal of making speech and language technologies available to billions of users around the world. Part of this commitment involves developing high-quality speech synthesis technologies, building on projects such as VDTTS and AudioLM, for users who speak many different languages.

Visual captions: Using large language models to augment video conferences with dynamic visuals

AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR

Retrieval-augmented visual-language pre-training

Large sequence models for software development activities

Foundation models for reasoning on charts

Barkour: Benchmarking animal-level agility with quadruped robots

Differentially private clustering for large-scale datasets

Google Research at I/O 2023

Resolving code review comments with ML

Making ML models differentially private: Best practices and open challenges