The latest research from Google

MELON: Reconstructing 3D objects from images with unknown poses

A person's prior experience and understanding of the world generally enable them to easily infer what an object looks like as a whole, even from only a few 2D pictures of it. Yet reconstructing the 3D shape of an object by computer, given only a few images, has remained a difficult algorithmic problem for years. This fundamental computer vision task has applications ranging from the creation of e-commerce 3D models to autonomous vehicle navigation.

HEAL: A framework for health equity assessment of machine learning performance

Cappy: Outperforming and boosting large multi-task language models with a small scorer

Talk like a graph: Encoding graphs for large language models

Chain-of-table: Evolving tables in the reasoning chain for table understanding

Health-specific embedding tools for dermatology and pathology

Social learning: Collaborative learning with large language models

Croissant: a metadata format for ML-ready datasets

Google at APS 2024

VideoPrism: A foundational visual encoder for video understanding

Advances in private training for production on-device language models