This week marks the start of the 2021 Conference on Computer Vision and Pattern Recognition (CVPR 2021), the premier annual computer vision event consisting of the main conference, workshops and tutorials. As a leader in computer vision research and a Champion Level Sponsor, Google will have a strong presence at CVPR 2021, with over 70 publications accepted, along with the organization of and participation in multiple workshops and tutorials.
If you are participating in CVPR this year, please visit our virtual booth to learn about Google research into the next generation of intelligent systems that utilize the latest machine learning techniques applied to various areas of machine perception.
You can also learn more about our research being presented at CVPR 2021 in the list below (Google affiliations in bold).
Organizing Committee Members
General Chair: Rahul Sukthankar
Finance Chair: Ramin Zabih
Workshop Chair: Caroline Pantofaru
Area Chairs: Chen Sun, Golnaz Ghiasi, Jonathan Barron, Kostas Rematas, Negar Rostamzadeh, Noah Snavely, Sanmi Koyejo, Tsung-Yi Lin
Publications
Cross-Modal Contrastive Learning for Text-to-Image Generation (see the blog post) Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee*, Yinfei Yang
Learning Graph Embeddings for Compositional Zero-Shot Learning Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata
SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner
3D-MAN: 3D Multi-Frame Attention Network for Object Detection Zetong Yang*, Yin Zhou, Zhifeng Chen, Jiquan Ngiam
MIST: Multiple Instance Spatial Transformer Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi
OCONet: Image Extrapolation by Object Completion Richard Strong Bowen*, Huiwen Chang, Charles Herrmann*, Piotr Teterwak*, Ce Liu, Ramin Zabih
Ranking Neural Checkpoints Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong
LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler
Differentiable Patch Selection for Image Recognition Jean-Baptiste Cordonnier*, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner
HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang
VIP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation (see the blog post) Siyuan Qiao*, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
DeFMO: Deblurring and Shape Recovery of Fast Moving Objects Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys
HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov
Wide-Baseline Relative Camera Pose Estimation With Directional Learning Kefan Chen, Noah Snavely, Ameesh Makadia
MobileDets: Searching for Object Detection Architectures for Mobile Accelerators Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen
SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut
Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool
MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking Jennifer Jang, Heinrich Jiang
Repopulating Street Scenes Yifan Wang*, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely
MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers (see the blog post) Huiyu Wang*, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
IBRNet: Learning Multi-View Image-Based Rendering Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser
From Points to Multi-Object 3D Reconstruction Francis Engelmann*, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari
Learning Compositional Representation for 4D Captures With Neural ODE Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu
Guided Integrated Gradients: An Adaptive Path Method for Removing Noise Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi
De-Rendering the World’s Revolutionary Artefacts Shangzhe Wu*, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa
Spatiotemporal Contrastive Video Representation Learning Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
Decoupled Dynamic Filter Networks Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang
NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras Xin Suo, Yuheng Jiang, Pei Lin, Yingliang Zhang, Kaiwen Guo, Minye Wu, Lan Xu
Regularizing Generative Adversarial Networks Under Limited Data Hung-Yu Tseng*, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang
SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences Shun-Cheng Wu, Johanna Wald, Keisuke Tateno, Nassir Navab, Federico Tombari
NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, Jonathan T. Barron
Adversarially Adaptive Normalization for Single Domain Generalization Xinjie Fan*, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, Mingyuan Zhou
Adaptive Prototype Learning and Allocation for Few-Shot Segmentation Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim
Adversarial Robustness Across Representation Spaces Pranjal Awasthi, George Yu, Chun-Sung Ferng, Andrew Tomkins, Da-Cheng Juan
Background Splitting: Finding Rare Classes in a Sea of Background Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian
Searching for Fast Model Families on Datacenter Accelerators Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi
Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations (see the blog post) Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Artsiom Ablavatski, Matthias Grundmann
CutPaste: Self-Supervised Learning for Anomaly Detection and Localization Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister
Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim
CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning Chen Wei*, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang
DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution Siyuan Qiao, Liang-Chieh Chen, Alan Yuille
DeRF: Decomposed Radiance Fields Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi
Variational Transformer Networks for Layout Generation (see the blog post) Diego Martin Arroyo, Janis Postels, Federico Tombari
Rich Features for Perceptual Quality Assessment of UGC Videos Yilin Wang, Junjie Ke, Hossein Talebi, Joong Gon Yim, Neil Birkbeck, Balu Adsumilli, Peyman Milanfar, Feng Yang
Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds Li Yi, Boqing Gong, Thomas Funkhouser
Neural Descent for Visual 3D Human Pose and Shape Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji
Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
LASR: Learning Articulated Shape Reconstruction From a Monocular Video Gengshan Yang*, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu
MoViNets: Mobile Video Networks for Efficient Video Recognition Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong
No Shadow Left Behind: Removing Objects and Their Shadows Using Approximate Lighting and Geometry Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian Curless
On Robustness and Transferability of Convolutional Neural Networks Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic
Robust and Accurate Object Detection via Adversarial Learning Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong
To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, Vijay Vasudevan, Xiao Zhang, Dragomir Anguelov
Bottleneck Transformers for Visual Recognition Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
Faster Meta Update Strategy for Noise-Robust Deep Learning Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang
Correlated Input-Dependent Label Noise in Large-Scale Image Classification Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent
Learned Initializations for Optimizing Coordinate-Based Neural Representations Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng
Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation Golnaz Ghiasi, Yin Cui, Aravind Srinivas*, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph
Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors Tao Yu, Zerong Zheng, Kaiwen Guo, Pengpeng Liu, Qionghai Dai, Yebin Liu
RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov
NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth
Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas Guibas
Taskology: Utilizing Task Relations at Scale Yao Lu, Sören Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad*, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon
Omnimatte: Associating Objects and Their Effects in Video (see the blog post) Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein
AutoFlow: Learning a Better Training Set for Optical Flow Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu
Unsupervised Multi-Source Domain Adaptation Without Access to Source Data Sk Miraj Ahmed, Dripta S. Raychaudhuri, Sujoy Paul, Samet Oymak, Amit K. Roy-Chowdhury
Meta Pseudo Labels Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le
Spatially-Varying Outdoor Lighting Estimation From Intrinsics Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi
Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization Long Zhao*, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu
Benchmarking Representation Learning for Natural World Image Collections Grant Van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha
Scaling Local Self-Attention for Parameter Efficient Visual Backbones Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens
KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control Tomas Jakab*, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, Angjoo Kanazawa
HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching Vladimir Tankovich, Christian Häne, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz
POSEFusion: Pose-Guided Selective Fusion for Single-View Human Volumetric Capture Zhe Li, Tao Yu, Zerong Zheng, Kaiwen Guo, Yebin Liu
Workshops (only Google affiliations are noted)
Media Forensics Organizers: Christoph Bregler
Safe Artificial Intelligence for Automated Driving Invited Speakers: Been Kim
VizWiz Grand Challenge Organizers: Meredith Morris
3D Vision and Robotics Invited Speaker: Andy Zeng
New Trends in Image Restoration and Enhancement Workshop and Challenges on Image and Video Processing Organizers: Ming-Hsuan Yang Program Committee: George Toderici, Ming-Hsuan Yang
2nd Workshop on Extreme Vision Modeling Invited Speakers: Quoc Le, Chen Sun
First International Workshop on Affective Understanding in Video Organizers: Gautam Prasad, Ting Liu
Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges Program Committee: Nicholas Carlini, Nicolas Papernot
Ethical Considerations in Creative Applications of Computer Vision Invited Speaker: Alex Hanna Organizers: Negar Rostamzadeh, Emily Denton, Linda Petrini
Visual Question Answering Workshop Invited Speaker: Vittorio Ferrari
Sixth International Skin Imaging Collaboration (ISIC) Workshop on Skin Image Analysis Invited Speakers: Sandra Avila Organizers: Yuan Liu Steering Committee: Yuan Liu, Dale Webster
The 4th Workshop and Prize Challenge: Bridging the Gap between Computational Photography and Visual Recognition (UG2+) in Conjunction with IEEE CVPR 2021 Invited Speakers: Peyman Milanfar, Chelsea Finn
The 3rd CVPR Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics Invited Speaker: Andrea Tagliasacchi
Robust Video Scene Understanding: Tracking and Video Segmentation Organizers: Jordi Pont-Tuset, Sergi Caelles, Jack Valmadre, Alex Bewley
4th Workshop and Challenge on Learned Image Compression Invited Speaker: Rianne van den Berg Organizers: George Toderici, Lucas Theis, Johannes Ballé, Eirikur Agustsson, Nick Johnston, Fabian Mentzer
The Third Workshop on Precognition: Seeing Through the Future Invited Speaker: Anelia Angelova Organizers: Utsav Prabhu Program Committee: Chen Sun, David Ross
Computational Cameras and Displays Organizers: Tali Dekel Keynote Talks: Paul Debevec Program Committee: Ayan Chakrabarti, Tali Dekel
2nd Embodied AI Workshop Organizing Committee: Anthony Francis Challenge Organizers: Peter Anderson, Anthony Francis, Alex Ku, Alexander Toshev Scientific Advisory Board: Alexander Toshev
Responsible Computer Vision Program Committee: Caroline Pantofaru, Utsav Prabhu, Susanna Ricco, Negar Rostamzadeh, Candice Schumann
Dynamic Neural Networks Meets Computer Vision Invited Speaker: Azalia Mirhoseini
Interactive Workshop on Bridging the Gap between Subjective and Computational Measurements of Machine Creativity Invited Speaker: David Bau
GAZE 2021: The 3rd International Workshop on Gaze Estimation and Prediction in the Wild Organizer: Thabo Beeler Program Committee: Thabo Beeler
Sight and Sound Organizers: William Freeman
Future of Computer Vision Datasets Invited Speakers: Emily Denton, Caroline Pantofaru
Open World Vision Invited Speakers: Rahul Sukthankar
The 3rd Workshop on Learning from Unlabeled Videos Organizers: Anelia Angelova, Honglak Lee Program Committee: AJ Piergiovanni
4th International Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues — With a Focus on Mobile Platform Applications Organizers: Anelia Angelova
4th Workshop on Efficient Deep Learning for Computer Vision Invited Speaker: Andrew Howard Organizers: Pete Warden, Andrew Howard
Second International Workshop on Large Scale Holistic Video Understanding Invited Speaker: Cordelia Schmid Program Committee: AJ Piergiovanni Organizers: David Ross
Neural Architecture Search 1st Lightweight NAS Challenge and Moving Beyond Invited Speakers: Sara Sabour
The Second Workshop on Fair, Data-Efficient, and Trusted Computer Vision Invited Speakers: Gaurav Aggarwal
The 17th Embedded Vision Workshop General Chair: Anelia Angelova
8th Workshop on Fine-Grained Visual Categorization Organizers: Christine Kaeser-Chen, Kimberly Wilber
AI for Content Creation Invited Speakers: Tali Dekel, Jon Barron, Emily Denton Organizers: Deqing Sun
Frontiers of Monocular 3D Perception Invited Speakers: Anelia Angelova, Cordelia Schmid, Noah Snavely
Beyond Fairness: Towards a Just, Equitable, and Accountable Computer Vision Organizers: Emily Denton
The 1st Workshop on Future Video Conferencing Invited Speakers: Chuo-Ling Chang, Sergi Caelles
Tutorials (only Google affiliations are noted)
Tutorial on Fairness Accountability Transparency and Ethics in Computer Vision Organizer: Emily Denton
Data-Efficient Learning in An Imperfect World Organizers: Boqing Gong, Ting Chen
Semantic Segmentation of Point Clouds: a Deep Learning Framework for Cultural Heritage Invited Speaker: Manzil Zaheer
From VQA to VLN: Recent Advances in Vision-and-Language Research Organizer: Peter Anderson
* Indicates work done while at Google