The Conference on Computer Vision and Pattern Recognition (CVPR) is undeniably the leading annual event in its field. As expected, CVPR 2024, held from June 17th to 21st at the Seattle Convention Center, USA, proved to be a resounding success. This year’s conference received a record-breaking 11,532 submissions, a 26% increase over 2023. With only 2,719 papers accepted (a highly competitive 23.6% acceptance rate), CVPR 2024 solidified its position as the premier platform for showcasing groundbreaking research in computer vision.
Jointly sponsored by the IEEE Computer Society and the Computer Vision Foundation, CVPR continues to be the go-to event for the global computer vision community. With over 12,000 attendees, the conference fostered a vibrant atmosphere for networking, inspiration, and the exchange of cutting-edge ideas.
The 2024 CVPR Conference featured three compelling keynote presentations, each exploring cutting-edge topics at the intersection of computer vision, artificial intelligence, and interdisciplinary research:
Professor Bongard discussed the limitations of current AI systems in handling adversarial attacks, emphasizing the importance of embodiment in AI. He argued that true embodied AI goes beyond simply placing deep learning systems in robots, instead focusing on the concept of change and adaptation. Bongard presented his work on soft and biological robots capable of “morphological pre-training,” suggesting potential avenues for the CVPR community to contribute to this emerging field.
Dr. Baker presented recent advances in protein design, particularly focusing on the use of deep learning methods to create novel proteins from scratch. His talk highlighted the shift from modifying existing proteins to designing entirely new ones optimized for specific functions. Baker discussed how his team develops and applies deep learning techniques to predict amino acid sequences that will fold into desired structures, followed by experimental validation of these computationally designed proteins.
In this unique keynote, artist Sofia Crespo shared insights into her creative practice, which leverages generative systems and neural networks to explore speculative lifeforms. Crespo’s work investigates the intersection of biology-inspired technologies and artificial mechanisms, challenging our understanding of creativity and the role of AI in artistic expression. Her presentation offered a thought-provoking perspective on the potential of AI to reshape our conceptions of biodiversity and organic life.
The CVPR 2024 Awards Committee selected 10 outstanding papers out of 2,719 accepted papers for recognition, doubling the number of awards from the previous year.
The Best Papers category featured two groundbreaking studies:
This paper introduces a novel approach to modeling natural oscillation dynamics from a single still image. The technique produces photo-realistic animations and outperforms existing methods, showing potential for various applications such as creating seamless looping or interactive image dynamics.
This research presents the first rich human feedback dataset for image generation. The team developed and trained a multimodal Transformer to predict human feedback, demonstrating improvements in image generation techniques.
Two papers received honorable mentions in this category:
The conference also recognized outstanding work by student researchers:
This paper introduces Mip-Splatting, an improvement on 3D Gaussian Splatting that allows for alias-free rendering at any scale. The technique shows superior performance in out-of-distribution scenarios.
The researchers present TREEOFLIFE-10M, a large-scale diverse biology image dataset, and BIOCLIP, a foundation model for the tree of life. BIOCLIP demonstrates strong performance as a fine-grained classifier for biology in both zero- and few-shot settings.
Four additional papers received honorable mentions in the student category:
The IEEE Computer Society – Technical Community on Pattern Analysis and Machine Intelligence (TCPAMI) Awards are prestigious honors presented at CVPR to recognize outstanding contributions to the field of computer vision. Here are the TCPAMI Award recipients for 2024:
This is awarded to a paper from 10 years ago (2014 in this case) that has had a major impact on the field of computer vision.
The 2024 Longuet-Higgins Prize was awarded to “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation” by Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. This paper, published at CVPR 2014, introduced R-CNN, a landmark object detection method that combined deep learning with region proposals and significantly advanced the state of the art at the time.
This honors early-career researchers who have already made distinguished contributions to computer vision.
The 2024 Young Researcher Award was presented to Angjoo Kanazawa and Carl Vondrick. Kanazawa is recognized for her work on 3D human pose and shape estimation from single images, while Vondrick is honored for his research on video understanding and generation.
This award was started in 2020 to recognize long-standing service, research, and mentoring in computer vision. It is named after Thomas Huang, a pioneering figure in computer vision, pattern recognition, and human-computer interaction.
The 2024 recipient is Andrea Vedaldi, a professor at the University of Oxford. Vedaldi is honored for his influential work on visual recognition and his leadership in the computer vision community.
CVPR 2024 Program Co-Chair David Crandall emphasized that the doubling of awarded papers this year reflects the continued growth of CVPR and the field of computer vision. Walter J. Scheirer, CVPR 2024 General Chair, highlighted the lasting impact of CVPR research and researchers, as demonstrated by the TCPAMI Awards.
Top tech companies had a strong presence at CVPR 2024, showcasing their latest advancements in computer vision research. Apple presented papers on efficient image-text models for mobile devices, such as “MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training”. NVIDIA contributed innovative work on generative image dynamics and a novel 3D reconstruction method using Gaussian splats. Google also had papers accepted on generative image dynamics, as well as the award-winning paper on incorporating rich human feedback into text-to-image generation.
You can read more about the papers submitted by tech giants below:
CVPR 2024 showcased amazing advances in computer vision. From protein design to AI-made creatures, the future looks incredible. But a key question remains: how will humans interact with these powerful machines? Talks on embodied AI and artistic AI lifeforms urge us to consider the ethical and philosophical implications. As machines become more sophisticated, CVPR reminds us that the most important conversation is how we, as humans, will choose to collaborate with them.
A. CVPR 2024 took place at the Seattle Convention Center.
A. CVPR 2024 received 11,532 valid paper submissions and accepted only 2,719, resulting in an acceptance rate of about 23.6%.
A. Yes, the Conference on Computer Vision and Pattern Recognition (CVPR) is regarded as one of the most important conferences in the field of computer vision and pattern recognition.