Researchers from Google, the Max Planck Institute for Informatics, and MIT CSAIL have released a new artificial intelligence technique that lets users manipulate images in seconds with just a click and a drag. The new tool, DragGAN, is an open-source AI editor that leverages a pre-trained GAN (Generative Adversarial Network) to synthesize images that precisely follow user input while staying on the manifold of realistic generated images.
DragGAN is an interactive approach for intuitive point-based image editing that is far more powerful than Photoshop’s Warp tool. Unlike Photoshop, which merely smushes pixels around, DragGAN uses AI to regenerate the underlying object. With DragGAN, users can rotate images as if they were 3D, change the dimensions of cars, manipulate smiles into frowns, and adjust reflections on lakes. Moreover, they can change the direction someone faces.
DragGAN’s general framework, which does not rely on domain-specific modeling or auxiliary networks, sets it apart from other approaches. It combines two components: an optimization over latent codes that incrementally moves multiple handle points toward their target positions, and a point-tracking procedure that faithfully traces the trajectory of those handle points. Both components exploit the discriminative quality of the GAN’s intermediate feature maps, yielding pixel-precise image deformations while maintaining the object’s rigidity and interactive performance.
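The point-tracking component can be illustrated with a minimal sketch (NumPy; the function name and shapes are hypothetical, not the authors’ implementation). After each optimization step, a handle point is re-located by a nearest-neighbor search over the GAN’s intermediate feature map within a small patch around its previous position:

```python
import numpy as np

def track_point(feature_map, ref_feature, prev_pos, radius=3):
    """Re-locate a handle point by nearest-neighbor search in feature space.

    feature_map: (H, W, C) intermediate GAN features of the current image.
    ref_feature: (C,) feature vector of the handle point in the initial image.
    prev_pos: (row, col) of the handle point after the previous step.
    radius: half-size of the square search patch around prev_pos.
    """
    h, w, _ = feature_map.shape
    # Clip the search patch to the feature-map bounds.
    r0, r1 = max(prev_pos[0] - radius, 0), min(prev_pos[0] + radius + 1, h)
    c0, c1 = max(prev_pos[1] - radius, 0), min(prev_pos[1] + radius + 1, w)
    patch = feature_map[r0:r1, c0:c1]                     # (ph, pw, C)
    dists = np.linalg.norm(patch - ref_feature, axis=-1)  # L2 distance per pixel
    idx = np.unravel_index(np.argmin(dists), dists.shape)
    return (r0 + idx[0], c0 + idx[1])
```

Matching in feature space rather than pixel space is what makes the tracking robust as the object deforms, since the GAN’s intermediate features are discriminative enough to identify the same point across edits.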
This approach, often referred to as interactive point-based manipulation, lets DragGAN generate realistic, accurate image transformations from user input by operating within the generator’s latent space. Such techniques highlight the power of machine learning in enabling advanced image-manipulation capabilities.
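As a toy illustration of optimizing in a generator’s latent space, the sketch below drags a handle point to a target by gradient descent on the latent code. The linear “generator” and all names are hypothetical stand-ins to keep the example self-contained, not DragGAN’s actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a generator: maps a latent code to a 2D handle position.
A = rng.normal(size=(2, 8))          # "generator": latent (8,) -> point (2,)

def generate(z):
    return A @ z                     # position of the tracked handle point

def drag(z, target, lr=0.02, steps=500):
    """Gradient descent on the latent code so the handle reaches the target."""
    for _ in range(steps):
        pos = generate(z)
        grad = 2 * A.T @ (pos - target)   # d/dz ||A z - target||^2
        z = z - lr * grad
    return z

z0 = rng.normal(size=8)
target = np.array([1.0, -2.0])
z_final = drag(z0, target)
# generate(z_final) now lands near the target position.
```

The key design idea this mirrors is that the image is never edited directly: only the latent code changes, so every intermediate result remains a valid output of the generator.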
According to the researchers, DragGAN outperforms the state of the art (SOTA) in GAN-based manipulation and opens up new directions for powerful image editing using generative priors. The team plans to extend point-based editing to 3D generative models in the coming months.
This new technique shows that GAN models can be more impactful than the pretty pictures generated by diffusion models, such as those behind DALL-E 2, Stable Diffusion, and Midjourney. While there are apparent reasons why diffusion models are gaining popularity for image generation, GANs sparked a similar wave of excitement in the years after Ian Goodfellow proposed them in 2014. A GAN uses two neural networks, a generator and a discriminator, trained against each other to produce new, synthetic data instances. Training these networks relies heavily on high-quality datasets, showcasing the power of deep learning in creating realistic images.
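The generator/discriminator pairing can be sketched in a few lines (NumPy; the one-layer “networks” and shapes are hypothetical, chosen only to show how the two adversarial objectives relate):

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny one-layer stand-ins for the two networks.
G_w = rng.normal(size=(2, 4)) * 0.1   # generator: noise (4,) -> sample (2,)
D_w = rng.normal(size=2) * 0.1        # discriminator: sample (2,) -> logit

def generator(noise):
    return G_w @ noise                # a synthetic data instance

def discriminator(x):
    logit = D_w @ x
    return 1.0 / (1.0 + np.exp(-logit))   # probability the input is real

real = np.array([0.8, -0.3])              # a "real" data instance
fake = generator(rng.normal(size=4))      # a generated one

# The two networks optimize opposing objectives over the same quantity:
# D learns to tell real from fake, G learns to fool D.
d_loss = -np.log(discriminator(real)) - np.log(1 - discriminator(fake))
g_loss = -np.log(discriminator(fake))
```

Training alternates updates that lower `d_loss` for the discriminator and `g_loss` for the generator, which is the adversarial game DragGAN’s pre-trained backbone comes from.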
When editing images of diverse subjects, users can “deform an image with precise control over where pixels go. Thus, they can manipulate the pose, shape, expression, and layout,” the researchers explain.
The code for DragGAN is available on GitHub.
DragGAN is a game-changer for editors! It simplifies the editing process while offering advanced features, revolutionizing the way you work. We hope this article provided valuable insights into DragGAN’s capabilities. Share your thoughts in the comment section below, and stay connected with Analytics Vidhya Blogs for the latest advancements in generative AI.
Q1. Is DragGAN free to use?
Ans. Yes, DragGAN is an open-source project available for free. Users can access the codebase on platforms like GitHub, enabling collaboration and customization. DragGAN was introduced in a research paper published by ACM, showcasing its capabilities and potential impact on image editing.
Q2. What is the DragGAN AI tool?
Ans. DragGAN is an AI editing tool developed by researchers from Google, the Max Planck Institute for Informatics, and MIT CSAIL. It leverages GAN inversion to let users manipulate images precisely, and its point-tracking approach enables accurate transformations while maintaining the object’s rigidity.
Q3. What are the key features of DragGAN?
Ans. DragGAN, developed by researchers from Google, the Max Planck Institute for Informatics, and MIT CSAIL, offers a range of innovative features. Implemented in PyTorch, it uses a point-tracking approach for precise image manipulation and a GAN inversion technique that maintains the object’s rigidity while enabling accurate transformations. The project is open source on GitHub, which allows collaboration and customization, and tutorials on the DragGAN AI tool’s official website help users maximize its potential.
Q4. Where can I find tutorials and resources for DragGAN?
Ans. You can find comprehensive tutorials and resources on DragGAN’s official website. The research paper detailing DragGAN’s development and techniques can be accessed through ACM, and for those interested in contributing to the project or exploring its codebase, DragGAN’s repository is available on GitHub.