Google Brain Researchers Unveil ML Model that Manipulates Neural Networks

Pranav Dar Last Updated : 03 Jul, 2018
3 min read

Overview

  • Researchers at Google Brain, including Ian Goodfellow, have published research showing how to attack and manipulate deep neural networks
  • Their technique, called adversarial reprogramming, manipulates the entire ML model and performs a task chosen by the attacker
  • The results were tested on six models; their technique was able to easily change the output to produce a different result

 

Introduction

Picture this – you feed a picture of a cat to your computer vision algorithm and it misreads it as a bunch of squares, or worse, a dog. You made all the right tweaks to your algorithm, so what went wrong? It turns out it isn’t very difficult to manipulate computer vision techniques.

We have previously covered Google Brain’s research demonstrating how a CNN could be fooled into misreading the object in an image. Now Google Brain researchers have developed an even smarter technique, called adversarial reprogramming, which repurposes the entire machine learning model. The technique makes the model perform a task chosen by the attacker, without requiring the attacker to specify or compute the desired output. This is what differentiates it from other research in this field.

The attack scenario proposed in the paper is as follows (a minimal code sketch of the idea appears after the list):
  • An adversary gains access to the parameters of a neural network that is performing a specific task (ImageNet classification in this case)
  • The adversary attempts to manipulate the function of the network by applying transformations to the input images
  • As the adversarial inputs are fed into the network, they re-purpose its learned features for a new task
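To make this concrete, here is a minimal, hypothetical sketch of adversarial reprogramming in PyTorch. It assumes a recent torchvision build, a frozen pretrained ResNet-50 standing in for one of the six ImageNet models, and a small 28x28 target task; the class, parameter names, and sizes are illustrative choices, not the authors’ exact setup. A single learned “program” image is added around the small embedded input, and the first few ImageNet output classes are simply reinterpreted as the new task’s labels.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Hypothetical sketch of adversarial reprogramming (names and sizes are
# assumptions, not the paper's exact setup): a frozen ImageNet classifier
# is repurposed for a new 10-class task by learning one "program" image W.
class AdversarialProgram(nn.Module):
    def __init__(self, net, img_size=224, patch_size=28, n_target_classes=10):
        super().__init__()
        self.net = net.eval()                      # frozen pretrained ImageNet model
        for p in self.net.parameters():
            p.requires_grad_(False)
        self.W = nn.Parameter(torch.zeros(1, 3, img_size, img_size))  # learned program
        # Mask is zero where the small target image sits, one everywhere else.
        mask = torch.ones(1, 3, img_size, img_size)
        c = (img_size - patch_size) // 2
        mask[:, :, c:c + patch_size, c:c + patch_size] = 0
        self.register_buffer("mask", mask)
        self.c, self.patch_size, self.img_size = c, patch_size, img_size
        self.n_target_classes = n_target_classes

    def forward(self, x_small):                    # x_small: (B, 1, 28, 28) batch
        B = x_small.size(0)
        canvas = torch.zeros(B, 3, self.img_size, self.img_size,
                             device=x_small.device)
        canvas[:, :, self.c:self.c + self.patch_size,
                     self.c:self.c + self.patch_size] = x_small.repeat(1, 3, 1, 1)
        program = torch.tanh(self.W * self.mask)   # perturbation outside the patch
        logits = self.net(canvas + program)        # 1000-way ImageNet logits
        # Hard-coded label remapping: the first 10 ImageNet classes stand in
        # for the new task's 10 labels. (Input normalization omitted for brevity.)
        return logits[:, :self.n_target_classes]

# Usage sketch: only the program image W is trainable.
model = AdversarialProgram(models.resnet50(weights="IMAGENET1K_V1"))
optimizer = torch.optim.Adam([model.W], lr=0.05)
```

The key design point is that the network’s weights are never touched; everything the attacker controls lives in the input transformation.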
The method was tested across six models and three experiments were performed (a sketch of the training setup follows this list):
  • In the first experiment, they managed to get all six algorithms to count the number of squares in an image rather than identify objects, by inserting manipulated input images from the MNIST computer vision dataset
  • In the second experiment, they forced the six models to classify MNIST digits instead of their original ImageNet labels
  • In the third experiment, they had the models identify images from CIFAR-10, an object recognition dataset, instead of the ImageNet corpus on which they were originally trained
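Continuing the sketch above, a hypothetical training loop for the digit-classification experiment might look like the following. Only the program image W is optimized, with digits 0–9 mapped onto the first ten ImageNet classes; the dataset, batch size, and optimizer choices are assumptions rather than the paper’s exact recipe, and `model` and `optimizer` refer to the objects defined in the earlier sketch.

```python
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms

# Hypothetical training loop (assumed hyperparameters, CPU for simplicity).
loader = torch.utils.data.DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=64, shuffle=True)

for images, labels in loader:
    logits = model(images)                  # AdversarialProgram from the sketch above
    loss = F.cross_entropy(logits, labels)  # digits 0-9 reuse the first 10 ImageNet classes
    optimizer.zero_grad()
    loss.backward()                         # gradients flow only into the program W
    optimizer.step()
```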

You can read the research paper in full here.

 

Our take on this

This clearly illustrates the urgent need for a more robust deep learning security framework. It’s all well and good to develop an awesome deep neural net, but if it can be manipulated with ease, you’re in big trouble. Kudos to the Google Brain team for continuously working on these scenarios and open-sourcing their research.

The researchers mention that future studies will explore possible ways to defend against these kinds of attacks. While we wait for that, I recommend reading up on information security. Meanwhile, we will also soon publish a blog post on using deep learning to shore up against adversarial attacks, so keep an eye out for that.

 

Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!

 

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
