From Facebook’s research to DeepMind’s legendary algorithms, deep learning has climbed its way to the top of the data science world. It has led to amazing innovations and incredible breakthroughs, and we are only just getting started!
However, if you are a newcomer to this field, the word “deep” might throw you into doubt. Deep learning is one of the hottest topics in the industry today, but it is unfortunately foreign and cryptic to most people. Many people have the impression that deep learning demands a lot of mathematical and statistical knowledge.
If you have had similar questions about deep learning but were not sure how, when and where to ask them, you are in the right place. This article should answer most of what you would want to know.
By the end of this article, we will have dispelled a few myths about deep learning and answered some widely asked questions about this field. We have also included plenty of resources to get you started.
Here is the exciting part: it isn’t as difficult as most people make it out to be. Read on to find out more!
Deep learning is a paradigm of machine learning that has shown incredible promise in recent years. This is because deep learning bears a strong analogy to the functioning of the human brain. The superiority of the human brain is evident: it is considered the most versatile and efficient self-learning model ever created.
Let us understand the functioning of a deep learning model with an example:
What do you see in the above image?
The most obvious answer would be “a car”, right? Despite the fact that there are sand, greenery, clouds and a lot of other things in the picture, our brain tags this image as one of a car. This is because our brain has learnt to identify the primary subject of an image.
This ability to derive useful information from a lot of extraneous data is what makes deep learning special. With the amount of data being generated these days, we want our models to get better as more of this data is fed into them, and deep learning models do improve as the amount of data increases.
Although deep learning has been around for many years, the major breakthroughs from these techniques have come only in recent years. There are two main reasons for this. The first and foremost, as we saw before, is the increase in data generated through various sources. The infographic below succinctly visualizes this trend. The second is the growth in the hardware resources required to run these models. GPUs, which are becoming a requirement for running deep learning models, are many times faster than CPUs at this kind of work, and they help us build bigger and deeper models in comparatively less time than was previously required.
This is why deep learning has become a major buzzword in the data science industry.
Deep learning has found many practical applications in the recent past. From Netflix’s famous movie recommendation system to Google’s self-driving cars, deep learning is already transforming a lot of businesses and is expected to bring about a revolution in almost all industries. Deep learning models are being used for everything from diagnosing cancer to winning presidential elections, and from creating art and writing literature to making real-life money. Thus, it would be wrong to call it just a hyped topic any longer.
Some major applications of deep learning that are being employed by technology companies are:
However, some people come to think that deep learning is overhyped because the labeled data required to train deep learning models is not readily available. Even when the data is available, the computational power required to train such models does not come cheap. Because of these barriers, people are unable to experience the power of deep learning and dismiss it as just hype.
Go through the following blog to build some real life deep learning applications yourself:
This is one of the most important questions for most of us to understand. The comparison can be made mainly along the following three dimensions:
Data dependencies
The most important difference between deep learning and traditional machine learning is how performance changes as the scale of data increases. When the data is small, deep learning algorithms don’t perform that well, because they need a large amount of data to learn its patterns properly. Traditional machine learning algorithms, with their handcrafted rules, prevail in this scenario. The image below summarizes this fact.
Feature engineering
Feature engineering is the process of putting domain knowledge into the creation of feature extractors, to reduce the complexity of the data and make patterns more visible to the learning algorithms. This process is difficult and expensive in terms of time and expertise.
In traditional machine learning, most of the applied features need to be identified by an expert and then hand-coded according to the domain and data type.
For example, features can be pixel values, shapes, textures, positions and orientations. The performance of most machine learning algorithms depends on how accurately the features are identified and extracted.
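To make hand-coded features a little more concrete, here is a minimal sketch in Python. The tiny grayscale image and the two features (mean brightness and a simple edge score) are illustrative assumptions, not taken from any particular dataset or library.

```python
# A toy 4x4 grayscale image (0 = black, 255 = white); the values are made up.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

def mean_brightness(img):
    """Average pixel value over the whole image."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def vertical_edge_strength(img):
    """Average absolute difference between horizontally adjacent pixels.

    A large value suggests a vertical edge somewhere in the image.
    """
    diffs = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

# The hand-crafted feature vector that a traditional algorithm would receive.
features = [mean_brightness(image), vertical_edge_strength(image)]
print(features)
```

Someone had to decide that brightness and edges matter for the problem at hand and then write these functions by hand, which is the expensive part of the process.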
Deep learning algorithms try to learn high-level features from the data itself. This is a very distinctive part of deep learning and a major step ahead of traditional machine learning. It means deep learning reduces the need to develop a new feature extractor for every problem. For example, a convolutional neural network will try to learn low-level features such as edges and lines in its early layers, then parts of people’s faces, and then a high-level representation of a face.
Interpretability
Last but not least, we have interpretability as a factor for comparing machine learning and deep learning. This factor is the main reason people still think ten times before using deep learning in industry.
Let’s take an example. Suppose we use deep learning to score essays automatically. Its scoring performance is excellent and close to human performance. But there is an issue: it does not reveal why it has given a particular score. Mathematically, you can indeed find out which nodes of a deep neural network were activated, but we don’t know what these neurons were supposed to model or what these layers of neurons were doing collectively. So we fail to interpret the results.
On the other hand, machine learning algorithms like decision trees give us crisp rules for why they chose what they chose, so it is particularly easy to interpret the reasoning behind a prediction. Therefore, algorithms like decision trees and linear/logistic regression are primarily used in industry when interpretability matters.
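As a rough illustration of what “crisp rules” means, the sketch below hand-codes a tiny decision tree for the essay-scoring example. The features (`word_count`, `spelling_errors`), the thresholds and the grades are all hypothetical; a real decision tree would learn them from data.

```python
def score_essay(word_count, spelling_errors):
    """Return (grade, rules), where rules lists every test the essay passed through."""
    rules = []
    if word_count < 200:
        rules.append("word_count < 200")
        return "low", rules
    rules.append("word_count >= 200")
    if spelling_errors > 5:
        rules.append("spelling_errors > 5")
        return "medium", rules
    rules.append("spelling_errors <= 5")
    return "high", rules

grade, rules = score_essay(word_count=450, spelling_errors=2)
# The prediction comes with the exact path of rules that produced it.
print(grade, "because", " and ".join(rules))
```

A deep neural network offers no comparable list of human-readable conditions for an individual prediction.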
If you would like a more in-depth comparison between machine learning and deep learning, I recommend you go through the following blog:
Starting out in deep learning is not as difficult as people might make you believe. There are a few basics that you should cover before diving in. Deep learning requires knowledge of the following topics:
For a more detailed understanding of the prerequisites, please follow:
No, a PhD is not a mandatory requirement to make a career in deep learning. You can learn, experiment, and build up your work experience portfolio without going to university. The emphasis for any job or role is usually on demonstrating your competence, and not on your degree, per se.
Having said that, a PhD in a specific field (like linguistics for NLP) will definitely accelerate your path if you choose to combine that with deep learning.
I would recommend you use Python because of its robust ecosystem for machine learning. The Python ecosystem comprises developers and coders who provide open source libraries and support for the community of Python users. This makes the task of writing complex code for various algorithms much easier, and makes the techniques easier to implement and experiment with.
Also, Python, being a general-purpose programming language, can be used for both development and deployment. This greatly simplifies the transition from development to operations. That is, a deep learning product that predicts the price of flight tickets can not only be developed in Python but can also be attached to your website in the same form. This is what makes Python a universal language.
Besides this, I would suggest that beginners use high-level libraries like Keras. These make experimentation easier by abstracting away the unnecessary detail hidden under the algorithms, while still giving access to the parameters that can be tweaked to enhance the performance of such models. Let us understand this with an example:
When you press the buttons on a television remote, do you need to care about the background processes happening inside the remote? Do you need to know what signal is being sent out for that key, or how it is being amplified?
No, right?
An understanding of these processes may be required for a physicist, but for a layman sitting in his bedroom it is just information overload.
There are also other contenders apart from Python in the deep learning space, such as R, Julia, C++, and Java. For alternative libraries, you can check out TensorFlow, PyTorch, Caffe2, DL4J, etc. We should stay updated with their developments as well.
If you are not well versed in programming, there are also a few GUI-based tools that require no coding to build deep learning models, such as Lobe or Google’s AutoML, among others.
When you train a deep learning model, two main operations are performed:
In the forward pass, the input is passed through the neural network and, after processing, an output is generated. In the backward pass, we update the weights of the neural network on the basis of the error we get in the forward pass.
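The two passes can be sketched with the smallest possible “network”: a single neuron with one weight and one bias. This is only a toy under stated assumptions; the data, the learning rate and the squared-error loss are chosen for illustration, and a real network repeats the same idea across many layers.

```python
# One "neuron" with a single weight and bias: prediction = w * x + b.
w, b = 0.0, 0.0
learning_rate = 0.1

# Toy training data generated from y = 2x + 1 (made up for illustration).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

for epoch in range(500):
    for x, y_true in data:
        # Forward pass: compute the output and the error it makes.
        y_pred = w * x + b
        error = y_pred - y_true

        # Backward pass: gradients of the loss 0.5 * error**2 with respect
        # to w and b, followed by a gradient-descent update of both.
        grad_w = error * x
        grad_b = error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

After training, `w` and `b` should end up very close to 2 and 1, the values used to generate the toy data.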
Both of these operations are essentially matrix multiplications. A simple matrix multiplication can be represented by the image below.
Here, we can see that each element in a row of the first array is multiplied with the corresponding element in a column of the second array, and the products are added up. So in a neural network, we can consider the first array as the input to the neural network and the second array as the weights of the network.
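The same row-times-column rule can be written out in a few lines of plain Python. The input and weight values below are arbitrary, and in practice a library such as NumPy would do this far more efficiently; the point here is only to show the arithmetic.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [
        # Entry (i, j) is row i of a times column j of b, summed up.
        [sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
        for i in range(n)
    ]

inputs = [[1, 2, 3]]   # one input example with three features
weights = [[1, 0],     # three inputs feeding two neurons
           [0, 1],
           [2, 2]]
outputs = matmul(inputs, weights)
print(outputs)  # [[7, 8]]
```

Each output value depends on only one row and one column, so the entries can be computed independently of each other.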
This seems to be a simple task. Now, just to give you a sense of the scale of deep learning: VGG16 (a convolutional neural network with 16 weight layers which is frequently used in deep learning applications) has ~140 million parameters, i.e. weights and biases. Now think of all the matrix multiplications you would have to do to pass just one input through this network! It would take years to train this kind of system if we took traditional approaches.
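The parameter count can be checked with some simple arithmetic. The sketch below assumes the standard VGG16 layout (13 convolutional layers with 3x3 kernels followed by 3 fully connected layers), a 224x224 RGB input and 1000 output classes; the layer sizes are not given in this article and are stated here as assumptions.

```python
# (in_channels, out_channels) for the 13 convolutional layers, all with 3x3 kernels.
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (inputs, outputs) for the 3 fully connected layers; 25088 = 7 * 7 * 512.
fc_layers = [(25088, 4096), (4096, 4096), (4096, 1000)]

# Each layer has one weight per connection plus one bias per output.
conv_params = sum(3 * 3 * c_in * c_out + c_out for c_in, c_out in conv_layers)
fc_params = sum(n_in * n_out + n_out for n_in, n_out in fc_layers)
total_params = conv_params + fc_params
print(f"{total_params:,}")  # 138,357,544
```

Under these assumptions the total comes to roughly 138 million, which is where the “~140 million” figure comes from. Most of the parameters sit in the fully connected layers.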
We saw that the computationally intensive part of a neural network is made up of multiple matrix multiplications. So how can we make it faster?
We can do this by performing many of the operations at the same time instead of one after the other. This, in a nutshell, is why we use a GPU (graphics processing unit) instead of a CPU (central processing unit) for training a neural network.
Deep learning has been in the spotlight for quite some time now. Its “deeper” models are making tremendous breakthroughs in many fields such as image recognition, speech and natural language processing.
Now that we know it is so impactful, the main question that arises is when to apply neural networks and when not to. This field is like a gold mine right now, with many discoveries uncovered every day. To be a part of this “gold rush”, you have to keep a few things in mind:
It is true that we need a large amount of data to train a typical deep learning model. But we can generally overcome this by using something called transfer learning. Let me explain thoroughly.
One of the barriers to using deep learning models in industry applications is that the available data is often not large. A few examples of the data needed to train some popular deep learning models are:
| | Google’s Neural Machine Translation | VGG Network | DeepVideo |
|---|---|---|---|
| Objective | Text Translation | Image Category Classification | Video Category Classification |
| Data Size | 6M pairs of English-French sentences | 1.2M images with labeled categories | 1.1M videos with labeled categories |
| Parameters | 380M | 140M | About 100M |
However, a deep learning model trained on a specific task can be reused for a different problem in the same domain, even if the amount of data for the new problem is not that large. This technique is known as transfer learning.
For instance, suppose we have a set of 1000 images of cats and dogs labeled as 1 and 0 (1 for cat and 0 for dog), and another set of 500 test images that we need to classify. Instead of training a deep learning model from scratch on only 1000 images, we can take a pre-trained VGGNet model, retrain it on our data and use it to classify the unlabeled set of images. A pre-trained model may not be 100% accurate in your application, but it saves the huge effort required to reinvent the wheel.
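The mechanics of one common form of transfer learning (keep the pre-trained layers frozen and train only a new output layer) can be sketched in plain Python. Everything here is a stand-in: `pretrained_features` is a hypothetical fixed function playing the role of a frozen network such as VGGNet, and the four labeled examples are made up. It is meant only to show which part gets updated, not how to load a real pre-trained model.

```python
import math

# Stand-in for the frozen layers of a pre-trained network (hypothetical).
# Its "weights" are never updated during training below.
def pretrained_features(x):
    return [x[0] + x[1], x[0] - x[1]]

# Small labeled dataset (made up): label 1 for "cat", 0 for "dog".
train = [([2.0, 1.0], 1), ([3.0, 0.5], 1), ([-2.0, -1.0], 0), ([-1.5, -2.0], 0)]

# Only this new output layer (the "head") is trained on our data.
head_w, head_b = [0.0, 0.0], 0.0
learning_rate = 0.5

def predict(x):
    """Probability of class 1: frozen features followed by a logistic head."""
    f = pretrained_features(x)
    z = sum(w * fi for w, fi in zip(head_w, f)) + head_b
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(200):
    for x, y in train:
        f = pretrained_features(x)
        err = predict(x) - y  # gradient of the log loss with respect to z
        head_w = [w - learning_rate * err * fi for w, fi in zip(head_w, f)]
        head_b -= learning_rate * err

predictions = [round(predict(x)) for x, _ in train]
print(predictions)
```

Because only the small head is trained, far less labeled data and compute are needed than for training the whole network from scratch.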
You may have a look at this article to get a better intuition of using a pre-trained model.
To practice deep learning, ideas alone will not help. We also need labeled data to test our ideas using deep learning.
You can also refer to this list of exciting deep learning datasets and problems.
Because deep learning is a comparatively new technology, there is not yet enough content and tutorials available for beginners. However, free, high-quality content and resources related to deep learning are steadily increasing. The learning resources can be classified by the different applications of deep learning.
Besides this, you can also go through the following blogs for a more extensive list of resources:
Some common questions that may be asked on deep learning are:
Do note that this is not an exhaustive list that will make you completely ready for an interview. You can go through the following skill test to test yourself on important questions on deep learning.
Deep learning has come a long way in recent years, but still has a lot of untapped potential. We are still in the nascent stages of this field, with new breakthroughs happening seemingly every day. One of the use cases that we can definitely see in the future is in the automobile industry, which deep learning can revolutionize by making self-driving cars a reality. While we don’t have a crystal ball to predict the future, we can see deep learning models requiring less and less involvement from human data scientists and researchers.
In the immediate future, we can definitely see a trend where knowledge of deep learning will be a skill required of every data science practitioner. In fact, you must have caught sight of a job position that has sprung up recently, called “Deep Learning Engineer”. This person is responsible for deploying and maintaining the deep learning models used by various departments of a company. Needless to say, there will be a huge demand for such people in the industry.
Currently, one of the limitations of DL is that it does only what a human asks of it. It requires tons of data to learn its target objective, and it replicates whatever that data contains. This has induced bias in certain applications. We can see this improving over time, such that the bias is eliminated in the training process.
We might even stop differentiating deep learning from the other types of learning, with time. It is primed to become a popular and commonly used field and will not require special branding efforts to market or sell.
There are still a lot of cases where researchers, after training a DL model, are unable to explain the ‘why’ behind it. “It’s producing great results but why did you tune a hyperparameter a certain way?” Hopefully, with the rapid advancement in DL, we will see this black box concept become history, and we will be able to explain the intuition behind the decisions a model makes.
These are broadly the answers to the most frequently asked questions on our portal or elsewhere by the people who want to jump onto this Deep Learning bandwagon.
Do you have any other questions on deep learning that you need clarification on? Use the comments section below to ask; or hop onto our Discussion portal, and we will help you out!