This article was published as a part of the Data Science Blogathon.
As the craze for deep learning continues to grow among students and industries due to its remarkable results and significant impact on society and businesses, it is important to delve into the history of neural networks to truly appreciate their significance. By understanding their roots and the reasons behind their development, we can gain a better understanding of their purpose. In this exploration, we will cover key historical events, focusing on important milestones rather than overwhelming details, to keep the discussion concise and engaging.
Let’s start from the very beginning, when the idea first came into being. You might think that since the Deep Learning technique has flourished only recently, it must have begun some 20-30 years ago, but let me tell you that it all started about 78 years ago. Yes, you read that right: the history of Deep Learning is often traced back to 1943, when Walter Pitts and Warren McCulloch created a computer model based on the neural networks of the human brain. They used a combination of algorithms and mathematics they called “threshold logic” to mimic the thought process.
Since that point, Deep Learning has evolved steadily, with only two significant breaks in its development. Both were tied to the infamous Artificial Intelligence Winters.
During the Cold War, when American scientists were trying to translate Russian into English, a lot of research on intelligent machines was done by some of the greatest mathematicians, like Alan Turing (often known as the Father of Modern Computing), who created the Turing Test for testing the intelligence of a machine. In 1958, the mathematician Frank Rosenblatt came up with the very first neural network-based model, called the Perceptron. It is similar to the machine learning model Logistic Regression, with a slightly different loss function.
In exploring the history of neural networks, it becomes evident that nature has always been a profound source of inspiration. This holds true for deep learning, as it draws heavily from the biological workings of our brains. Initially, there was a rudimentary understanding of neurons and their functioning. Thus, let me begin by introducing the biological neuron, which served as a pivotal concept in the development of neural networks.
If we touch just the surface of a biological neuron, it consists of mainly 3 parts: the nucleus, dendrites, and axons. Electrical signals/impulses are received by the dendrites connected to the nucleus, where some processing is done by the nucleus itself, and finally a message is sent out in the form of an electrical signal to the rest of the connected neurons through the axons. This is the simplest explanation of the working of a biological neuron; people who study biology will be aware of how massively complex its structure is and how exactly it works.
So those mathematicians and scientists came up with a way to represent this biological neuron mathematically: there are n inputs to a body, each having some weight, since all the inputs may not be equally important for producing the output. The output is nothing but the result of applying a function to the sum of the products of these inputs and their respective weights. Since this idea of the perceptron is far from the complex reality of a biological neuron, we can say it is only loosely inspired by biology.
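To make this concrete, here is a minimal sketch of that computation in Python. The input values, weights, bias, and the simple step function below are illustrative assumptions for demonstration, not taken from any particular historical model:

```python
import numpy as np

def perceptron(inputs, weights, bias):
    """Weighted sum of inputs followed by a threshold (step) function."""
    z = np.dot(inputs, weights) + bias  # sum of input * weight products
    return 1 if z > 0 else 0            # fire (1) or stay silent (0)

# Illustrative values: 3 inputs with unequal importance (unequal weights).
x = np.array([1.0, 0.5, -0.2])
w = np.array([0.4, 0.9, 0.1])
print(perceptron(x, w, bias=-0.3))  # -> 1
```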
Now came the era when people asked why we can’t create a network of connected neurons, again inspired by the biological brains of living creatures like human beings, monkeys, ants, etc., which basically have a structure of interconnected neurons. A lot of attempts were made from the 1960s onwards, but this was made successful in a seminal paper in 1986 by a group of mathematicians, one of whom was Geoffrey Hinton (he has made phenomenal contributions to the field of machine learning and AI).
So they came up with the idea of the Backpropagation algorithm. In a nutshell, we can remember this algorithm as the chain rule of differentiation. This not only made the training of Artificial Neural Networks possible but also created an AI hype where people talked about it all day and thought that in the coming 10 years it would be possible for a machine to think like a human.
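To see why backpropagation is essentially the chain rule, here is a minimal sketch for a single sigmoid neuron in Python. The squared-error loss, learning rate, and training values are illustrative assumptions, not part of the original 1986 formulation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One sigmoid neuron with squared-error loss: L = 0.5 * (y_hat - y)^2
x, y = 1.5, 1.0          # illustrative input and target
w, b = 0.2, 0.0          # initial parameters
lr = 0.5                 # learning rate (assumed)

for step in range(100):
    z = w * x + b                    # forward pass: pre-activation
    y_hat = sigmoid(z)               # forward pass: activation
    # Backward pass, pure chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    dL_dyhat = y_hat - y
    dyhat_dz = y_hat * (1 - y_hat)   # derivative of the sigmoid
    dL_dw = dL_dyhat * dyhat_dz * x
    dL_db = dL_dyhat * dyhat_dz
    w -= lr * dL_dw                  # gradient descent update
    b -= lr * dL_db

print(f"trained output: {sigmoid(w * x + b):.3f}")  # moves toward the target 1.0
```

In a real network the same chain rule is applied layer by layer, propagating each layer’s gradient backwards to the one before it.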
Even though it created such hype, it all got washed away in the 1990s, and this period came to be known as the AI Winter, because people had hyped it so much while the actual effect at the time was marginal. What do you think could be the reason for it? Before I disclose the reason, I would like you to give it a shot.
Think…
Think…
Okay, here you go.
Even though the mathematicians came up with this beautiful algorithm of Backpropagation, due to the lack of computational power and the lack of data in the 1990s, the hype eventually died after the US Department of Defense stopped funding AI, seeing its marginal impact over the years after so much hype. So machine learning algorithms like SVM, Random Forest, and GBDT evolved and became extremely popular from 1995 to 2009.
While everybody moved on to algorithms like SVM, Geoffrey Hinton still believed that true intelligence would be achieved only through Neural Networks. So for almost 20 years, i.e. from 1986 to 2006, he worked on neural networks, and in 2006 he came up with a phenomenal paper on training a deep neural network. This was the beginning of the era known as Deep Learning, although this paper by Geoffrey Hinton did not receive much popularity until 2012.
You might wonder what it is that made deep neural networks extremely popular in 2012. In 2012, Stanford conducted a competition called ImageNet, one of the hardest problems back then: it consisted of millions of images, and the task was to identify the objects in a given image. I would like you to recall that by 2012, people had an enormous amount of data, and computation was very powerful compared to what was present in the 1980s. The deep neural network, or Deep Learning for that matter, outperformed every machine learning algorithm in this competition.
This was the moment when big tech giants like Google, Microsoft, Facebook, and others started seeing the potential in Deep Learning and began investing heavily in this technology.
Today, if I talk about the use cases of Deep Learning, you might know that popular voice assistants like Google Assistant, Siri, and Alexa are all powered by deep learning. Also, Tesla’s self-driving cars are possible because of advances in deep learning. Apart from this, it also has applications in the healthcare sector. I strongly believe there is still a lot of potential in Deep Learning, which we will experience in the coming years.
Q1. When was the first neural network model created?
A. The concept of neural networks dates back to the 1940s, and the first artificial neural network model was developed by Warren McCulloch and Walter Pitts in 1943. Their work, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” presented a mathematical model of an artificial neuron, inspired by the biological neurons in the brain. While their model was a significant contribution to the field, it was a simplified representation and not a full-fledged practical implementation of a neural network.
Q2. When were neural networks introduced?
A. The concept of neural networks was introduced in the 1940s. The first artificial neural network model was developed by Warren McCulloch and Walter Pitts in 1943. Their work laid the foundation for understanding how neural networks could mimic the behavior of biological neurons. Since then, the field of neural networks has evolved significantly, with various advancements and breakthroughs in understanding, algorithms, and applications.
Q3. What is the history of artificial neural networks in brief?
A. Artificial neural networks started in the 1940s, inspired by the human brain. They got a big boost in the 1980s with something called backpropagation. Since then, they’ve gotten better thanks to improved algorithms, more powerful computers, and more data. Now, they’re used in lots of things like machine learning and deep learning.
Q4. What is the largest neural network?
A. Neural networks keep getting bigger, making it hard to pinpoint the absolute “largest.” The most massive ones today have billions or even trillions of parameters. They’re super-sized for tasks like understanding language and recognizing images, showing how powerful and flexible neural networks have become.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.