A Short Map of Machine Learning: From Algorithms to Deep Learning
"AI," "machine learning," and "deep learning" get used as if they were synonyms. They are nested, not equal — and the story of how the inner circle (deep learning) came to dominate the outer ones is also the story of why the models in our ASR series look the way they do. Here is the short map.
Three rings
Artificial intelligence is the broad goal: machines doing things that seem to require intelligence. Machine learning is the subset that gets there by learning patterns from data rather than being programmed rule by rule. Deep learning is a subset of that again — machine learning with deep neural networks. Each ring sits inside the last; most of what people now call "AI" is, specifically, the innermost ring.
The classical toolbox
Before deep learning took over, machine learning was a kit of distinct algorithms, sorted by what they need:
Supervised methods learn from labelled examples. Regression predicts numbers (linear regression for house prices); classification predicts classes (logistic regression, despite its name, for "spam or not"). Support vector machines draw the boundary that leaves the widest margin between classes, and with the "kernel trick" can bend that boundary into complex shapes. Neural networks round out the supervised set.
Unsupervised methods work without labels: clustering (K-means) groups similar points; dimensionality reduction (PCA) squeezes many features into a few while keeping most of the variation in the data.
There are in-between cases too, the famous one being recommendation systems built on collaborative filtering.
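To make the labelled/unlabelled split concrete, here is a minimal sketch in plain Python. It is not how a real library does it: the data is made up, the helper names fit_line and kmeans_1d are hypothetical, and K-means is reduced to one dimension with hand-picked starting centers. The first function needs the answers (prices) to learn from; the second gets only the points.

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def kmeans_1d(points, centers, steps=10):
    """Unsupervised: K-means on plain numbers, starting from given centers."""
    centers = list(centers)
    for _ in range(steps):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


# Made-up labelled data: floor area (m^2) -> price (thousands).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]
slope, intercept = fit_line(areas, prices)

# Made-up unlabelled data with two visible groups; no answers are supplied.
centers = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centers=[0.0, 5.0])
```

The fitted line can then price an unseen house with slope * area + intercept, while the K-means centers simply summarise where the points bunch up; nobody told the algorithm there were "small" and "large" groups.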
What a neural network actually is
The idea is older than the hype and simpler than it sounds. A neuron takes several inputs, multiplies each by a weight, sums the results, adds a bias, and passes the total through a non-linear "activation" function. That's it. Wire many neurons together, each layer's outputs feeding the next layer, and you have a network. Make it deep (conventionally two or more hidden layers, sometimes hundreds) and, trained by gradient descent, it can learn astonishingly intricate functions. The neurons stay trivial; the power is in their number and their connections.
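The description above fits in a few lines of plain Python. This is a sketch of the forward pass only, with no training: the sigmoid activation is one common choice picked here as an assumption, the function names are hypothetical, and the weights are hand-picked numbers for illustration rather than learned values.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus a bias, through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def layer(inputs, weight_rows, biases, activation=sigmoid):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed each layer's outputs into the next layer."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output.
toy_network = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = forward([1.0, 1.0], toy_network)
```

Training would adjust the numbers in toy_network by gradient descent; the structure of the computation stays exactly this simple however many layers are stacked.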
The twist: neural nets lost, then won
Here is the part that rhymes with every other evolution story on this blog. Neural networks were the exciting idea of the 1980s — backpropagation, Geoffrey Hinton, early speech and vision wins — then they faded. Through the 1990s and 2000s, support vector machines largely displaced them: better theory, cleaner math, strong results without needing mountains of data or compute. Then, around 2012, the balance tipped back. Enough data and enough GPU power finally let deep networks outperform everything else, and "deep learning" became the name of the comeback. The algorithm was not new; the conditions were.
Why the map still matters
That resurgence is the hinge the rest of the story turns on. The deep networks that won in 2012 are the direct ancestors of the Transformers behind modern speech recognition, and the same forces — more data, more compute, better training — are what eventually shrank those models enough to run on your own machine. Knowing where deep learning sits on the map is knowing why the last fifteen years of AI went the way they did.
Originally written as part of a 2017 machine-learning primer, when the deep-learning resurgence was still fresh. Part of our ML foundations notes.
