[ Neural Networks for Machine Learning by Geoffrey Hinton ]

 Why do we need machine learning? [13 min]
 What are neural networks? [8 min]
 Some simple models of neurons [8 min]
 A simple example of learning [6 min]
 Three types of learning [8 min]
 Types of neural network architectures [7 min]
 Perceptrons: The first generation of neural networks [8 min]
 A geometrical view of perceptrons [6 min]
 Why the learning works [5 min]
 What perceptrons can’t do [15 min]
 Learning the weights of a linear neuron [12 min]
 The error surface for a linear neuron [5 min]
 Learning the weights of a logistic output neuron [4 min]
 The backpropagation algorithm [12 min]
Learning representations by back-propagating errors for The backpropagation algorithm [12 min]
 Using the derivatives computed by backpropagation [10 min]
 Learning to predict the next word [13 min]
Slides for Learning to predict the next word [13 min]
 A brief diversion into cognitive science [4 min]
 Another diversion: The softmax output function [7 min]
 Neuro-probabilistic language models [8 min]
Neural probabilistic language models for Neuro-probabilistic language models [8 min]
 Ways to deal with the large number of possible outputs [15 min]
 Why object recognition is difficult [5 min]
Lecture 5 slides in pptx for Why object recognition is difficult [5 min]
 Achieving viewpoint invariance [6 min]
 Convolutional nets for digit recognition [16 min]
 Convolutional nets for object recognition [17 min]
(hard) Gradient-based learning applied to document recognition for Convolutional nets for object recognition [17 min]
Convolutional networks for images, speech, and time series for Convolutional nets for object recognition [17 min]
 Overview of mini-batch gradient descent
 A bag of tricks for mini-batch gradient descent
 The momentum method
 Adaptive learning rates for each connection
 Rmsprop: Divide the gradient by a running average of its recent magnitude
 Modeling sequences: A brief overview
Lecture 7 slides in pptx for Modeling sequences: A brief overview
 Training RNNs with back propagation
 A toy example of training an RNN
 Why it is difficult to train an RNN
 Long-term Short-term-memory
(hard) A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks for Long-term Short-term-memory
 A brief overview of Hessian Free optimization
Lecture 8 slides in pptx for A brief overview of Hessian Free optimization
 Modeling character strings with multiplicative connections [14 mins]
 Learning to predict the next character using HF [12 mins]
Generating Text with Recurrent Neural Networks for Learning to predict the next character using HF [12 mins]
 Echo State Networks [9 min]
 Overview of ways to improve generalization [12 min]
Lecture 9 slides in pptx for Overview of ways to improve generalization [12 min]
 Limiting the size of the weights [6 min]
 Using noise as a regularizer [7 min]
 Introduction to the full Bayesian approach [12 min]
 The Bayesian interpretation of weight decay [11 min]
 MacKay’s quick and dirty method of setting weight costs [4 min]
 Why it helps to combine models [13 min]
Lecture 10 slides in pptx for Why it helps to combine models [13 min]
 Mixtures of Experts [13 min]
Adaptive mixtures of local experts for Mixtures of Experts [13 min]
 The idea of full Bayesian learning [7 min]
 Making full Bayesian learning practical [7 min]
 Dropout [9 min]
Improving neural networks by preventing co-adaptation of feature detectors for Dropout [9 min]
 Hopfield Nets [13 min]
Lecture 11 slides in pptx for Hopfield Nets [13 min]
 Dealing with spurious minima [11 min]
 Hopfield nets with hidden units [10 min]
 Using stochastic units to improve search [11 min]
 How a Boltzmann machine models data [12 min]
Scholarpedia: Boltzmann Machines for How a Boltzmann machine models data [12 min]
 Boltzmann machine learning [12 min]
Lecture 12 slides in pptx for Boltzmann machine learning [12 min]
 OPTIONAL VIDEO: More efficient ways to get the statistics [15 mins]
 Restricted Boltzmann Machines [11 min]
 An example of RBM learning [7 mins]
 RBMs for collaborative filtering [8 mins]
 The ups and downs of back propagation [10 min]
Lecture 13 slides in pptx for The ups and downs of back propagation [10 min]
 Belief Nets [13 min]
 Learning sigmoid belief nets [12 min]
Connectionist learning of belief networks for Learning sigmoid belief nets [12 min]
 The wake-sleep algorithm [13 min]
The "wake-sleep" algorithm for unsupervised neural networks for The wake-sleep algorithm [13 min]
 Learning layers of features by stacking RBMs [17 min]
Self-taught learning: transfer learning from unlabeled data for Learning layers of features by stacking RBMs [17 min]
(easy) To recognize shapes, first learn to generate images for Learning layers of features by stacking RBMs [17 min]
(hard) A fast learning algorithm for deep belief nets for Learning layers of features by stacking RBMs [17 min]
Lecture 14 slides in pptx for Learning layers of features by stacking RBMs [17 min]
 Discriminative learning for DBNs [9 mins]
 What happens during discriminative fine-tuning? [8 mins]
 Modeling real-valued data with an RBM [10 mins]
 OPTIONAL VIDEO: RBMs are infinite sigmoid belief nets [17 mins]
 From PCA to autoencoders [5 mins]
Lecture 15 slides in pptx for From PCA to autoencoders [5 mins]
 Deep auto encoders [4 mins]
 Deep auto encoders for document retrieval [8 mins]
 Semantic Hashing [9 mins]
Semantic Hashing for Semantic Hashing [9 mins]
 Learning binary codes for image retrieval [9 mins]
Using Very Deep Autoencoders for Content-Based Image Retrieval for Learning binary codes for image retrieval [9 mins]
 Shallow autoencoders for pretraining [7 mins]
 OPTIONAL: Learning a joint model of images and captions [10 min]
Lecture 16 slides in pptx for OPTIONAL: Learning a joint model of images and captions [10 min]
 OPTIONAL: Hierarchical Coordinate Frames [10 mins]
 OPTIONAL: Bayesian optimization of hyperparameters [13 min]
 OPTIONAL: The fog of progress [3 min]
Lecture 2a – An overview of the main types of neural network architecture
Lecture 2b – Perceptrons: The first generation of neural networks
Lecture 2c – A geometrical view of perceptrons
Lecture 2d – Why the learning works
Lecture 2e – What perceptrons can’t do
Lecture 4c – A quick note on the cross-entropy and derivative of a softmax unit
Lecture 13b – The math of Sigmoid Belief Networks
Online Books
Local Links
External resources
 Neuralnets:External Resources for Neural Networks for Machine Learning
