E9:205 Machine Learning for Signal Processing




When MW 3:30 - 5:00 pm
Where EE B303 (Second class onwards)
Who Sriram Ganapathy
Office C 334 (2nd Floor)
Email sriram aT ee doT iisc doT ernet doT in
Teaching Assistant Aravind Illa
Lab C 326 (2nd Floor)
Email aravindece77 aT gmail doT com

Announcements


Syllabus

  • Introduction to real world signals - text, speech, image, video.
  • Feature extraction and front-end signal processing - information-rich representations, robustness to noise and artifacts, signal enhancement, bio-inspired feature extraction.
  • Basics of pattern recognition, Generative modeling - Gaussian and mixture Gaussian models, hidden Markov models, factor analysis.
  • Discriminative modeling - support vector machines, neural networks and back propagation.
  • Introduction to deep learning - convolutional and recurrent networks, pre-training and practical considerations in deep learning, understanding deep networks.
  • Deep generative models - Autoencoders, Boltzmann machines, Adversarial Networks.
  • Applications in computer vision and speech recognition.

Grading Details

Assignments 15%
Midterm exam. 20%
Final exam. 35%
Project 30%

Pre-requisites

  • Random Process/Probability and Statistics
  • Linear Algebra/Matrix Theory
  • Basic Digital Signal Processing/Signals and Systems

Textbooks

References

  • “Deep Learning : Methods and Applications”, Li Deng, Microsoft Technical Report.
  • “Automatic Speech Recognition - Deep learning approach” - D. Yu, L. Deng, Springer, 2014.
  • “Machine Learning for Audio, Image and Video Analysis”, F. Camastra and A. Vinciarelli, Springer, 2007. pdf

Slides





14-08-2017 Introduction to real world signals - text, speech, image, video. Learning as a pattern recognition problem. Examples. Roadmap of the course.

slides
16-08-2017 Feature Extraction - Goals and challenges. Introduction to text processing. Bag of words model. Term Frequency - Inverse Document Frequency (TF-IDF). N-gram modeling. Feature Extraction in Audio and Speech - Spectrogram.

slides
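
A minimal sketch of the bag-of-words TF-IDF computation from this lecture, in plain Python. The toy corpus below is made up purely for illustration:

    # tf-idf(t, d) = tf(t, d) * log(N / df(t)) over a toy corpus
    import math
    from collections import Counter

    docs = ["the cat sat on the mat",
            "the dog sat on the log",
            "cats and dogs"]
    tokenized = [d.split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})

    # Document frequency: number of documents containing each term.
    df = {w: sum(w in doc for doc in tokenized) for w in vocab}
    N = len(docs)

    def tfidf(doc):
        tf = Counter(doc)
        return [tf[w] * math.log(N / df[w]) for w in vocab]

    for doc in tokenized:
        print(tfidf(doc))

Terms that occur in every document (like "the") get weight zero, which is the point of the idf factor.
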
21-08-2017 Mel-frequency Cepstral Coefficients (MFCC), Linear Prediction - orthogonality of the prediction error with past samples, optimal linear predictor, stability of the prediction filter, autoregressive processes, linear prediction for AR processes.

slides
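
A small NumPy/SciPy sketch of the autocorrelation method for linear prediction: solve the Yule-Walker normal equations R a = r for a synthetic AR(2) signal. The AR coefficients below are arbitrary toy values:

    import numpy as np
    from scipy.linalg import solve_toeplitz

    rng = np.random.default_rng(0)
    # Synthesize a stable AR(2) process as a toy signal.
    x = np.zeros(4000)
    e = rng.standard_normal(4000)
    for n in range(2, 4000):
        x[n] = 1.3 * x[n-1] - 0.6 * x[n-2] + e[n]

    p = 2
    # Autocorrelation r[k] = sum_n x[n] x[n+k], k = 0..p
    r = np.array([np.dot(x[:len(x)-k], x[k:]) for k in range(p + 1)])
    # Solve the Toeplitz system R a = r for the predictor coefficients.
    a = solve_toeplitz(r[:p], r[1:p+1])
    print(a)   # should be close to [1.3, -0.6]
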
23-08-2017 Basics of Digital Image Processing – Filtering, Smoothing, Edge Detection, Scale Invariant Feature Transform (SIFT).

slides
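
A short scipy.ndimage sketch of smoothing followed by Sobel edge detection; a random array stands in for a real grayscale image:

    import numpy as np
    from scipy import ndimage

    img = np.random.rand(128, 128)                  # stand-in grayscale image
    smooth = ndimage.gaussian_filter(img, sigma=2)  # low-pass smoothing
    gx = ndimage.sobel(smooth, axis=1)              # horizontal gradient
    gy = ndimage.sobel(smooth, axis=0)              # vertical gradient
    edges = np.hypot(gx, gy)                        # gradient magnitude
    print(edges.shape, edges.max())
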
28-08-2017 Matrix and vector derivatives - definition and properties. Dimensionality reduction - Preserving maximum data variance - principal component analysis (PCA). Minimum error formulation of PCA. Residual error in PCA. Example of PCA application for hand-written digit images.
PRML - Bishop (Appendix, Chapter 12)

slides
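
A minimal NumPy sketch of PCA by eigendecomposition of the sample covariance, keeping the top-M variance directions; the residual error is the sum of the discarded eigenvalues (random stand-in data, M = 2):

    import numpy as np

    X = np.random.randn(500, 10)        # N x D data matrix (stand-in)
    Xc = X - X.mean(axis=0)             # center the data
    S = Xc.T @ Xc / len(X)              # D x D sample covariance
    evals, evecs = np.linalg.eigh(S)    # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]
    U = evecs[:, order[:2]]             # top-2 principal directions
    Z = Xc @ U                          # projected data, N x 2
    # Residual error = sum of the discarded eigenvalues.
    print(Z.shape, evals[order[2:]].sum())
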
30-08-2017 PCA for high dimensional data. Whitening and the KL transform. Limitations of PCA. Class-dependent dimensionality reduction using linear discriminant analysis (LDA). Fisher discriminant for the 2-class case using within-class and between-class scatter matrices. Solution of LDA. Multi-class LDA, PCA versus LDA example.
PRML - Bishop (Chapter 4.1.4)

slides
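
A sketch of the two-class Fisher LDA closed form, w proportional to Sw^{-1}(m1 - m2), on toy Gaussian data:

    import numpy as np

    rng = np.random.default_rng(0)
    X1 = rng.normal([0, 0], 1.0, (100, 2))   # class 1 samples (toy)
    X2 = rng.normal([3, 2], 1.0, (100, 2))   # class 2 samples (toy)

    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter matrix Sw
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(Sw, m1 - m2)         # Fisher direction
    w /= np.linalg.norm(w)
    print(w)
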
01-09-2017 Basics of Python Programming. Installing Python, simple commands and functions. Loading speech and image data. Vectorizing, mean computation and spectrogram.

slides

code
01-09-2017 Assignment #1. Due on 11-09-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com.

HW1
image data
speech data
04-09-2017 Decision theory basics. Minimum classification error rule. MAP and ML based approaches. Three approaches to ML. Generative versus discriminative modeling. Introduction to generative modeling. Multi-variate Gaussian distribution.
PRML - Bishop (Chapter 1.5)

slides
06-09-2017 MLE for multi-variate Gaussian. Sample mean and variance. Limitations of Gaussian modeling. Need for mixture modeling. Probability density of Gaussian Mixture Model (GMM).

slides
Future Reading
11-09-2017 MLE for GMM - Expectation Maximization (EM) algorithm. Proof of EM algorithm. Convergence properties. EM algorithm for GMM parameter estimation. Choice of hidden variable.
Ref - Tutorial GMMs
Proof of EM algorithm
EM algorithm for GMMs

slides
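
A compact NumPy sketch of the EM updates from this lecture for a 1-D, two-component GMM; the synthetic data, K and the initialization are arbitrary choices:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 700)])

    K = 2
    pi = np.full(K, 1.0 / K)             # mixture weights
    mu = rng.choice(x, K)                # initial means
    var = np.full(K, x.var())            # initial variances

    for _ in range(50):
        # E-step: responsibility gamma[n, k] = p(z = k | x_n)
        lik = np.stack([pi[k] * norm.pdf(x, mu[k], np.sqrt(var[k]))
                        for k in range(K)], axis=1)
        gamma = lik / lik.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood re-estimates
        Nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        pi = Nk / len(x)

    print(pi, mu, var)

Each EM iteration is guaranteed not to decrease the data likelihood, which is what the convergence proof establishes.
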
13-09-2017 Summary of GMM modeling. Application of GMM for unsupervised clustering.

slides
18-09-2017 Limitations of GMM modeling for sequence data. Markov Chains. Hidden Markov Model (HMM) definition. Three Problems in HMM.
"Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6)

20-09-2017 Evaluating the likelihood using HMM (Problem 1), complexity reduction using the forward and backward variables. Finding the best state sequence (Problem 2) - instantaneous probability based, Viterbi algorithm for state sequence segmentation.
"Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6)

Rabiner Tutorial on HMM
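
A log-domain Viterbi sketch for Problem 2 with a discrete-observation HMM; all HMM parameters below are toy values:

    import numpy as np

    A = np.array([[0.7, 0.3], [0.4, 0.6]])            # p(q_t | q_{t-1})
    B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])  # p(o_t | q_t)
    pi = np.array([0.6, 0.4])                         # initial state prior
    obs = [0, 1, 2, 2, 1]

    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = logpi + logB[:, obs[0]]      # best log-score ending in each state
    psi = []                             # backpointers
    for o in obs[1:]:
        scores = delta[:, None] + logA   # scores[i, j]: transition i -> j
        psi.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + logB[:, o]

    # Backtrack the best state sequence.
    state = delta.argmax()
    path = [state]
    for bp in reversed(psi):
        state = bp[state]
        path.append(state)
    print(path[::-1])
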
23-09-2017 Re-estimating the HMM parameters - EM algorithm for HMM (Problem 3). Q function definition and solution. Intuitions about HMM training.
"Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6)
EM algorithm for HMMs

Rabiner Tutorial on HMM
25-09-2017 Non-negative matrix factorization (NMF) - problem definition, cost function and constraints. Auxiliary function, proof of convergence, parameter update rule. Application to audio source separation and speech denoising.
Refs - Bhiksha Raj-Tutorial     Lee-Paper

slides
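
A minimal sketch of the Lee-Seung multiplicative updates for the Euclidean NMF cost ||V - WH||^2; a random non-negative matrix stands in for a real magnitude spectrogram:

    import numpy as np

    rng = np.random.default_rng(0)
    V = rng.random((64, 200))        # F x T non-negative data
    R = 5                            # number of components
    W = rng.random((64, R))          # basis spectra
    H = rng.random((R, 200))         # activations
    eps = 1e-12                      # guard against division by zero

    for _ in range(200):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)

    print(np.linalg.norm(V - W @ H))

The auxiliary-function argument covered in class guarantees that each multiplicative update does not increase the cost.
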
04-10-2017 First Mid-term Exam

09-10-2017 Application of NMF. Audio separation into individual instruments, speech denoising with known and unknown sources. Linear models for regression - problem definition. Least squares regression. Maximum likelihood and least squares regression.
PRML - Bishop (Chapter 2)

11-10-2017 Overfitting and Underfitting. Regularized least squares. Linear Models for Classification. Least squares for classification. Sigmoid function and one-of-K encoding. Problems with least squares classification.
PRML - Bishop (Chapter 3)

slides
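
A closed-form sketch of regularized least squares (ridge regression), w = (Phi^T Phi + lambda I)^{-1} Phi^T t, with a polynomial design matrix on a toy problem:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 50)
    t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 50)   # noisy targets

    M = 9                                        # polynomial degree
    Phi = np.vander(x, M + 1, increasing=True)   # design matrix, columns x^0..x^M
    lam = 1e-3                                   # regularization strength
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)
    print(w)

Increasing lam shrinks the weights and counters the overfitting that an unregularized degree-9 fit would show here.
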
16-10-2017 Logistic regression - two class problem. Sigmoid function and posterior probability. Logistic regression - K class problem. Softmax function and cross entropy error function and Maximum likelihood estimation. Linear regression revisited - dual formulation. PRML - Bishop (Chapter 4,6)
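
A NumPy sketch of K-class logistic regression with softmax outputs, one-of-K targets and cross-entropy loss; the gradient simplifies to X^T(Y_hat - Y). Toy data, and no bias term for brevity:

    import numpy as np

    rng = np.random.default_rng(0)
    N, D, K = 300, 2, 3
    X = rng.normal(size=(N, D))
    y = rng.integers(0, K, N)                  # toy labels
    Y = np.eye(K)[y]                           # one-of-K encoding
    W = np.zeros((D, K))

    def softmax(a):
        a = a - a.max(axis=1, keepdims=True)   # numerical stability
        e = np.exp(a)
        return e / e.sum(axis=1, keepdims=True)

    for _ in range(500):
        Y_hat = softmax(X @ W)
        W -= 0.5 * X.T @ (Y_hat - Y) / N       # gradient step, lr = 0.5
    print(W)
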

21-10-2017 Design matrix, kernel function and Gram matrix. Necessary and sufficient condition for kernel functions (Mercer's theorem), examples of kernel functions. PRML - Bishop (Chapter 6)

23-10-2017 Margin of a linear classifier. Maximum margin classifier formulation. Constraints involved in the optimization. Introduction to support vector machines. PRML - Bishop (Chapter 7)

25-10-2017 Introduction to constrained optimization. Primal and dual problems. Weak and strong duality. Necessary and sufficient conditions for strong duality for convex problems with convex constraints. KKT conditions. Introduction to convex optimization - Boyd (Chapter 5) Weblink to the book

27-10-2017 Application of convex optimization to SVMs. KKT conditions and solution of the problem. Definition of support vectors. Support vector machine for overlapping classes. Trade-off between regularization and training loss. PRML - Bishop (Chapter 7)

slides
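
As a simple stand-in for solving the dual QP discussed in class, the sketch below trains a soft-margin linear SVM by subgradient descent on the primal hinge loss; C plays the same trade-off role between regularization and training loss (toy data):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.hstack([-np.ones(50), np.ones(50)])

    w, b, C, lr = np.zeros(2), 0.0, 1.0, 0.01
    for _ in range(2000):
        margins = y * (X @ w + b)
        viol = margins < 1               # points inside or beyond the margin
        # Subgradient of (1/2)||w||^2 + C * sum of hinge losses
        w -= lr * (w - C * (y[viol, None] * X[viol]).sum(axis=0))
        b -= lr * (-C * y[viol].sum())

    margins = y * (X @ w + b)
    print(w, b, int((margins < 1).sum()))   # margin violators ~ support vectors
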
30-10-2017 SVM application for classification. Support vector regression. Formulation and KKT conditions. Introduction to neural networks. Parameter learning using gradient descent (scalar case). PRML - Bishop (Chapter 7)

slides
3-11-2017 Gradient descent - vector case. Types of activation functions. The XOR problem with NNs. Need for deep neural network architectures. Deep Learning - IY (Chapter 6)

4-11-2017 Learning in neural networks. First order methods - method of steepest descent. Curvature and Hessians. Second order method - Newton's method. Discussion on the complexity of learning algorithms. Deep Learning - IY (Chapter 4), Neural Networks - Bishop (Chapter 4,7)

6-11-2017 Back propagation algorithm for learning in deep networks. The linear neuron with MSE cost. Disadvantages and limitations of the gradient descent algorithm.
Neural Networks - Bishop (Chapter 6)
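
A self-contained NumPy sketch of backpropagation for a one-hidden-layer network (tanh hidden units, linear output, MSE loss) on a toy regression task:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, (200, 1))
    t = np.sin(np.pi * X)                    # toy regression target

    W1 = rng.normal(0, 0.5, (1, 16))
    b1 = np.zeros(16)
    W2 = rng.normal(0, 0.5, (16, 1))
    b2 = np.zeros(1)
    lr = 0.1

    for _ in range(2000):
        # Forward pass
        h = np.tanh(X @ W1 + b1)             # hidden activations
        y = h @ W2 + b2                      # linear output
        # Backward pass: chain rule from the MSE loss
        dy = (y - t) / len(X)
        dh = (dy @ W2.T) * (1 - h ** 2)      # tanh'(a) = 1 - tanh(a)^2
        W2 -= lr * (h.T @ dy)
        b2 -= lr * dy.sum(axis=0)
        W1 -= lr * (X.T @ dh)
        b1 -= lr * dh.sum(axis=0)

    print(float(np.mean((y - t) ** 2)))      # final training MSE
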

08-11-2017 Second Mid-term Exam.
10-11-2017 Types of non-linearities used. Cost functions for regression and classification. Output activation functions used in regression and classification. Equivalence between regression with MSE and classification with CE using softmax output activations.
Neural Networks - Bishop (Chapter 6)

13-11-2017 Learning and generalization issues in neural networks. Decomposing the MSE into bias and variance. Discussion on the bias-variance tradeoff. Improving learning with regularization.
Neural Networks - Bishop (Chapter 9)

slides
14-11-2017 Assignment #5. Due on 24-11-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com.

HW5
Data For HW5
15-11-2017 L2 weight regularization, early stopping and training with added noise in the input data. Committees of neural networks. System combination methods and optimization.
Neural Networks - Bishop (Chapter 9)

slides
17-11-2017 Improving the speed of convergence of gradient descent with momentum. Convolutional neural networks. Kernels, pooling and sub-sampling. Comparison of CNNs and DNNs. Weight sharing and parameter learning.
Deep Learning - IY (Chapter 9)
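
A naive loop-based sketch of the two CNN building blocks from this lecture, valid-mode 2-D convolution and 2x2 max-pooling; the Sobel-like kernel is just an example of a weight-shared filter:

    import numpy as np

    def conv2d(img, kernel):
        # Valid-mode 2-D correlation, written with explicit loops for clarity.
        H, W = img.shape
        kh, kw = kernel.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
        return out

    def maxpool2(x):
        # Non-overlapping 2x2 max-pooling (truncates odd borders).
        H, W = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
        return x[:H, :W].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

    img = np.random.rand(28, 28)
    k = np.array([[1., 0., -1.], [2., 0., -2.], [1., 0., -1.]])  # Sobel-like
    print(maxpool2(np.maximum(conv2d(img, k), 0)).shape)         # ReLU, then pool
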

18-11-2017 Understanding the learning in deep layers of CNNs. Recurrent networks. Backpropagation through time for RNN parameter learning. Various RNN architectures - teacher forcing, sequence-to-vector and bi-directional RNNs.
Deep Learning - IY (Chapter 10)

slides
20-11-2017 Long short-term memory (LSTM) networks. Deep unsupervised learning - Restricted Boltzmann Machines (RBMs). Conditional independence in RBMs. Learning in RBMs with maximum likelihood. Positive and negative phases of the likelihood gradient. Gibbs sampling and contrastive divergence approximations.
Deep Learning - IY (Chapter 18,20)

slides
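
A minimal sketch of one CD-1 update for a binary RBM, using the factorized per-unit sigmoid conditionals that follow from the conditional-independence structure; random binary data stands in for a real batch:

    import numpy as np

    rng = np.random.default_rng(0)
    nv, nh, lr = 20, 10, 0.1
    W = rng.normal(0, 0.01, (nv, nh))
    a, b = np.zeros(nv), np.zeros(nh)           # visible / hidden biases

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    v0 = (rng.random((64, nv)) < 0.5).astype(float)   # toy data batch

    ph0 = sigmoid(v0 @ W + b)                   # p(h=1 | v0): positive phase
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + a)                 # one Gibbs step back to v
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)                   # negative phase statistics

    # CD-1 gradient approximation: data statistics minus model statistics.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    print(np.abs(W).mean())
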