CS 6830: Machine Learning

Fall 2016

Machine Learning is concerned with the design and analysis of algorithms that enable computers to automatically find patterns in data. This introductory course will give an overview of the main concepts, techniques and algorithms that are relevant for the theory and practice of machine learning. The course will cover the fundamental topics of classification, regression and clustering, starting with simple learning models such as perceptrons, decision trees and logistic regression, and ending with more advanced models including Support Vector Machines, Conditional Random Fields and Bayesian Networks. The description of the formal properties of the algorithms will be supplemented with motivating applications in a wide range of areas including natural language processing, computer vision, bioinformatics and music analysis.

Students are expected to be comfortable with programming and to exhibit a basic level of mathematical dexterity. Relevant background material in linear algebra, probability theory and information theory will be made available during the course.

- Syllabus & Introduction
- Regression with Linear Models
- An overview of gradient descent optimization algorithms, Sebastian Ruder, 2016
- Animations of Gradient Descent Algorithms, Alec Radford, 2014

- Fisher Linear Discriminant
- Perceptrons and Kernels
- Large Margin Classification Using the Perceptron Algorithm, Yoav Freund and Robert E. Schapire, Machine Learning 1999

- Support Vector Machines
- An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Nello Cristianini and John Shawe-Taylor [Available online through library.ohiou.edu]
- Support Vector Machines [Trends and Controversies], Marti Hearst, Susan Dumais, Edgar Osuna, John Platt, Bernhard Schölkopf, IEEE Intelligent Systems, 13(4), 1998
- A Tutorial on Support Vector Machines for Pattern Recognition, Christopher J. C. Burges, Data Mining and Knowledge Discovery 1998

- Nearest Neighbor Methods
- Feature Selection
- JMLR Special Issue on Variable and Feature Selection, Isabelle Guyon and Andre Elisseeff (editors), JMLR, 2003
- Gene Selection for Cancer Classification using Support Vector Machines, Guyon, Weston, Barnhill, and Vapnik, Machine Learning, v46, 2002

- Decision Trees
- Naive Bayes
- Naive Bayes and Logistic Regression, new chapter in Tom Mitchell, Machine Learning, 2005

- Logistic Regression
- A Maximum Entropy Approach to Natural Language Processing, Adam Berger, Vincent Della Pietra and Stephen A. Della Pietra, Computational Linguistics, 1996

- Hidden Markov Models
- Markov Models, chapter 9 in Chris Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, May 1999
- A tutorial on Hidden Markov Models and selected applications in speech recognition, Lawrence R. Rabiner, Proceedings of the IEEE 77 (2), 1989

- Conditional Random Fields
- An Introduction to Conditional Random Fields for Relational Learning, Charles Sutton and Andrew McCallum, 2007

- PCA, Clustering

- James H. Martin's Introduction to probabilities
- Jason Eisner's equestrian Introduction to probabilities
- Inderjit Dhillon's Linear Algebra Background
- Mike Brookes' Matrix Reference Manual
- Strang's Video Lectures on Linear Algebra
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe, Cambridge University Press 2004

- Weka Data Mining Software in Java
- scikit-learn Machine Learning in Python
- SVMlight Implementation of SVMs in C
- LIBSVM Implementation of SVMs in C++ and Java
- MALLET Java implementations of logistic regression, HMMs, linear chain CRFs, and other ML models.
- LibSVM applet demonstrating SVMs.
- k-Nearest Neighbor short animated video, by Antal van den Bosch