CS 4900/5900: Machine Learning

Fall 2017

This course will give an overview of the main concepts, techniques, and algorithms underlying the theory and practice of machine learning. The course will cover the fundamental topics of classification, regression and clustering, and a number of corresponding learning models such as perceptrons, logistic regression, linear regression, Naive Bayes, nearest neighbors, and Support Vector Machines. The description of the formal properties of the algorithms will be supplemented with motivating applications in a wide range of areas including natural language processing, computer vision, bioinformatics, and music analysis.

Students are expected to be comfortable with programming and familiar with basic concepts in linear algebra and statistics. Relevant background material in linear algebra, probability theory, and information theory will be made available during the course.

- Syllabus & Introduction
- Hand notes Aug 29, Aug 31 one, Aug 31 two.

- Linear Regression and L2 Regularization
- Hand notes Sep 7, Sep 12.
- Athens houses: training samples, test samples, and visualization code.

- Linear algebra and optimization in Python
- Gradient Descent Algorithms
- Hand notes Sep 28 one, Sep 28 two, Sep 28 three, Sep 28 four.
- An overview of gradient descent optimization algorithms, Sebastian Ruder, CoRR 2016
- Animations of Gradient Descent Algorithms, Alec Radford, 2014

- Logistic Regression, Maximum Likelihood, and Maximum Entropy
- Hand notes Oct 3 one, Oct 3 two.
- A Maximum Entropy Approach to Natural Language Processing, Adam Berger, Vincent Della Pietra and Stephen A. Della Pietra, Computational Linguistics, 1996

- Fisher Linear Discriminant
- Perceptrons and Kernels
- Large Margin Classification Using the Perceptron Algorithm, Yoav Freund and Robert E. Schapire, Machine Learning 1999
- New ranking algorithms for parsing and tagging: Kernels over Discrete Structures, and the Voted Perceptron, Michael Collins and Nigel Duffy, ACL 2002

- Support Vector Machines
- An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Nello Cristianini and John Shawe-Taylor [Available online through library.ohiou.edu]
- Support Vector Machines [Trends and Controversies], Marti Hearst, Susan Dumais, Edgar Osuna, John Platt, Bernhard Schölkopf, IEEE Intelligent Systems, 13(4), 1998
- A Tutorial on Support Vector Machines for Pattern Recognition, Christopher J. C. Burges, Data Mining and Knowledge Discovery 1998

- James H. Martin's Introduction to probabilities
- Jason Eisner's equestrian Introduction to probabilities
- Inderjit Dhillon's Linear Algebra Background
- Strang's Video Lectures on Linear Algebra
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe, Cambridge University Press 2004
- Mike Brookes' Matrix Reference Manual
- Petersen et al.'s The Matrix Cookbook

- scikit-learn Machine Learning in Python
- Weka Data Mining Software in Java
- SVMlight Implementation of SVMs in C
- LIBSVM Implementation of SVMs in C++ and Java
- MALLET Java implementations of logistic regression, HMMs, linear chain CRFs, and other ML models.
- LIBSVM applet demonstrating SVMs.
- k-Nearest Neighbor short animated video, by Antal van den Bosch