Class Meetings: Not offered this semester.
Registration: Enrollment and registration for the course are handled by Christine Keat in the Department of Statistics and Operations Research. Ms. Keat can be reached by email at email@example.com.
Instructor: Andrew B. Nobel, Department of Statistics and Operations Research
Office: Hanes 308. Email: firstname.lastname@example.org. Phone: 919-962-1352.
Office Hours: TBA
Course Prerequisites: STOR 634, 654, and 664. It will be helpful, but not necessary, if students already have some familiarity with machine learning methods for classification and linear regression.
Overview: Generally speaking, machine learning is the study and development of methods that use existing data to make decisions or predictions about new data, or that identify meaningful structure in a data set. In most cases, machine learning methods are built around models that are not tailored to the specifics of the problem at hand. Machine learning draws on basic ideas from a number of disciplines, including statistics, computer science, and optimization, and has points of contact with data mining and the study of “big data”.
Audience and Goals: STOR 767 is an intermediate graduate-level course in machine learning. The course is targeted towards PhD students in the Statistics and Operations Research (STOR) department, but may be appropriate for MS students in STOR, and for students in other departments with appropriate mathematical and statistical backgrounds. The goal of the course is to introduce students to, and provide them with working knowledge of, some of the core theory and methods in machine learning. We will discuss and explore key ideas of regularization, sparsity, and complexity, with an emphasis on topics that have a statistical component, or that are used in statistical applications. In some cases, we will look carefully at the theory supporting a method or family of methods, and provide rigorous proofs. In other cases, we will focus primarily on the method and state supporting theoretical results as needed. Classical subjects that are covered in introductory courses, such as ridge regression and logistic regression, will be treated briefly so that more time can be devoted to modern topics such as compressed sensing and graphical models.
Textbook: There is no official textbook for the course. Lectures will be based on material from a variety of sources, including on-line tutorials and surveys.
Grading: The course grade will be based on homework assignments and a final project.
Policy for Homework Assignments: Homework will be posted on the course web page, and will be due periodically throughout the semester. Assignments will be collected at the beginning of class on the day that they are due, so please be prepared to turn in your homework at that time. Each assignment will be graded. Late/missed assignments will receive a grade of zero. In computing a student’s overall homework score, the two lowest assignment scores will be dropped. Students are allowed to work on the homework assignments with other students, but each student should prepare the final version of their assignment on their own.
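As an illustration of the rule for dropping the two lowest scores, here is a minimal Python sketch. The policy does not say how the remaining scores are combined, so the averaging step, the function name `homework_score`, and the sample scores are all assumptions made for the example. It also assumes that every assignment is graded on the same scale.

```python
def homework_score(scores, n_dropped=2):
    """Average the assignment scores after dropping the lowest ones.

    Assumption: the remaining scores are averaged; the course policy only
    states that the two lowest scores are dropped.
    """
    if len(scores) <= n_dropped:
        raise ValueError("need more assignments than dropped scores")
    # Sort ascending and discard the n_dropped smallest scores.
    kept = sorted(scores)[n_dropped:]
    return sum(kept) / len(kept)


# Hypothetical scores; a late or missed assignment is recorded as 0.
scores = [90, 0, 75, 100, 60, 85]
print(homework_score(scores))  # the 0 and the 60 are dropped -> 87.5
```

In this example a single missed assignment does not affect the overall homework score, since it falls among the two dropped scores.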
Classroom Protocol: Please show up on time, as late arrivals tend to disturb those already present. Reading of newspapers and the use of laptops, tablets, and phones are not permitted during class.
Statistics: Sample vs. population quantities. Basic understanding of statistical inference, in particular, maximum likelihood estimation and the method of moments.
Real Analysis: Infima and suprema, limits, continuous functions, integration and differentiation of functions of several variables, Taylor series, convexity.
Probability: Joint and conditional densities and probability mass functions. Definition and basic properties of expectations, variance, covariance, correlation. Basic discrete and continuous distributions and their properties. Inequalities: Jensen, Markov, Chebyshev, Hoeffding, and Bernstein.
Linear Algebra: Vector spaces, dimension, subspaces, matrix addition and multiplication, determinant and trace, inverse, eigenvectors, eigenvalues, symmetric matrices, non-negative and positive definite matrices, rank.
Unsupervised vs. Supervised Learning
Close connections between learning methods and optimization problems
Importance and pervasiveness of convexity
Notions of complexity for a family of models, relations between complexity and inference
Sparsity and regularization
Overview of principal component analysis and singular value decomposition
Clustering and community detection: k-means, hierarchical clustering, spectral clustering
Convex sets and functions
Classification: Bayes risk and Bayes rule, nearest neighbor rules, classification trees
Empirical risk minimization (ERM): Rademacher complexity, shatter coefficients and VC dimension, error bounds
Support vector machines (SVM) and kernel machines
Boosting and bagging
Online learning and prediction: exponential weighting
Penalized regression and the LASSO
First order methods for minimizing smooth convex functions
Honor Code: Students are expected to adhere to the UNC honor code at all times. Violations of the honor code will be prosecuted.