STOR 767 Course Information

Class Meetings: Not offered this semester.

Registration: Enrollment and registration for the course are handled by Christine Keat in the Department of Statistics and Operations Research.  Ms. Keat can be reached by email at

Instructor:  Andrew B. Nobel, Department of Statistics and Operations Research

Office: Hanes 308   Email:    Phone: 919-962-1352.

Office Hours: TBA

Course Prerequisites:  STOR 634, 654, and 664.  It will be helpful, but not necessary, if students already have some familiarity with machine learning methods for classification and linear regression.

Overview: Generally speaking, machine learning is the study and development of methods that use existing data to make decisions or predictions about new data, or that identify meaningful structure in a data set.  In most cases, machine learning methods are built around models that are not tailored to the specifics of the problem at hand.  Machine learning draws on basic ideas from a number of disciplines, including statistics, computer science, and optimization, and has points of contact with data mining and the study of “big data”.

Audience and Goals: STOR 767 is an intermediate graduate level course in machine learning.  The course is targeted towards PhD students in the Statistics and Operations Research (STOR) department, but may be appropriate for MS students in STOR, and for students in other departments with appropriate mathematical and statistical backgrounds.  The goal of the course is to introduce students to, and provide them with working knowledge of, some of the core theory and methods in machine learning.  We will discuss and explore key ideas of regularization, sparsity, and complexity, with an emphasis on topics that have a statistical component, or that are used in statistical applications.  In some cases, we will look carefully at the theory supporting a method or family of methods, and provide rigorous proofs.  In other cases, we will focus primarily on the method and state supporting theoretical results as needed.  Classical subjects covered in introductory courses, such as ridge regression and logistic regression, will be treated briefly so that more time can be devoted to modern topics such as compressed sensing and graphical models.

Textbook:  There is no official textbook for the course.  Lectures will be based on material from a variety of sources, including on-line tutorials and surveys.

Grading: The course grade will be based on homework assignments and a final project.

Homework Assignments 50%
Final Project 50%

Policy for Homework Assignments: Homework will be posted on the course web page and will be due periodically throughout the semester.  Assignments will be collected at the beginning of class on the day they are due, so please be prepared to turn in your homework at that time.  Each assignment will be graded.  Late or missed assignments will receive a grade of zero.  In computing a student's overall homework score, the two lowest assignment scores will be dropped.  Students may work on the homework assignments with other students, but each student should prepare the final version of each assignment on their own.

Classroom Protocol: Please show up on time, as late arrivals tend to disturb those already present.  Reading newspapers and using laptops, tablets, and phones are not permitted during class.

Specific Prerequisites:

Statistics: Sample vs. population quantities.  Basic understanding of statistical inference, in particular, maximum likelihood estimation and the method of moments.

Real Analysis: Infima and suprema, limits, continuous functions, integration and differentiation of functions of several variables, Taylor series, convexity.

Probability: Joint and conditional densities and probability mass functions.   Definition and basic properties of expectations, variance, covariance, correlation.  Basic discrete and continuous distributions and their properties.  Inequalities: Jensen, Markov, Chebyshev, Hoeffding, and Bernstein.
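For reference, the first four of the named inequalities can be stated in their standard forms (Bernstein's inequality has several common variants, so only the others are sketched here):

```latex
% Jensen: for convex \varphi,
\varphi(\mathbb{E}[X]) \le \mathbb{E}[\varphi(X)]
% Markov: for X \ge 0 and t > 0,
\Pr(X \ge t) \le \mathbb{E}[X]/t
% Chebyshev: for t > 0,
\Pr\big(|X - \mathbb{E}[X]| \ge t\big) \le \operatorname{Var}(X)/t^{2}
% Hoeffding: for independent X_i \in [a_i, b_i] and S_n = \sum_{i=1}^{n} X_i,
\Pr\big(S_n - \mathbb{E}[S_n] \ge t\big)
  \le \exp\!\left( \frac{-2t^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right)
```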

Linear algebra: Vector spaces, dimension, subspaces, matrix addition and multiplication, determinant and trace, inverse, eigenvectors, eigenvalues, symmetric matrices, non-negative and positive definite matrices, rank.

Tentative Syllabus

Big Picture

Unsupervised vs. Supervised Learning

Close connections between learning methods and optimization problems

Importance and pervasiveness of convexity

Notions of complexity for a family of models, and relations between complexity and inference

Sparsity and regularization

Primary Material

Overview of principal component analysis and singular value decomposition

Clustering and community detection: k-means, hierarchical clustering, spectral clustering

Convex sets and functions

Johnson-Lindenstrauss lemma

Classification: Bayes risk and Bayes rule, nearest neighbor rules, classification trees

Empirical risk minimization (ERM): Rademacher complexity, shatter coefficients and VC dimension, error bounds

Support vector machines (SVM) and kernel machines

Boosting and bagging

Compressed sensing

Online learning and prediction: exponential weighting
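As one concrete illustration of the clustering material listed above, here is a minimal sketch of the standard Lloyd's iteration for k-means in plain NumPy.  The function name `kmeans` and the four-point toy data set are illustrative choices, not course material:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean updates."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated pairs of points; k-means should recover them.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans(X, k=2)
```

In practice one would use a library routine such as `sklearn.cluster.KMeans`, which adds careful initialization (k-means++) and multiple restarts; the sketch above shows only the basic iteration.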

Other Material

Graphical models

Variational inference

Penalized regression and the LASSO

First order methods for minimizing smooth convex functions

Stochastic approximation


Honor Code: Students are expected to adhere to the UNC honor code at all times. Violations of the honor code will be prosecuted.