Fall 2019, BU, EC 500: Introduction to Online Learning, lecture notes available on my blog
Spring 2019, BU, EC 503: Introduction to Learning from Data
Spring 2018, SBU, CSE 592: Convex Optimization
Spring 2017, SBU, CSE 512: Machine Learning
Fall 2013, TTIC 31070 (CMSC 34500 Optimization/ BUSF 36903): Convex Optimization

CSE 592: Convex Optimization

Spring 2018, Time: Tues-Thurs 1:00PM-2:20PM, Location: Javits 103

Instructor: Francesco Orabona

Office hours: Tues-Thurs, 2:30PM-3:30PM

TA: Shahira Abousamra, office hours: Tues, 11:45AM-12:45PM, NCS 140

TA: Timothy Zhang, office hours: Weds, 1:00PM-2:00PM, NCS 230

Piazza: piazza.com/stonybrook/spring2018/cse592/home

The course will cover techniques in unconstrained and constrained convex optimization and a practical introduction to convex duality. The course will focus on (1) formulating and understanding convex optimization problems and studying their properties; (2) presenting and understanding optimization approaches; (3) understanding the dual problem; and (4) stochastic optimization methods. Theoretical analysis of convergence properties of methods will be presented. Examples will be mostly from data fitting, statistics, and machine learning.

All the course material will be posted on Piazza.

Prerequisites:

Linear algebra
Multivariable calculus
Proficiency in and regular access to Python

Specific Topics:

Formalization of optimization problems
Smooth unconstrained optimization: Gradient Descent, Conjugate Gradient Descent, Newton and Quasi-Newton methods; Line Search methods
Standard formulations of constrained optimization: Linear, Quadratic, Conic and Semi-Definite Programming
KKT optimality conditions
Lagrangian duality, constraint qualification, weak and strong duality
Fenchel conjugacy and its relationship to Lagrangian duality
Multi-objective optimization
Equality constrained Newton method
Log Barrier (Central Path) methods, and Primal-Dual optimization methods
Overview of the simplex method as an example of an active set method.
Overview of the first order oracle model, subgradient and separation oracles, and the Ellipsoid algorithm.
Large-scale subgradient methods: subgradient descent, mirror descent, stochastic subgradient descent.

Text Books

The required textbook for the class is:

[CO] Boyd and Vandenberghe: Convex Optimization (Cambridge University Press 2004)

The book is available online here. About 75% of the material covered in the class can be found in the above book.

Supplemental recommended books:

Bertsekas: Nonlinear Programming, 2nd Edition (Athena Scientific 1999)
[NO] Nocedal and Wright: Numerical Optimization, 2nd Edition (Springer 2006)
Bertsekas with Nedic and Ozdaglar: Convex Analysis and Optimization (Athena Scientific 2003)
[LNMCO] Nemirovski: Lecture Notes on Modern Convex Optimization (2005)
[EMCP] Nemirovski: Efficient Methods in Convex Programming(1994/5)

Requirements and Grading

There will be roughly 4 homework assignments, counting toward 20% of the grade. Assignments must be typed in LaTex or Word (not handwritten) and submitted electronically in PDF or coded in Python.

The midterm will count 15%, the second midterm 15% and the group project 40%, and the quizzes in class 10%.

Lecture Schedule

Tentative schedule, it will be updated regularly throughout the semester.

Date	Topic	Relevant Reading	Notes
1/23/2017	1 Introduction	[CO] Sections 1.1-1.4
1/25/2017	2 Review of Linear Algebra and Multivariable Calculus	Iain Murray's crib-sheet Matrix Cookbook
1/30/2017	3 Convex Analysis	[CO] Sections 2.1-2.3, 3.1-3.2
2/01/2017	4 Gradient Descent, Exact search, Backtracking	[CO] Sections 9.1-9.3
2/06/2017	5 Pre-conditioning, Newton method	[CO] Sections 9.5-9.6	HW1 out
2/08/2017	6 Conjugate Gradient, Quasi-Newton method	[NO] Sections 5.1, 6.1
2/13/2017	7 Constrained Optimization	[CO] Sections 4.2.3,4.2.4,4.3,5.1.1-5.1.4,5.2,5.4.1,5.5.1
2/15/2017	8 Duality	[CO] Sections 5.5.2
2/20/2017	9 KKT Optimality Conditions and Fenchel Conjugate	[CO] Sections 5.5.3, 3.3, 5.1.6	HW 1 due HW 2 out
2/22/2017	10 Semi-Definite Programming	[CO] Sections 4.6, 5.9
2/27/2017	11 Equality Constrained Optimization and Interior Point methods	[CO] Sections 10.1, 10.2, 11.1, 11.2, 11.3.1, 11.5
3/01/2017	12 Class cancelled
3/06/2017	13 Summary lecture, pre mid-term		HW2 due
3/08/2017	Mid-term exam
3/13/2017	Spring break
3/15/2017	Spring break
3/20/2017	14 Feasibility problems, Phase I methods, and Primal-dual method	[CO] Sections 11.4.1, 11.3.4, 11.7	Deadline proposal group project
3/22/2017	15 Pareto Optimality and the Simplex method	[CO] Section 4.7 [NO] Chapter 13
3/27/2017	16 Center of Mass and Ellipsoid method	[EMCP] Chapters 2-3	HW3 out
3/29/2017	17 Sub-gradient descent	[LNMCO] Section 5.3.1
4/03/2017	18 Mirror Descent	[LNMCO] Sections 5.3.2, 5.4.1, 5.4.2
4/05/2017	19 Dual Averaging, Stochastic Gradient Descent	[LNMCO] Section 5.4.2 [EMCP] Section 14.1 Optional: FTRL
4/10/2017	20 Subgradients and AdaGrad	Original AdaGrad paper Coordinate-wise online optimization Subgradients	HW3 due HW4 out
4/12/2017	21 Proximal Methods and Acceleration: FISTA	FISTA paper
4/17/2017	22 ADMM	Boyd's monograph	1 page report on project
4/19/2017	23 Non-Convex Unconstrained Optimization Problems	[NO] Section 3.4 Non-convex SGD
4/24/2017	Summary		HW4 due
4/26/2017	Second Midterm
5/01/2017	Project Presentation
5/03/2017	Project Presentation

Misc

Note: If you have a physical, psychological, medical or learning disability that may impact on your ability to carry out assigned course work, please contact the staff in the Disabled Student Services office (DSS), Room 133, Humanities, 632-6748/TDD. DSS will review your concerns and determine with you what accommodations are necessary and appropriate. All information and documentation of disability are confidential.

Note: Each student must pursue his or her academic goals honestly and be personally accountable for all submitted work. Representing another person's work as your own is always wrong. Any suspected instance of academic dishonesty will be reported to the Academic Judiciary. For more comprehensive information on academic integrity, including categories of academic dishonesty, please refer to the academic judiciary website at http://www.stonybrook.edu/uaa/academicjudiciary/.

CSE 512: Machine Learning

Spring 2017, Time: Tues-Thurs 5:30PM-6:50PM, Location: Javits 101

Instructor: Francesco Orabona

Office hours: Tues-Thurs 11:00AM-12:00PM

TA: Fatemeh Almodaresi

TA office hours: Wed 10:00AM-12:00PM, Room 256 in NCS

Note: The class is currently full.

Machine Learning is centered around automated methods that improve their own performance through learning patterns in data, and then use the uncovered patterns to predict the future and make decisions. Examples include document retrieval, image classification, spam filtering, face detection, speech recognition, decision making, and robot navigation. This course covers both the theoretical and the practical side of machine learning.

All the course material will be posted on Piazza.

Prerequisites:

Basic algorithms and data structures
Linear algebra
Multivariable calculus
Probability and statistics
Proficiency in and regular access to MATLAB

Specific Topics:

Linear algebra + Probability
Learning theory
Linear predictors
Bias-variance, model selection
Perceptron
Support Vector Machines
Convexity and convex problems
Duality and kernels
Convex Learning
Regularized linear predictors
Stochastic Gradient Descent
Boosting
Clustering
Dimensionality reduction
Generative models
Nearest Neighborhood
Neural Networks, Backprop
Deep learning

Text Books

The required textbook for the class is:

[UML] Shalev-Shwartz and Ben-David. Understanding Machine Learning: From Theory to Algorithms (Cambridge University Press 2014)

The book is available online here.

Supplemental recommended books:

Requirements and Grading:

There will be roughly bi-weekly homework assignments, counting toward 50% of the grade. Assignments must be typed (not handwritten) and submitted electronically in PDF. Collaborative work on the homeworks is encouraged, but each student must eventually write up the solution independently.

The midterm will count 20% and the final exam 30%.

Lecture Schedule

Updated regularly and subject to minor tweaks throughout the semester.

Date	Topic	Relevant Reading	Notes
1/24/2017	Introduction
1/26/2017	Review of Linear Algebra and Probability	Iain Murray's crib-sheet Matrix Cookbook
1/31/2017	PAC Learning	UML Chapters 2,3
2/02/2017	Agnostic PAC Learning	UML Chapter 4
2/07/2017	Linear Regression	UML Chapter 9, Section 12.1.1	HW 1 is out
2/09/2017	No class		Snow Day
2/14/2017	Bias-Variance Trade-off and Model Selection	UML Chapter 5, Section 11.2
2/16/2017	Perceptron	UML Section 9.1.2 CML Chapter 4
2/21/2017	Support Vector Machines	UML Section 15.1 (and its subsections), 15.2, 12.3 Optional: Burges' SVM tutorial	HW 1 is due HW 2 is out
2/23/2017	Duality and Kernels	CO, Sections 5.1.1, 5.1.2, 5.1.3, 5.2, 5.2.2, 5.2.3 UML, Sections 16.1, 16.2 Optional: Burges' SVM tutorial
2/28/2017	Regularization	UML, Sections 13.1 (and subsections), 25.1.3, 26.3, 26.4
3/02/2017	Convex Optimization	CML, Sections 7.4 UML, Sections 14.1, 14.2, 14.4.1 Optional: CO, Chapter 4, Sections 9.1, 9.2, 9.3
3/07/2017	Summary lecture, pre mid-term		HW 2 is due
3/09/2017	Mid-term exam		HW 3 is out
3/14/2017	Spring break
3/16/2017	Spring break
3/21/2017	Stochastic Gradient Descent	UML, Sections 12.1.2, 12.1.3, 12.2.2, 14.1, 14.3, 14.5 Optional: Zhang's paper on SGD
3/23/2017	Multiclass Classification	UML, Sections 17.1, 17.2, 17.3 and their subsections
3/28/2017	Boosting	CML, Section 13.2 Introduction (from this book)	HW 3 is due
3/30/2017	Clustering	CML, Sections 3.4, 15.1
4/04/2017	Dimensionality Reduction	UML, Sections 23.1, 23.2
4/06/2017	Generative Models	UML, Sections 24.1, 24.2, 24.3, 24.4	HW4 is out
4/11/2017	Nearest Neighborhood and Decision Trees	UML, Sections 18.1, 18.2, 19.1, 19.3
4/13/2017	Neural Networks and Backprop	UML, Sections 20.1, 20.2, 20.3.1, 20.5, 20.6
4/18/2017	Deep Learning Overview	Optional: Deep Learning Book
4/20/2017	Online Learning 1	UML, Sections 21.1, 21.2 Optional: Online Learning survey	HW 4 is due
4/25/2017	Online Learning, second part	UML, Sections 21.3 Optional: Online Learning survey
4/27/2017	Summary
5/02/2017	Talk on Parameter-free Online Learning	Optional: NIPS'16	Kaggle ends
5/04/2017	End-term exam

Misc

TTIC 31070 (CMSC 34500 Optimization/ BUSF 36903): Convex Optimization

This is a webpage for the Fall 2013 course at TTIC and the University of Chicago.

Course Description

Mondays and Thursday 3:00pm-4:20pm at TTIC 530 (located at 6045 S. Kenwood Ave, fifth floor)

Instructor: Francesco Orabona

Additional Lecturer: Nati Srebro (first 3 lessons)

TA: Zhiyong Wang

Mailing list of the course: The course will cover techniques in unconstrained and constrained convex optimization and a practical introduction to convex duality. The course will focus on (1) formulating and understanding convex optimization problems and studying their properties; (2) presenting and understanding optimization approaches; and (3) understanding the dual problem. Limited theoretical analysis of convergence properties of methods will be presented. Examples will be mostly from data fitting, statistics and machine learning.

Specific Topics:

Formalization of optimization problems
Smooth unconstrained optimization: Gradient Descent, Conjugate Gradient Descent, Newton and Quasi-Newton methods; Line Search methods
Standard formulations of constrained optimization: Linear, Quadratic, Conic and Semi-Definite Programming
KKT optimality conditions
Lagrangian duality, constraint qualification, weak and strong duality
Fenchel conjugacy and its relationship to Lagrangian duality
Multi-objective optimization
Equality constrained Newton method
Log Barrier (Central Path) methods, and Primal-Dual optimization methods
Overview of the simplex method as an example of an active set method.
Overview of the first order oracle model, subgradient and separation oracles, and the Ellipsoid algorithm.
Large-scale subgradient methods: subgradient descent, mirror descent, stochastic subgradient descent.

Text Books

The required textbook for the class is:

Boyd and Vandenberghe: Convex Optimization (Cambridge University Press 2004)

The book is available online here. About 75% of the material covered in the class can be found in the above book.

Supplemental recommended books:

Bertsekas: Nonlinear Programming, 2nd Edition (Athena Scientific 1999)
Nocedal and Wright: Numerical Optimization, 2nd Edition (Springer 2006)
Bertsekas with Nedic and Ozdaglar: Convex Analysis and Optimization (Athena Scientific 2003)
Nemirovski: Lecture Notes on Modern Convex Optimization (2005)
Nemirovski: Efficient Methods in Convex Programming(1994/5)

Requirements and Grading:

There will be roughly bi-weekly homework assignments, counting toward 30% of the grade. Assignments must be typed (not handwritten) and submitted electronically in PDF. Collaborative work on the homeworks is encouraged, but each student must eventually write up the solution independently.

There will also be several MATLAB programming and experimentation assignments, counting toward another 20% of the grade.

The remaining 50% of the grade will be based on a final exam.

Lectures and Required Reading:

Lecture I: Monday September 30th

Intro to course
Formulation of optimization problems
Feasibility, optimality, sub-optimality

Boyd and Vandenberghe Sections 1.1-1.4

Lecture II: Thursday October 3rd

Affine & convex sets, and operations that preserve convex sets
Affine & convex Functions
Separating and supporting hyperplanes, halfspaces, polyhedra
Notions of separation
Alpha-sublevel sets, epigraphs

Boyd and Vandenberghe Sections 2.1-2.3, 2.5, 3.1-3.2

Lecture III: Monday October 6th

Unconstrained optimization: descent methods; descent direction
Gradient descent
Line search: exact line search and backtracking line search
Analysis of gradient descent with exact and backtracking line search
Problems with badly conditioned functions; preconditioning
Newton's method

Boyd and Vandenberghe Sections 9.1-9.3, 9.5

Lecture IV: Thursday October 10th

Analysis of Newton's method
Self-concordance analysis of Newton's method
Conjugate gradient descent
Quasi-Newton methods: BFGS
Summary of unconstrained optimization

Boyd and Vandenberghe Sections 9.5-9.6, Nocedal and Wright Sections 5.1, 6.1

Lecture V: Monday October 14th

Constrained Optimization: Formulation and Optimality Condition
Linear Programming
The Lagrangian, the dual problem, weak duality
Slater's condition and strong duality
Summary of unconstrained optimization
Dual solutions as certificates for sub-optimality and infeasibility

Boyd and Vandenberghe Sections 4.2,4.3,5.1,5.2,5.4.1,5.5.1

Lecture VI: Thursday October 17th

Geometric interpretation of duality; Slater's condition
Dual of a linear program
Dual of a non-convex problem: max-cut clustering
Dual of least-squares solution to under-determined linear system
Least-squares solution: recovering the primal optimum from the dual optimum
Complimentary slackness

Boyd and Vandenberghe Sections 5.3.1, 5.5.2

Lecture VII: Monday October 21st

Example: Max-flow/min-cut
KKT conditions
Examples: water-filing, large-margin linear discrimination.
Expressing the dual in terms of Fenchel conjugates

Boyd and Vandenberghe Sections 5.5.3, 3.3, 5.1.6, 5.7.1

Lecture VIII: Thursday October 24th

Example: Logistic regression
Example: Minimum volume covering ellipsoid and the log-det function
Matrix inequalities, semi-definite constraints, and semi-definite programming
Examples: covariance estimation, fastest mixing Markov chain
Lagrangian duality for semi-definite constraints
Complimentary slackness and KKT optimality conditions for semi-definite constraints
Norm-constrained matrix factorization as an example of an SDP and its dual

Boyd and Vandenberghe Sections 4.6, 5.9

Lecture IX: Monday October 28th

Equality constrained Newton's method
Optimization of problems with implicit constraints
Log-barrier problems and the central path interior point method
Analysis of the log-barrier central path method
Log-barrier method for semi-definite constraints

Boyd and Vandenberghe Sections 10.1, 10.2, 11.1, 11.2, 11.3.1, 11.5, 11.6

Lecture X: Thursday October 31st

Feasibility problems and Phase I methods
Reducing Optimization to feasibility
Log-barrier and weakened KKT conditions
Primal-Dual Interior Point Methods, including infeasible start
Multi-objective problems and Pareto optimality

Boyd and Vandenberghe Sections 11.4.1, 11.4.3, 10.3.4, 11.7, 4.7

Lecture XI: Monday November 4th

Scalarization of multi-objective problems
The Simplex method

Boyd and Vandenberghe Sections 11.4.1, Nocedal and Wright Chapter 13

Lecture XII: Thursday November 7th

The first order oracle model: subgradients and seperation oracles
Classical first order methods: center-of-mass method, oracle lower bound, ellipsoid method

Chapters 1-3 of Nemirovksi's "Efficient Methods in Convex Programming"

Lecture XIII: Monday November 11th

Review and limitations of methods with polynomial convergence.
Large-scale, slowly converging, first order methods.
Sub-gradient descent with projection, step size and analysis for Lipschitz functions over a bounded domain.

Section 5.3.1 of Nemirovksi's "Lectures on Modern Convex Optimization"

Lecture XIV: Thursday November 14th

Gradient descent as a proximal point method.
From gradient descent to bundle methods.
Mirror Descent formulation and basic concepts: strong convexity with respect to a norm, dual norm, distance generating function and Bergman divergence
Analysis Mirror Descent

Sections 5.3.2 and 5.4.2 of Nemirovksi's "Lectures on Modern Convex Optimization"

Lecture XV: Monday November 18th

Partial linarization in Mirror Descent
More about sub-gradient descent: decreasing step-size
Primal-dual analysis of Mirror Descent: Dual Averaging
Choice of the distance generating function

Lecture XVI: Thursday November 21st

Stochastic Gradient Descent and Stochastic Optimization.
Overview of further results for first order methods: faster rates with strong convexity, faster rates with smoothness, acceleration with smoothness.

Section 14.1 of Nemirovksi's "Efficient Methods in Convex Programming"

Lecture XVII: Monday December 2nd

Course Summary

Assignments:

Written Assignment 1 (due October 21st)
Programming Assignment 1 (source code) (due October 28th)
Written Assignment 2 (due November 4th)
Written Assignment 3 (due November 11th)
Programming Assignment 2 (source code) (due November 20th)
Written Assignment 4 (due November 29th)