Taught by Patrick Hebron at ITP, Fall 2015
Overview
This half-semester course introduces machine learning, a complex and quickly evolving subject that deserves far more intensive study. The goal of the course is to open a preliminary investigation of the conceptual and technical workings of a few key machine learning models, their underlying mathematics, their application to real-world problems and their philosophical value in understanding the general phenomena of learning and experience.
Primary Sources
Before an advancement in machine learning is distilled into textbooks, tutorials, blogs and open-source implementations, it is generally introduced in the form of an academic research paper. Many of these papers can be found at arXiv and the other sites listed in the Academic Research Tools section below. These documents are not easy to read: they often describe ideas using mathematical nomenclature and assume that the reader is already familiar with the subject. Yet these research papers are the best way to access the current cutting edge of machine learning. For this reason, it is important to become familiar with the format and to learn to decipher its contents. To aid this process, we will read and discuss a primary source research paper each week. The primary source readings are labeled as such in the syllabus below.
Resources
Required Text:
- Anderson, Britt. Computational Neuroscience and Cognitive Modelling: A Student's Introduction to Methods and Procedures. Los Angeles: SAGE, 2014.
Python Installation Resources:
Python Resources:
- Python Tutorials
- Python Visualizer
- Python for Programmers
- Introduction to NumPy
- NumPy Tutorial
- Matplotlib Examples
Math for Machine Learning:
- Some Basic Mathematics for Machine Learning by Iain Murray and Angela J. Yu
- Math for Machine Learning by Hal Daumé III
- Machine Learning Math Essentials Part I by Jeff Howbert
- Machine Learning Math Essentials Part II by Jeff Howbert
- Immersive Linear Algebra by J. Ström, K. Åström, and T. Akenine-Möller
- Linear Algebra by Khan Academy
- Probability and Statistics by Khan Academy
- Differential Calculus by Khan Academy
Academic Research Tools:
Going Further:
- Deep Learning Tutorials
- Awesome Deep Learning Resources
- Deep Learning Frameworks
- Deep Learning: An MIT Press book in preparation by Yoshua Bengio, Ian Goodfellow and Aaron Courville
- Coursera: Neural Networks for Machine Learning by Geoffrey Hinton
- Coursera: Machine Learning by Andrew Ng
Syllabus
Week 1:
Class:
- Introductions
- Discussion:
- What is learning?
- Philosophical Framework:
- Observing the Rules of Chess from Richard Feynman, The Pleasure of Finding Things Out (27:15-30:10)
- Experience as Data Visualization
- A Kantian Analogy:
- Rationalism vs. Empiricism
- Noumena and Phenomena
- Time, Space and Causality
- Categories of Machine Learning (Overview):
- Supervised Learning
- Semi-Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Machine Memorization vs. Machine Learning:
- Logical Truth Tables (AND, OR, XOR, etc.)
- Lookup Tables
- Encoding schemes:
- Run-length encoding (RLE)
- Getting Started in Python:
- Installation
- Variables
- Lists
- Loops
- Conditionals
- Functions
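To tie the getting-started topics together, here is a small sketch that exercises variables, lists, loops, conditionals and functions in one place. The function and variable names are illustrative, not from any class materials:

```python
# A minimal tour of the Python features covered in class:
# variables, lists, loops, conditionals and functions.

def count_longer_than(words, length):
    """Return how many strings in `words` are longer than `length`."""
    count = 0                      # a variable
    for word in words:             # a loop over a list
        if len(word) > length:     # a conditional
            count += 1
    return count                   # the function's result

animals = ["cat", "horse", "ox", "giraffe"]
print(count_longer_than(animals, 3))  # prints 2
```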
Homework:
Assignment:
- Implement run-length encoding in Python. A decoder implementation is optional, but give it a shot.
Readings:
- The Nature of Code by Daniel Shiffman, Chapter 1: "Vectors" and Chapter 6.7: "The Dot Product"
- Computational Neuroscience and Cognitive Modelling, Chapter 9: "Neural Network Mathematics: Vectors and Matrices"
- Primary Source: Intelligent Machinery by Alan Turing
Optional:
- The History of Machine Learning from the Inside Out: A discussion with Geoffrey Hinton, Yoshua Bengio and Yann LeCun
- Building Smarter Machines: New York Times interactive on the history of machine learning
Week 2:
Class:
- Discussion and Questions:
- Run-length encoding homework
- Primary Source reading
- A very brief tour of two related studies:
- Fuzzy Logic
- Graph Theory
- Dimensions from 1 to N:
- Spatial dimensions
- Abstract dimensions
- The Pythagorean theorem and Euclidean distance
- A few applications of distance metrics:
- Gesture recognizers
- Recommendation engines
- Linear Algebra Primer (Part 1):
- Vector definition, notation, properties and common operations
- Working with vectors in Python and NumPy
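As a quick sketch of the vector operations covered above, the snippet below uses NumPy to compute elementwise arithmetic, the dot product, and Euclidean distance (the Pythagorean theorem generalized to N dimensions). The particular values are arbitrary examples:

```python
import numpy as np

# Two 3-dimensional vectors.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Elementwise addition and scalar multiplication.
print(a + b)         # [5. 7. 9.]
print(2 * a)         # [2. 4. 6.]

# Dot product: the sum of elementwise products.
print(np.dot(a, b))  # 1*4 + 2*5 + 3*6 = 32.0

# Euclidean distance between the two points.
print(np.linalg.norm(a - b))  # sqrt(3^2 + 3^2 + 3^2)
```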
Homework:
Assignment:
- Implement a simple recommendation engine in Python. The Programming Collective Intelligence chapter listed below provides a complete solution, but please use this code only as a reference. Try implementing your recommendation engine with NumPy's vector tools.
- Spend some time experimenting with vectors in Python and NumPy. Try out the operations we've discussed: plug in different values and vector dimensions, consider the results, tweak the values and repeat. The more you play with these operations, the more intuitive they will become.
Readings:
- Programming Collective Intelligence by Toby Segaran, Chapter 2: "Making Recommendations" (Requires NYU ID for online access)
- Primary Source: The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain by Frank Rosenblatt
Week 3:
Class:
- Discussion and Questions:
- Recommendation engine homework
- Primary Source reading
- Getting Started with Plotting in Python and Matplotlib
- Learning to draw a line in the sand:
- A brief look at k-means clustering:
- Classification as spatial partitioning
- Perceptrons:
- Historical overview
- The Perceptron Model
- Biological and electrical analogies
- Activation Functions
- Biases
- Training algorithm
- Limitations of the Perceptron model:
- The XOR problem
- Linear separability
- Discussion:
- What do these limitations tell us about the nature of learning?
- What conceptual alterations to the model might help to address these limitations?
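The training algorithm and the linear-separability limitation discussed above can both be seen in a minimal sketch like the one below: a single perceptron with a step activation and the classic error-correction update. It learns AND (which is linearly separable) but the same loop will never converge on XOR. Function names and hyperparameters here are illustrative choices, not a reference implementation:

```python
import numpy as np

def train_perceptron(inputs, targets, lr=0.1, epochs=50):
    """Train a single perceptron with a step activation; returns (weights, bias)."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=inputs.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            y = 1.0 if np.dot(w, x) + b > 0 else 0.0  # step activation
            w += lr * (t - y) * x                      # error-driven weight update
            b += lr * (t - y)                          # bias update
    return w, b

def predict(w, b, x):
    return 1.0 if np.dot(w, x) + b > 0 else 0.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
AND = np.array([0, 0, 0, 1], dtype=float)

w, b = train_perceptron(X, AND)
print([predict(w, b, x) for x in X])  # [0.0, 0.0, 0.0, 1.0]

# Try targets [0, 1, 1, 0] (XOR): no single line separates the classes,
# so this loop cannot converge -- the limitation Minsky and Papert identified.
```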
Homework:
Assignment:
- Implement a Perceptron in Python.
Readings:
- The Nature of Code by Daniel Shiffman, Chapter 10: "Neural Networks"
- Computational Neuroscience and Cognitive Modelling, Chapter 11: "An Introduction to Neural Networks"
- Primary Source: Learning Internal Representations by Error Propagation by David Rumelhart, Geoffrey Hinton and Ronald Williams
Week 4:
Class:
- Discussion and Questions:
- Perceptron homework
- Primary Source reading
- Linear Algebra Primer (Part 2):
- Matrix definition, notation, properties and common operations
- Working with matrices in Python and NumPy
- Calculus Primer:
- Zeno's Paradox of the Tortoise and Achilles
- Integrals
- Derivatives
- Numerical Differentiation
- Partial Derivatives
- Multilayer Perceptrons:
- Learning as error correction
- Multilayer Perceptron Model
- Backpropagation Algorithm
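The calculus primer's idea of numerical differentiation gives a handy sanity check for the derivatives that backpropagation relies on. The sketch below (a central-difference approximation, with an arbitrary test point) compares a numerical estimate of the sigmoid's derivative against its known closed form, sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)):

```python
import numpy as np

def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = 0.5
approx = numerical_derivative(sigmoid, x)
exact = sigmoid(x) * (1 - sigmoid(x))   # closed-form derivative
print(abs(approx - exact))              # tiny: the two agree closely
```

The same trick is useful for gradient-checking a backpropagation implementation: perturb one weight at a time and compare the numerical slope against the analytic gradient.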
Homework:
Assignment:
- This week, I will provide an "unrolled" Multilayer Perceptron implementation in Python. This code implements a three-layer MLP (input, output and one hidden layer) in an easy-to-read format, but does not attempt to provide a generalized architecture for adding additional hidden layers, etc. Using this code as a starting point, please implement a more generalized solution. Your code should allow the user to specify the size of each layer and the number of hidden layers as well as provide a clear API for training and testing the model on user-provided datasets.
- Code: mlp_unrolled.py
Readings:
- Computational Neuroscience and Cognitive Modelling, Chapter 2: "What Is a Differential Equation?"
- Computational Neuroscience and Cognitive Modelling, Chapter 3: "Numerical Application of a Differential Equation"
- Primary Source: Reducing the Dimensionality of Data with Neural Networks by Geoffrey Hinton and R. R. Salakhutdinov
Optional:
- How to Multiply Matrices
- A Step by Step Backpropagation Example
- Calculus on Computational Graphs: Backpropagation
- Neural Networks Demystified
- Learning: Neural Nets, Back Propagation
Week 5:
Class:
- Discussion and Questions:
- Multilayer Perceptron homework
- Primary Source reading
- General Methodology for Working with Neural Networks:
- Identifying a "learning problem"
- Developing a network architecture
- Developing the training procedure
- Validating the learning model
- Working with Data:
- Activation functions and numeric domains
- Regression & Classification (Real-valued and Indexical Data)
- One-hot and One-cold encodings
- Slicing and time windows
- Splitting datasets for training and testing
- Choosing representative datasets
- Practical limitations in supervised learning
- Index, Icon and Symbol:
- The abstract representation of ideas
- Charles Peirce's Theory of Signs
- Unsupervised Learning:
- Conceptual overview
- Learning as imitation
- Autoencoders and emergent encoding schemes
- Applications of dimensionality reduction
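Two of the data-preparation steps above, one-hot encoding and splitting a dataset for training and testing, can be sketched in a few lines of NumPy. The function names and the 80/20 split fraction are illustrative choices:

```python
import numpy as np

def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot row vectors."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def split_dataset(data, train_fraction=0.8, seed=0):
    """Shuffle a dataset and split it into training and testing portions."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(data))
    cut = int(len(data) * train_fraction)
    return data[indices[:cut]], data[indices[cut:]]

labels = np.array([0, 2, 1, 2])
print(one_hot(labels, 3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
```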
Homework:
Assignment:
- Using your Multilayer Perceptron implementation from last week, identify a learning problem and dataset that is of interest to you, determine how to integrate this data with the MLP implementation and write any code necessary to train and test the MLP on your data. The UC Irvine Machine Learning Repository contains a number of interesting sample datasets to get you started. The Awesome Deep Learning resource list also links to a wide assortment of datasets.
Readings:
- A Beginner’s Guide to Restricted Boltzmann Machines
- Primary Source: To Recognize Shapes, First Learn to Generate Images by Geoffrey Hinton
Optional:
- Restricted Boltzmann Machines - Definition
- Deep Learning, Self-Taught Learning and Unsupervised Feature Learning
- Demystifying Unsupervised Feature Learning
- A Fast Learning Algorithm for Deep Belief Nets by Geoffrey Hinton and Simon Osindero
- A Practical Guide to Training Restricted Boltzmann Machines by Geoffrey Hinton
Week 6:
Class:
- Discussion and Questions:
- Multilayer Perceptron applications homework
- Primary Source reading
- Unsupervised Learning:
- Conceptual Overview
- Restricted Boltzmann Machines:
- Architectural overview
- Persistent Contrastive Divergence
- Contrastive Divergence
- Implementation:
- Deep Belief Networks:
- Architectural overview
- Integrating supervised and unsupervised learning
- Discussion:
- Phenomenological implications of Deep Learning
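To make the RBM architecture and contrastive divergence concrete, here is a toy sketch: a binary RBM trained with one step of contrastive divergence (CD-1) on a trivially patterned dataset. This is a pedagogical simplification under assumed hyperparameters, not one of the reference implementations linked above; see Hinton's practical guide for the many details it omits:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """A toy binary Restricted Boltzmann Machine trained with CD-1."""
    def __init__(self, n_visible, n_hidden):
        self.W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def contrastive_divergence(self, v0, lr=0.1):
        """One CD-1 update on a batch of binary visible vectors."""
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        # Sample binary hidden states, then reconstruct the visibles.
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h_sample)
        # Negative phase: hidden activations driven by the reconstruction.
        h1 = self.hidden_probs(v1)
        # Nudge the weights toward the data statistics, away from the model's.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return np.mean((v0 - v1) ** 2)   # reconstruction error

# Train on a tiny dataset of two repeated patterns.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)
rbm = RBM(n_visible=4, n_hidden=2)
errors = [rbm.contrastive_divergence(data) for _ in range(500)]
print(np.mean(errors[-10:]) < np.mean(errors[:10]))  # error trends downward
```

Stacking such layers, each trained on the hidden activities of the one below, is the greedy pretraining idea behind Deep Belief Networks.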
Homework:
Assignment:
- Final assignment: Identify an applied learning problem that is of interest to you. Consider the problem, what sort of learning algorithm you would use to address it, what auxiliary tools would be needed and so forth. Try to identify challenges, limitations, etc. Also, research whether and how others have tried to solve a similar problem. What tools did they use? Put together a brief presentation that describes your idea and outlines what further research and implementation work would be required to realize it. Alternately, you can implement a solution yourself, though this is not required.
Readings:
- Primary Source: Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean.
Week 7:
Class:
- Discussion and Questions:
- Primary Source reading
- Discussion of other learning models and their applications:
- Word embeddings (word2vec)
- Recurrent networks
- Convolutional networks
- Generative applications
- Student Presentations
- Follow-up Discussion:
- What is learning?
- Final Thoughts