This course establishes a basic understanding of supervised learning and Bayesian classifiers using the histogram as a starting point. It then covers the design and application of practically useful classifiers such as k-nearest neighbors, linear machines and decision trees. You will also learn concepts in unsupervised learning and clustering algorithms such as expectation maximization and k-means clustering. The course concludes with the application of neural networks in machine learning.
The course uses examples to guide you through foundational concepts, often employing live algorithms to facilitate visual understanding. Pseudocode is provided for most of the algorithms covered, and you are encouraged to use it as a reference when writing your own programs in any language you choose. In-class quizzes and group activities, including discussion, are used to gauge learning. Homework assignments are designed for in-depth practice.
- Histograms and Bayesian classifiers
- Principal component analysis
- Linear classifiers and regression
- Classifier performance evaluation
- Expectation maximization algorithm
- K-means algorithm
- Hidden Markov models
- Ensemble learning and decision trees
- Neural networks
* Skills Needed: Moderate programming ability in a language such as Python, R, C++, Java, or Matlab; an elementary understanding of probability and statistics; and familiarity with basic programming constructs such as variables, arrays and array element access, arithmetic, logic, branching, looping, strings, input/output, functions, and visualization.
If you are new to programming, it is recommended that you learn the fundamentals of programming in Mathematica, Matlab, or Python before enrolling in this course. A great free resource for learning Python can be found by searching for "Microsoft edX Introduction to Python for Data Science".