Machine learning and data mining are at the center of a powerful movement. Many industries depend on practitioners of machine learning to create products that parse, reduce, simplify and categorize data, and then extract actionable intelligence from that data. Professionals who are familiar with machine learning, a key technology driving Big Data, secure a competitive edge in exciting careers in the data sciences. In this course, you will learn machine learning concepts, terms and methodology, and gain an intuitive understanding of the mathematics underlying it by building actual applications. The machine learning algorithms you’ll learn can be used in real-world applications such as search engines, image analysis, bioinformatics, industrial automation, speech recognition, and more.
This course establishes a basic understanding of supervised learning and Bayesian classifiers using the histogram as a starting point. It then covers the design and application of practically useful classifiers such as k-nearest neighbors, linear machines and decision trees. You will also learn concepts in unsupervised learning and clustering algorithms such as expectation maximization and k-means clustering. The course concludes with the application of neural networks in machine learning.
The course uses examples to guide you through foundational concepts, often employing live algorithms to facilitate visual understanding. Pseudocode is provided for most of the algorithms covered, and you are encouraged to use it as a reference when writing your own programs in any language you choose. In-class quizzes gauge learning, and group activities include discussion. Homework assignments are designed for in-depth practice.
- Histograms and Bayesian classifiers
- Principal component analysis
- Linear classifiers and regression
- Classifier performance evaluation
- Expectation maximization algorithm
- k-means algorithm
- Hidden Markov models
- Ensemble learning and decision trees
- Neural networks
* Skills Needed: Moderate programming ability in a language such as Python, R, C++, Java, or MATLAB. Elementary understanding of probability and statistics. Familiarity with basic programming constructs such as variables, arrays, accessing elements in arrays, arithmetic, logic, branching, looping, strings, input/output, functions, and visualization.
If you are new to programming, it is recommended that you learn the fundamentals of programming in Mathematica, MATLAB, or Python before enrolling in this course. A great free resource for learning Python can be found by searching "Microsoft edX Introduction to Python for Data Science".