Machine Learning Capstone

Description

In this Machine Learning Capstone course, you will use Python-based machine learning libraries such as pandas, scikit-learn, and TensorFlow/Keras to:

• build a course recommender system,
• analyze course-related datasets, calculate cosine similarity, and create a similarity matrix,
• create recommendation systems by applying your knowledge of KNN, PCA, and collaborative filtering based on non-negative matrix factorization,
• build similarity-based recommender systems,
• predict course ratings by training a neural network and constructing regression and classification models,
• build a Streamlit app that displays your work, and
• share your work and then evaluate your peers’ work.

What you will learn

Capstone Overview

In this module, you will be introduced to the idea of recommender systems in the first video. All labs in subsequent modules are based on this concept. You will also be provided with an overview of the capstone project. In the last two exercises, you will obtain an IBM Cloud feature code and use that code to create an IBM Watson Studio account.

Exploratory Data Analysis and Feature Engineering

In module 2, you will perform exploratory data analysis to find preliminary insights such as data patterns. You will also use it to check assumptions with the help of summary statistics and graphical representations of online course-related datasets such as course titles, course genres, and course enrollments. Next, you will extract a word-count vector called a “bag of words” (BoW) from course titles and descriptions. The BoW feature is one of the simplest yet most effective features for characterizing textual data, and it is widely used in textual machine learning tasks. Finally, you will apply the cosine similarity measurement to calculate course similarity from the extracted BoW feature vectors.
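To make the BoW and cosine similarity steps concrete, here is a minimal sketch in plain Python. The course IDs and titles are invented for illustration, and the helper names `bag_of_words` and `cosine_similarity` are hypothetical; the labs themselves may rely on library routines instead.

```python
import math
from collections import Counter

# Hypothetical course titles, invented for illustration only.
titles = {
    "ML101": "introduction to machine learning",
    "ML201": "machine learning with python",
    "DB101": "introduction to databases",
}

def bag_of_words(text):
    """Count how often each lowercase, whitespace-separated token occurs."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty title shares nothing with any other title.
    return dot / (norm_a * norm_b)

# One BoW vector per course.
bows = {course_id: bag_of_words(title) for course_id, title in titles.items()}

# Pairwise similarity matrix stored as a dict keyed by (course, course).
similarity = {
    (i, j): cosine_similarity(bows[i], bows[j])
    for i in bows
    for j in bows
}

for (i, j), value in similarity.items():
    if i < j:
        print(f"{i} vs {j}: {value:.3f}")
```

Titles that share more words score closer to 1, and titles with no words in common score 0.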

Unsupervised-Learning Based Recommender System

In module 3, you will create three course recommendation systems using different methods. In lab 1, you will create a course recommendation system based on user profile and course genre matrices by computing an interest score for each course and recommending the courses with the highest interest scores. In lab 2, you will generate a course similarity matrix to create the recommendation system. In lab 3, you will implement a clustering-based recommender system using K-means clustering and principal component analysis (PCA), based on group members’ course enrollment history. In labs 4 and 5, you will use collaborative filtering to predict a user’s interests from the preferences of similar users. In lab 4, you will perform KNN-based collaborative filtering, and in lab 5, you will use non-negative matrix factorization.
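The interest-score idea from lab 1 can be sketched as a dot product between a user's profile vector and each course's genre vector. Everything below is an assumption made for illustration: the genre names, the profile weights, the course IDs, and the helper `interest_score` are hypothetical and not taken from the lab data.

```python
# Hypothetical genre vocabulary; each position in a vector refers to one genre.
genres = ["python", "machine learning", "database"]

# Hypothetical course genre matrix: 1 means the course belongs to the genre.
course_genres = {
    "ML101": [0, 1, 0],
    "ML201": [1, 1, 0],
    "DB101": [0, 0, 1],
}

# Hypothetical user profile: a weight per genre reflecting the user's interest.
user_profile = [3.0, 2.0, 0.0]

# Courses the user has already taken are not recommended again.
enrolled = {"ML101"}

def interest_score(profile, genre_vector):
    """Dot product of the user's genre weights and a course's genre flags."""
    return sum(weight * flag for weight, flag in zip(profile, genre_vector))

# Score every course the user has not taken yet.
scores = {
    course_id: interest_score(user_profile, vector)
    for course_id, vector in course_genres.items()
    if course_id not in enrolled
}

# Recommend the highest-scoring courses first.
top_n = 2
recommendations = sorted(scores, key=scores.get, reverse=True)[:top_n]
print(recommendations)
```

In practice a score threshold could replace or complement `top_n`, so that courses with little overlap with the user's interests are left out.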

Supervised-Learning Based Recommender Systems

In this module, you will predict course ratings using neural networks. In lab 1, you will train neural networks to predict course ratings while simultaneously extracting users’ and items’ latent features. In lab 2, you will be given course interaction feature vectors as input data. Using regression analysis, you will calculate numerical rating scores that predict whether a student will audit or complete a course. Lab 3 is similar to lab 2, but you will use a classification model instead of regression. You will extract user and item embedding feature vectors from a neural network. With those embedding feature vectors, you will create an interaction feature vector and use it to build a classification model. The model maps the interaction feature vector to a rating mode that predicts whether a learner will audit or complete a course.
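The lab 3 workflow of turning embeddings into interaction features and then classifying them can be sketched as follows. This is an illustration under stated assumptions rather than the lab's method: the embeddings are hand-written stand-ins for vectors that would come from a trained network, the element-wise product is one assumed way of combining them, and the small logistic regression is a stand-in for whichever classifier the lab uses.

```python
import math

# Hypothetical embeddings standing in for vectors extracted from a trained network.
user_emb = {"u1": [0.9, 0.1], "u2": [0.1, 0.8]}
item_emb = {"c1": [0.8, 0.2], "c2": [0.2, 0.9]}

# Hypothetical labelled interactions: 1 = complete, 0 = audit.
interactions = [("u1", "c1", 1), ("u1", "c2", 0), ("u2", "c1", 0), ("u2", "c2", 1)]

def interaction_features(user, item):
    """Combine two embeddings element-wise (an assumed choice for this sketch)."""
    return [u * i for u, i in zip(user_emb[user], item_emb[item])]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=1.0, epochs=10000):
    """Fit weights and a bias by batch gradient descent on the mean log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, target in zip(X, y):
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Build the training set from the labelled interactions.
X = [interaction_features(user, item) for user, item, _ in interactions]
y = [label for _, _, label in interactions]
w, b = train_logistic_regression(X, y)

def predict(user, item):
    """Return 1 (complete) or 0 (audit) for a user-course pair."""
    x = interaction_features(user, item)
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)

predictions = [predict(user, item) for user, item, _ in interactions]
print(predictions)
```

Replacing the 0/1 labels with numerical rating scores and the classifier with a regression model gives the corresponding lab 2 setup.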