Data Analytics Foundations for Accountancy II

Description

Welcome to Data Analytics Foundations for Accountancy II! I’m excited to have you in the class and look forward to your contributions to the learning community.

To begin, I recommend taking a few minutes to explore the course site. Review the material we’ll cover each week, and preview the assignments you’ll need to complete to pass the course. Click Discussions to see forums where you can discuss the course material with fellow students taking the class.
If you have questions about course content, please post them in the forums to get help from others in the course community. For technical problems with the Coursera platform, visit the Learner Help Center.
Good luck as you get started, and I hope you enjoy the course!

What you will learn

Course Orientation

You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.

Module 1: Introduction to Machine Learning

This module lays the groundwork for the rest of the course by introducing the basic concepts behind machine learning and, specifically, how to perform machine learning by using Python and the scikit-learn machine learning library. First, you will learn how machine learning and artificial intelligence are disrupting businesses. Next, you will learn about the basic types of machine learning and how to leverage these algorithms in a Python script. Third, you will learn how linear regression can be considered a machine learning problem with parameters that must be determined computationally by minimizing a cost function. Finally, you will learn about neighbor-based algorithms, including the k-nearest neighbor algorithm, which can be used for both classification and regression tasks.
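As a rough illustration of treating linear regression as a cost-minimization problem, the sketch below fits a line by gradient descent on the mean squared error. It is not taken from the course materials: the toy data, the function names `cost` and `fit`, and the learning rate and step count are all assumptions chosen for this example.

```python
# Toy data generated from y = 2x + 1 (made-up values for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def cost(slope, intercept):
    """Mean squared error of the line y = slope * x + intercept on the toy data."""
    n = len(xs)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n


def fit(learning_rate=0.05, steps=5000):
    """Find the slope and intercept by gradient descent on the cost function."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every data point under the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        # Step downhill along the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


slope, intercept = fit()
print(round(slope, 3), round(intercept, 3), cost(slope, intercept))
```

In practice a library such as scikit-learn would determine these parameters for you; the loop is written out here only to make the "minimizing a cost function" idea concrete.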

Module 2: Fundamental Algorithms

This module introduces several of the most important machine learning algorithms: logistic regression, decision trees, and support vector machines. Of these three algorithms, the first, logistic regression, is a classification algorithm (despite its name). The other two, however, can be used for either classification or regression tasks. Thus, this module will dive deeper into the concept of machine classification, where algorithms learn from existing, labeled data to classify new, unseen data into specific categories, and the concept of machine regression, where algorithms learn a model from data to make numerical predictions for new, unseen data. While these algorithms all differ in their mathematical underpinnings, they are often used for classifying numerical, text, and image data or performing regression in a variety of domains. This module will also review different techniques for quantifying the performance of classification and regression algorithms and how to deal with imbalanced training data.
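A minimal sketch of how the three algorithms might be compared on imbalanced data with scikit-learn follows. The synthetic data set, the 80/20 class ratio, the choice of the F1 score as the metric, and the use of `class_weight="balanced"` are assumptions made for this example, not details from the course.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced data: roughly 80% of the samples belong to class 0.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)

# stratify=y keeps the class ratio the same in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

# class_weight="balanced" reweights samples so the rare class counts for more.
models = {
    "logistic regression": LogisticRegression(class_weight="balanced", max_iter=1000),
    "decision tree": DecisionTreeClassifier(
        class_weight="balanced", max_depth=4, random_state=0
    ),
    "support vector machine": SVC(class_weight="balanced"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # F1 balances precision and recall, so it is more informative than
    # plain accuracy when one class dominates.
    scores[name] = f1_score(y_test, model.predict(X_test))
    print(f"{name}: F1 = {scores[name]:.2f}")
```

Reweighting classes is only one way to handle imbalance; resampling the training data is another common option.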

Module 3: Practical Concepts in Machine Learning

This module introduces several important and practical concepts in machine learning. First, you will learn about the challenges inherent in applying data analytics (and machine learning in particular) to real-world data sets. This module also introduces several methodologies that you may encounter in the future that dictate how to approach, tackle, and deploy data analytic solutions. Next, you will learn about ensemble learning, a powerful technique that combines the predictions from many weak learners to make a better prediction. Specifically, this module will introduce two of the most popular ensemble learning techniques, bagging and boosting, and demonstrate how to employ them in a Python data analytics script. Finally, the concept of a machine learning pipeline is introduced, which encapsulates the process of creating, deploying, and reusing machine learning models.
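The sketch below shows one way bagging, boosting, and a pipeline could fit together in scikit-learn. The synthetic data, the specific estimators (`BaggingClassifier` and `GradientBoostingClassifier`), and the scaling step are assumptions chosen for illustration; the course may use different ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data, used here only for illustration.
X, y = make_classification(n_samples=400, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Bagging: many decision trees (the default base estimator), each trained on
# a bootstrap sample of the training data, with their predictions combined.
bagging = BaggingClassifier(n_estimators=50, random_state=1)

# Boosting: shallow trees added one at a time, each fit to the errors made
# by the ensemble built so far.
boosting = GradientBoostingClassifier(n_estimators=50, max_depth=2, random_state=1)

# A pipeline bundles preprocessing and the model into one reusable object.
pipelines = {
    "bagging": make_pipeline(StandardScaler(), bagging),
    "boosting": make_pipeline(StandardScaler(), boosting),
}

accuracy = {}
for name, pipe in pipelines.items():
    pipe.fit(X_train, y_train)
    accuracy[name] = pipe.score(X_test, y_test)
    print(f"{name}: accuracy = {accuracy[name]:.2f}")
```

Because the scaler lives inside the pipeline, it is fit on the training data only and then reapplied unchanged to any new data passed to `predict` or `score`.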

Module 4: Overfitting & Regularization

This module introduces the concept of overfitting, problems it can cause in machine learning analyses, and techniques to overcome it. First, the basic concept of overfitting is presented along with ways to identify its occurrence. Next, the technique of cross-validation is introduced, which can reduce the likelihood of overfitting. Then, the use of cross-validation to identify the optimal parameters for a machine learning algorithm trained on a given data set is presented. Finally, the concept of regularization, where an additional penalty term is applied when determining the best machine learning model parameters, is introduced and demonstrated for different regression and classification algorithms.
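As a hedged sketch of these ideas, the example below uses scikit-learn to run cross-validation and then to select a regularization strength. Ridge regression, the synthetic data, the candidate `alphas` list, and the five-fold split are all assumptions made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

# Few samples and many features: a setting where overfitting is likely.
X, y = make_regression(
    n_samples=60, n_features=30, n_informative=5, noise=10.0, random_state=0
)

# Cross-validation: score the model on 5 held-out folds rather than on the
# data it was trained on. For regressors the default score is R^2.
fold_scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5)
print("R^2 per fold:", fold_scores.round(2))

# Regularization: Ridge adds a penalty on the squared size of the
# coefficients, and alpha sets how strong that penalty is. The grid search
# uses cross-validation to pick alpha from the candidates.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
search = GridSearchCV(Ridge(), param_grid={"alpha": alphas}, cv=5)
search.fit(X, y)
best_alpha = search.best_params_["alpha"]
print("best alpha:", best_alpha)
```

A large gap between the training score and the cross-validated score is one common sign that a model is overfitting.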

What’s included