Data Science at Scale – Capstone Project


In the capstone, students will engage in a real-world project requiring them to apply skills from the entire data science pipeline: preparing, organizing, and transforming data, constructing a model, and evaluating results. Through a collaboration with Coursolve, each capstone project is associated with partner stakeholders who have a vested interest in your results and are eager to deploy them in practice. These projects will not be straightforward and the outcome is not prescribed: you will need to tolerate ambiguity and negative results! But we believe the experience will be rewarding and will better prepare you for data science projects in practice.

What you will learn

Project A: Blight Fight

In this project, you will build a model to predict when a building is likely to be condemned. The data is real, the problem is real, and the impact is real.

Week 2: Derive a list of buildings

You are given sets of incidents with location information. You will need to make some assumptions about when two incidents share a location, then group the incidents by location to identify specific buildings.
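One possible grouping assumption, sketched below, is that incidents whose coordinates round to the same grid cell belong to the same building. The field names (`id`, `lat`, `lon`), the sample coordinates, and the `precision` parameter are all hypothetical; the real incident data may need a different rule, such as matching on parcel or address.

```python
from collections import defaultdict


def group_incidents_by_location(incidents, precision=4):
    """Group incident ids by rounded (lat, lon) cell.

    Assumption (not from the course material): incidents that fall in
    the same rounded coordinate cell refer to the same building.
    """
    buildings = defaultdict(list)
    for incident in incidents:
        key = (round(incident["lat"], precision),
               round(incident["lon"], precision))
        buildings[key].append(incident["id"])
    return dict(buildings)


# Made-up incidents: "a" and "b" are a few meters apart, "c" is elsewhere.
incidents = [
    {"id": "a", "lat": 42.33141, "lon": -83.04571},
    {"id": "b", "lat": 42.33143, "lon": -83.04573},
    {"id": "c", "lat": 42.36000, "lon": -83.07000},
]
buildings = group_incidents_by_location(incidents)
```

A fixed grid can split a single building that straddles a cell boundary, so treat the cell size as something to tune and check against the data.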

Week 3: Construct a training dataset

Construct a training set by associating each of your buildings with a ground truth label derived from the permit data.
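As a rough sketch of this step, the snippet below labels a building 1 if it appears in the permit data and 0 otherwise. It assumes that buildings and permits have already been mapped to a shared identifier; the identifiers and the function name `build_training_set` are hypothetical, and the actual rule for deriving the label from the permit data is up to you.

```python
def build_training_set(building_ids, permit_building_ids):
    """Pair each building with a binary ground truth label.

    Assumption: a building is a positive example (label 1) when at
    least one permit record maps to it, and negative (label 0) otherwise.
    """
    permitted = set(permit_building_ids)
    return [(b, 1 if b in permitted else 0) for b in building_ids]


# Hypothetical identifiers, for illustration only.
building_ids = ["b1", "b2", "b3", "b4"]
permit_building_ids = ["b2", "b4", "b4"]  # duplicates are harmless

training_set = build_training_set(building_ids, permit_building_ids)
positives = sum(label for _, label in training_set)
```

Counting the positives right away is a quick check on class balance, which affects how the model should be evaluated later.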

Week 4: Train and evaluate a simple model

Use a trivial feature set to train and evaluate a simple model.
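To make this concrete, here is a minimal sketch that uses a single hypothetical feature, the number of incidents per building, and a one-threshold classifier chosen on training data and scored on held-out data. The feature, the toy numbers, and the function names are assumptions for illustration; any simple model and feature set would serve the same purpose.

```python
def predict(threshold, count):
    """Predict label 1 when the incident count reaches the threshold."""
    return 1 if count >= threshold else 0


def evaluate(threshold, counts, labels):
    """Return the fraction of examples classified correctly."""
    correct = sum(predict(threshold, c) == y for c, y in zip(counts, labels))
    return correct / len(labels)


def train_threshold(counts, labels):
    """Pick the observed count that maximizes training accuracy."""
    best_threshold, best_accuracy = None, -1.0
    for threshold in sorted(set(counts)):
        accuracy = evaluate(threshold, counts, labels)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold


# Toy data: incident counts per building and their labels.
train_counts = [1, 2, 2, 5, 7, 9]
train_labels = [0, 0, 0, 1, 1, 1]
test_counts = [1, 3, 6, 8]
test_labels = [0, 0, 1, 1]

threshold = train_threshold(train_counts, train_labels)
test_accuracy = evaluate(threshold, test_counts, test_labels)
```

Scoring on data the model did not see during training is the important part; accuracy on the training set alone says little about how the model will behave on new buildings.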

What’s included