Predictive analytics has a longstanding tradition in medicine. Developing better prediction models is a critical step in the pursuit of improved health care: we need these tools to guide our decision-making on preventive measures and individualized treatments. To use and develop these models effectively, we must understand them better. In this course, you will learn how to make accurate prediction tools and how to assess their validity. First, we will discuss the role of predictive analytics for prevention, diagnosis, and effectiveness. Then, we look at key concepts such as study design, sample size, and overfitting.
Furthermore, we comprehensively discuss important modelling issues such as missing values, non-linear relations, and model selection. The importance of the bias-variance tradeoff and its role in prediction is also addressed. Finally, we look at various ways to evaluate a model: through performance measures, and by assessing both internal and external validity. We also discuss how to update a model to a specific setting.
Throughout the course, we illustrate the concepts introduced in the lectures using R. You need not install R on your computer to follow the course: you will be able to access R and all the example datasets within the Coursera environment. We do, however, refer to further packages that you can use for certain types of analyses; feel free to install and use them on your computer.
Furthermore, each module may also contain practice quiz questions. You will pass these regardless of whether your answers are right or wrong. You will learn the most by first thinking about the answers yourself and then checking them against the correct answers and explanations provided.
This course is part of the Master’s program in Population Health Management at Leiden University (currently in development).