Statistics for Data Science

Master of Information and Data Science (MIDS)

Course Overview

Course Description

The goal of this course is to provide students with a foundational understanding of classical statistics and how it fits within the broader context of data science. Students will learn to apply the most common statistical procedures correctly, checking assumptions and responding appropriately when they appear violated. Emphasis is placed on different practices that constitute an effective analysis, including formulating research questions, operationalizing variables, exploring data, selecting hypothesis tests, and communicating results.

Learning Objectives

In this class, you will:

  • Learn the basics of probability theory;

  • Understand the logic of hypothesis testing;

  • Study classical linear regression;

  • Build regression models in the context of description and of causal inference;

  • Demonstrate your knowledge through analytical papers, writing exercises, and two term projects;

  • Use the open-source language R to analyze real-world data.

Course Requirements

  • Proficiency with calculus (including an ability to take simple derivatives and integrals)

  • Familiarity with basic matrix operations

  • Ability to write proofs

Syllabus