STA2526H: Data Science and Machine Learning I

This course introduces the principles and practices of applied data science and machine learning in finance and insurance. Topics include data extraction from structured and unstructured sources (text, images, audio, and geospatial data); data cleaning, integration, and transformation; feature engineering; and exploratory analysis. Students will learn frameworks for model development and monitoring, including handling missing data, categorical encoding, variable and model selection, and lifecycle management.

The course also introduces Generalized Linear Models (GLMs). It emphasizes data preparation and the overall model lifecycle, providing the foundation for the model-building techniques covered in Data Science and Machine Learning II. Reproducible, code-driven workflows in Python and SQL are used throughout, supported by case studies from financial analytics and insurance applications.

Credit Value: 0.50
Campus: St. George
Delivery Mode: In Class