This course builds on Statistical Learning I to study modern machine learning methods from a statistical perspective. It first covers classical machine learning approaches and unsupervised learning, then develops the probabilistic and representation learning frameworks used in contemporary large-scale systems. Topics include gradient-based likelihood optimization for complex models, variational inference, stochastic variational inference, Bayesian neural networks for uncertainty estimation, and representation learning. Large language models are introduced as autoregressive probabilistic models trained by maximum likelihood. Throughout, the course emphasizes reproducible computation and robust model evaluation.