Introduction to maximum likelihood and Bayesian statistics (spring semester)

Principal lecturer: Petr Keil

Course materials: https://github.com/petrkeil/ML_and_Bayes_2021_CZU

Summary

The course aims to show an alternative view of applied statistics, one in which statistics works as a modular kit: a handful of simple building blocks that can be combined to analyze problems of any complexity. The course will show how to do statistics in a way that is more general and creative than classical categories such as regression or ANOVA. First, the course provides an introduction to maximum likelihood and Bayesian inference. Second, it introduces the critical concepts that unify all parametric statistical techniques: likelihood, probability distributions, and the distinction between model, data, parameters, inference, and predictions. We will also look at the inner workings of hierarchical (mixed-effect) models and of occupancy models with imperfect detection and latent variables. Finally, participants will learn to specify and fit models using likelihood maximization in R, and using MCMC in OpenBUGS, JAGS, or STAN. The course emphasizes practical issues over philosophical ones, and draws its examples mainly from ecology and environmental science.
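As a taste of the approach, here is a minimal sketch of likelihood maximization in R on the Normal example from session 1. The simulated data and starting values are purely illustrative, not part of the course materials:

    # Normal example: find the mean and standard deviation that
    # maximize the likelihood of simulated data
    set.seed(1)
    x <- rnorm(50, mean = 10, sd = 2)   # simulated data

    # negative log-likelihood of a Normal model; the standard
    # deviation is optimized on the log scale so it stays positive
    negLL <- function(par, x) {
      -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
    }

    # optim() minimizes, so minimizing negLL maximizes the likelihood
    fit <- optim(par = c(0, 0), fn = negLL, x = x)
    fit$par[1]        # ML estimate of the mean
    exp(fit$par[2])   # ML estimate of the standard deviation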

Requirements

Students will need a computer with R installed. I assume basic familiarity with R, including plotting data and fitting elementary statistical models such as regression or ANOVA. Participants should know how to set a working directory, how to import and export data, and how R indexing works (the square brackets [,]); they should also be familiar with the following types of objects: vector, data.frame, and list. Knowledge of loops is an advantage.
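For a quick self-check of these skills, here is a minimal sketch using R's built-in iris data (any dataset of your own works just as well):

    dat <- iris                          # a data.frame shipped with R
    dat[1:5, "Species"]                  # indexing: [rows, columns]
    v <- dat$Sepal.Length                # a vector
    l <- list(lengths = v, table = dat)  # a list
    for (i in 1:3) print(v[i])           # a simple loop
    # setwd(), read.csv(), and write.csv() handle the working
    # directory and data import/export

If every line above makes sense to you, you are ready for the course.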

Contents

  1. Pros and cons of ML and Bayes, necessary skills, and a literature overview. Intro to likelihood: probability density, probability mass, cumulative probability, likelihood, and deviance, all on a Normal example.
  2. Poisson and Binomial distributions, their logic, and likelihood maximization.
  3. Prior and posterior probability. The principle of MCMC sampling. The simplest model in JAGS (a minimal sketch follows after this list). Credible intervals and Bayesian uncertainty.
  4. Generalized linear models in ML and Bayesian settings: linear regression, logistic regression, Poisson regression.
  5. The connection between the t-test, ANOVA, and regression, all of it in BUGS.
  6. Hierarchical (mixed-effects) models, random effects vs. fixed effects, latent variables, informative vs. non-informative priors.
  7. How to model space and time: autoregressive models, splines, eigenvectors.
  8. Custom models: occupancy modelling, imperfect detection.
  9. Model selection, model evaluation, information-theoretic criteria, handling uncertainty, Bayesian credible intervals, prediction intervals.
  10. INLA, STAN, big data, and other cutting-edge developments.
  11. Exercises + independent work.
  12. Exercises + independent work.
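As promised in item 3, here is a minimal sketch of the kind of "simplest model" fitted in JAGS, estimating a Normal mean from simulated data via the rjags package. It assumes JAGS and rjags are installed; the data, priors, and sampler settings are illustrative only:

    library(rjags)

    # model definition in the BUGS/JAGS language; note that
    # dnorm() in JAGS is parameterized by precision, not sd
    model_string <- "
    model {
      for (i in 1:N) {
        x[i] ~ dnorm(mu, tau)     # likelihood
      }
      mu  ~ dnorm(0, 0.001)       # vague prior for the mean
      tau ~ dgamma(0.001, 0.001)  # vague prior for the precision
    }
    "

    set.seed(1)
    x <- rnorm(50, mean = 10, sd = 2)   # simulated data

    jm <- jags.model(textConnection(model_string),
                     data = list(x = x, N = length(x)))
    post <- coda.samples(jm, variable.names = c("mu", "tau"),
                         n.iter = 2000)
    summary(post)   # posterior means and credible intervals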