In this lecture series (Tue-Fri), I will discuss the essential data analysis activities of a scientist from a statistical point of view. My goal is to maintain a common thread of logic between problems that can be done with "pencil and paper" and those that involve complex simulations and more advanced computational techniques. While my examples will mainly come from particle physics, the message will be much more general.
First, I will translate the physicist’s terms of measurements, discovery, and limits into the statistical language of point estimates, hypothesis tests, and confidence intervals (and the machine learning language of regression, classification, and prediction uncertainty).
Next, I will establish the properties we would like these statistical procedures to have, giving particular attention to the treatment of systematic uncertainties. I will take some time to discuss fundamental differences between the Bayesian and Frequentist perspectives and contrast the two in the context of statistical decision theory.
Then, I will emphasize the importance of building statistical models for the data and discuss various strategies for doing so. With these principles in mind, I will discuss specific techniques for hypothesis tests and confidence intervals.
Finally, I will revisit these techniques when the predictions for the data come from a simulator and the likelihood function is intractable. With this in mind, I will discuss how machine learning can be used for simulation-based or likelihood-free inference.