Resources (StatsReview)

Hi! This page is a place for me to compile resources I built or collected from other places. Feel free to explore them.

StatsReview

This is a GitHub repository I built to compile my statistical notes and tutorials. I have always wanted to understand the “mathy” details behind these models, and I have a fear of misusing them. However, this is by no means a pure mathematical guide to these topics, as I’m not formally trained in math or theoretical statistics. Feel free to let me know if anything is incorrect and I will be happy to make changes. Happy learning! 🙂

Review for Estimation Theory:

The sample mean and variance are usually introduced intuitively as summary statistics in descriptive statistics (except maybe for the n−1 in the sample variance). However, their use becomes less straightforward when moving to inferential statistics, such as z-tests or t-tests.

In this tutorial, I try to lay out the basic procedure of statistical modeling (inferential statistics) as 4 steps: model specification, estimation, statistical inference, and model diagnosis and evaluation. Specifically, the estimation step deals with how we can estimate the unknown parameters of the proposed data-generating process (DGP) given sample data. For example, when estimating the expected height of NYU students from a sample of 10 students, one could calculate the sample mean or, more naively, simply assume it to be 5’10”. This section introduces fundamental concepts such as estimation, estimands, estimators, and estimates, along with the properties of estimators that make some more effective than others (e.g., why calculating the sample mean is generally preferable to assuming a fixed height of 5’10”).
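
To make the height example concrete, here is a minimal simulation sketch comparing the two estimators. The DGP (heights drawn from a Normal with mean 68 inches and SD 3 inches) and the function names are my own assumptions for illustration, not something taken from the tutorial itself.

```python
import random
from statistics import fmean

random.seed(0)

# Hypothetical DGP (an assumption for illustration): heights in inches ~ Normal(68, 3).
TRUE_MEAN, TRUE_SD = 68.0, 3.0
N, N_SIMS = 10, 20_000  # 10 students per sample, many repeated samples


def sample_mean_estimator(sample):
    """Estimator 1: average the observed heights."""
    return fmean(sample)


def fixed_guess_estimator(sample):
    """Estimator 2: ignore the data and always answer 5'10" (70 inches)."""
    return 70.0


def evaluate(estimator):
    """Approximate bias, variance, and MSE of an estimator by repeated sampling."""
    estimates = []
    for _ in range(N_SIMS):
        sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
        estimates.append(estimator(sample))
    avg = fmean(estimates)
    bias = avg - TRUE_MEAN
    variance = fmean((e - avg) ** 2 for e in estimates)
    mse = fmean((e - TRUE_MEAN) ** 2 for e in estimates)
    return bias, variance, mse


mean_bias, mean_var, mean_mse = evaluate(sample_mean_estimator)
guess_bias, guess_var, guess_mse = evaluate(fixed_guess_estimator)

print(f"sample mean: bias={mean_bias:.3f} var={mean_var:.3f} mse={mean_mse:.3f}")
print(f"fixed guess: bias={guess_bias:.3f} var={guess_var:.3f} mse={guess_mse:.3f}")
```

Under this assumed DGP, the fixed guess has zero variance but a bias that never goes away, while the sample mean is centered on the truth with variance near σ²/n. If the true mean happened to be exactly 70 inches the fixed guess would win, which is exactly why we judge estimators by properties that hold without knowing the answer in advance.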

In the final part, we apply this framework to show how the sample mean and variance actually serve as estimators (specifically, method-of-moments and least-squares estimators) of the population expectation and variance. This also explains the use of n−1 in the sample variance and reveals situations where using n might be appropriate. We will also show, somewhat paradoxically, that whereas the sample variance on average correctly estimates the variance of the DGP, the sample standard deviation does not. This foundation paves the way for future topics, including statistical inference about sample means and variances (given our educated guesses, how confident are we?) and more complex estimation problems like least squares and maximum likelihood estimation in linear regression.
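
The claims about n−1, n, and the standard deviation can be checked with a small simulation. This is only a sketch under assumptions of my own: a Normal DGP with mean 0 and SD 2, samples of size 5, and the known-mean case as one example of when dividing by n is fine (the tutorial may discuss other cases).

```python
import random
from math import sqrt
from statistics import fmean

random.seed(1)

# Hypothetical DGP (an assumption for illustration): Normal(0, 2), so the true variance is 4.
TRUE_MEAN, TRUE_SD = 0.0, 2.0
N, N_SIMS = 5, 50_000

var_n_minus_1, var_n, sd_n_minus_1, var_known_mu = [], [], [], []
for _ in range(N_SIMS):
    x = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    xbar = sum(x) / N
    ss = sum((xi - xbar) ** 2 for xi in x)  # squared deviations from the sample mean
    var_n_minus_1.append(ss / (N - 1))      # usual sample variance
    var_n.append(ss / N)                    # divide by n instead
    sd_n_minus_1.append(sqrt(ss / (N - 1))) # usual sample standard deviation
    # If the population mean is known, deviations are taken from it and n is the right divisor.
    var_known_mu.append(sum((xi - TRUE_MEAN) ** 2 for xi in x) / N)

avg_var_n_minus_1 = fmean(var_n_minus_1)
avg_var_n = fmean(var_n)
avg_sd = fmean(sd_n_minus_1)
avg_var_known_mu = fmean(var_known_mu)

print(f"true variance            : {TRUE_SD ** 2:.3f}")
print(f"avg variance (n-1)       : {avg_var_n_minus_1:.3f}")
print(f"avg variance (n)         : {avg_var_n:.3f}")
print(f"avg variance (known mean): {avg_var_known_mu:.3f}")
print(f"true SD                  : {TRUE_SD:.3f}")
print(f"avg sample SD            : {avg_sd:.3f}")
```

The n−1 version should average close to the true variance, the n version should come out low by a factor of about (n−1)/n, and the average sample SD should fall below the true SD. That last result is not a bug: the square root is a concave function, so by Jensen's inequality the expectation of the square root of an unbiased variance estimator sits below the square root of the variance.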

  • OLS estimators of regression & Gauss-Markov assumptions (why we need them)
Review for Inference Theory:

In inference, the trio of hypothesis testing, effect size, and power analysis is super important and super connected. I put together this handy guide because I noticed something kinda off about the way we usually learn statistics. In the statistics courses I have taken, power analysis is left for dessert, like it’s an afterthought. Problem is, that makes it harder to get how it fits in with everything else, especially when you dive into complicated models that involve juggling a bunch of tests at once.

So, what I’ve done here is stick to the basics. Think of it as training wheels: we’re going to focus on the simplest type of hypothesis test, the one-sample Z-test. I’ll walk through it step by step, breaking down the math so it’s easier to digest, with the aim of understanding not just the what but also the why and the how in the formal language of math. I wrote this guide because I have always hoped to learn stats in a more rigorous, ‘mathy’ way while keeping it intuitive and accessible. And I believe that bringing big theoretical concepts like power down to a simple example can sometimes be the best way to learn. It’s like building a toy house before you build the real one: once you’ve got the fundamentals nailed, everything else kind of falls into place. So I hope this guide provides the logic and math details behind the concepts in a way that’s way less intimidating and way more fun.
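
As a taste of how the trio fits together in the one-sample Z-test, here is a minimal sketch using only Python's standard library. The function names and the example numbers are hypothetical choices of mine, and the setup assumes a known population SD and a two-sided test.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, SD 1


def z_statistic(sample_mean, mu0, sigma, n):
    """Hypothesis testing: how many standard errors the sample mean is from mu0."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_sided_p_value(z):
    """Probability, under H0, of a |Z| at least as large as the one observed."""
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def power_two_sided(effect_size, n, alpha=0.05):
    """Power: chance of rejecting H0 when the true standardized effect is
    effect_size = (mu1 - mu0) / sigma."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)  # mean of the Z statistic under the alternative
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical example: H0 says mu0 = 50, sigma = 10 is known, n = 25, observed mean = 52.
z = z_statistic(sample_mean=52, mu0=50, sigma=10, n=25)
print(f"z = {z:.2f}, p = {two_sided_p_value(z):.3f}")

# Same effect size, different sample sizes: power grows with n.
for n in (16, 32, 64):
    print(f"n = {n:>2}: power = {power_two_sided(0.5, n):.3f}")
```

The same ingredients show up in all three functions: the effect size scaled by √n is what moves the Z statistic away from zero, the critical value set by alpha decides when we reject, and power is just the probability of landing past that critical value when the effect is real. With an effect size of zero, the "power" collapses to alpha, which is a handy sanity check.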

  • One-sample Chi-square test: test for variance