This week covers:

More statistics: hypothesis testing, effect sizes, and regression

Day 1

Hypothesis testing (cont'd)

We reviewed hypothesis testing and power calculations on the whiteboard (see notes)
We went over the quiz in Mindless Statistics by Gigerenzer about common misconceptions around testing and p-values
Finish up last week's stats exercises
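
As a rough companion to the whiteboard power calculations, here is a minimal Python sketch of the power of a two-sided, two-sample test. It uses a normal (z) approximation with equal group sizes, which is an assumption made for illustration and not necessarily the setup used in class. The function name `two_sample_power` and its parameters are hypothetical.

```python
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    Assumes equal group sizes and a known, common variance.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    # A medium effect (d = 0.5) with 64 per group gives roughly 0.81.
    print(round(two_sample_power(0.5, 64), 2))
```

With an effect size of zero the function returns `alpha`, as expected: the only rejections are false positives.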

References

Understanding Statistical Power and Significance Testing
Calculating the power of a test
The American Statistical Association's statement on p-values by Wasserstein & Lazar
Inference by eye by Cumming and Finch
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations by Greenland et al.
The Insignificance of Significance Testing by Johnson
The Insignificance of Null Hypothesis Significance Testing by Gill

Day 2

Field trip

We took a field trip to the American Museum of Natural History to visit their AstroCom data science program!

Day 3

Effect sizes

We talked about false discoveries and effect sizes on the whiteboard (see notes)
Replicate and extend the results of the Google n-grams "culturomics" paper using the template here
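
To make the link between false discoveries and effect sizes concrete, the sketch below computes two quantities in Python: the expected share of significant results that are false positives, and Cohen's d with a pooled standard deviation. The function names, and the prior, power, and sample values in the usage lines, are illustrative assumptions and do not come from the class notes.

```python
from statistics import mean, variance


def false_discovery_rate(prior_true, power, alpha=0.05):
    """Expected fraction of significant results that are false positives.

    prior_true is the assumed share of tested hypotheses with a real effect.
    """
    false_positives = alpha * (1 - prior_true)
    true_positives = power * prior_true
    return false_positives / (false_positives + true_positives)


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


if __name__ == "__main__":
    # If 10% of tested effects are real and power is 80%, about 36% of
    # significant findings are expected to be false.
    print(round(false_discovery_rate(0.1, 0.8), 2))
    # Toy samples: means differ by 1 and the pooled SD is 2, so d = 0.5.
    print(cohens_d([2, 4, 6], [1, 3, 5]))
```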

References

Why Most Published Research Findings Are False
Felix Schönbrodt's blog post and shiny app on misconceptions about p-values and false discoveries
Interpreting Cohen's d effect size
The New Statistics: Why and How by Cumming
A guide on effect sizes and related blog post

Day 4

Regression

We introduced regression and derived the best-fit parameters for a simple linear model
See slides for a high-level framing and notes for derivation
Read Chapter 5 of Intro to Stats with Randomization and Simulation and do Exercises 5.20 and 5.29
Read Section 3.1 of Intro to Statistical Learning and do Lab 3.6.2
See if you can reproduce the table in ISRS 5.29 using the original dataset
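
The best-fit parameters for a simple linear model have a closed form under the least-squares criterion: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. The Python sketch below assumes that standard criterion. The function name `fit_simple_linear` and the toy data are hypothetical.

```python
def fit_simple_linear(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares around the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line passes through (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Points lying exactly on y = 1 + 2x recover intercept 1 and slope 2.
    print(fit_simple_linear([0, 1, 2, 3], [1, 3, 5, 7]))
```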

References

This interactive shiny app on manual model fitting
Chapters 1 and 2 of Advanced Data Analysis from an Elementary Point of View

Day 5

Regression (cont'd)

See this notebook on fitting and visualizing linear models and this notebook on model evaluation
Read Sections 6.1 through 6.3 of Intro to Stats with Randomization and Simulation
Do Exercises 6.1, 6.2, and 6.3, then use the original data set in babyweights.txt, taken from here, to reproduce the results from the book
Read Sections 3.2 and 3.3 of Intro to Statistical Learning
Do Labs 3.6.3 through 3.6.6
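
For model evaluation, two common summaries of fit are root mean squared error (RMSE) and R-squared. The Python sketch below is a generic illustration with made-up numbers. It is not taken from the model evaluation notebook, and the function names are hypothetical.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Share of variance in actual explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up held-out observations and model predictions.
    held_out = [1.0, 2.0, 3.0]
    predictions = [1.1, 1.9, 3.2]
    print(rmse(held_out, predictions), r_squared(held_out, predictions))
```

Computing these on data held out from fitting, and not on the training data, gives a more honest picture of how the model generalizes.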

References

A visualization of ordinary least squares regression
The "Model Basics" and "Model Building" Chapters in R for Data Science (Chapters 18 and 19 in the print edition, Chapters 23 and 24 online)
The modelr and tidymodels packages in R