This week covers:

More wrangling and plotting in R
Statistical inference
Regression
Overfitting / generalization

Day 1

Combining and reshaping data

Review combine_and_reshape_in_r.ipynb on joins with dplyr and reshaping with tidyr
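
As a quick warm-up for the notebook, here is a minimal sketch of a join and a reshape. The tables and column names (trips, stations, rides) are made up for illustration and do not come from the course data. It uses pivot_longer() and pivot_wider(), the newer tidyr replacements for gather() and spread(), so it assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy tables, invented for illustration only
trips <- tibble(
  station_id = c(1, 1, 2, 3),
  rides      = c(10, 12, 7, 5)
)
stations <- tibble(
  station_id = c(1, 2),
  name       = c("A", "B")
)

# left_join keeps every row of trips; station 3 has no match, so its name is NA
joined <- left_join(trips, stations, by = "station_id")

# inner_join keeps only the rows that have a match in both tables
matched <- inner_join(trips, stations, by = "station_id")

# A wide table with one column per year ...
wide <- tibble(name = c("A", "B"), `2014` = c(100, 80), `2015` = c(120, 90))

# ... reshaped to long format, with one row per (name, year) pair
long <- pivot_longer(wide, cols = c(`2014`, `2015`),
                     names_to = "year", values_to = "rides")

# ... and back to wide format
wide_again <- pivot_wider(long, names_from = year, values_from = rides)
```

Comparing nrow(joined) with nrow(matched) is a handy way to spot keys that failed to match.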

Plotting exercises

Finish up the Citibike plotting exercises in plot_trips.R, including the plots that involve reshaping data

Combining and reshaping exercises

Read chapters 12 and 13 of R for Data Science on tidyr and joins
Do the following exercises from R for Data Science:
- Section 12.2.1, exercise 2
- Section 12.3.3 exercises 1 and 3
Do part 1 of Datacamp's Cleaning Data in R tutorial
Additional references:
- The tidyr vignette on tidy data
- The dplyr vignette on two-table verbs for joins
- A visual guide to joins

Rmarkdown

Read Chapter 27 of R for Data Science on Rmarkdown
Do the following exercises:
- Section 27.2.1, exercises 1 and 2 (try keyboard shortcuts: ctrl-shift-enter to run chunks, and ctrl-shift-k to knit the document)
- Section 27.3.1 exercise 3, using this file
- Section 27.4.7, exercise 1

Day 2

Sampling distributions and standard errors

See the Statistical Inference & Hypothesis Testing slides
Review the "Estimating a proportion" section of the statistical inference Rmarkdown file (preview the output here)
Read Chapter 4 of Introduction to Statistical Thinking (With R, Without Calculus) (IST) and do questions 4.1 and 4.2. Feel free to execute code in the book along the way.
Read Chapter 6 of IST on the normal distribution and do question 6.1
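
To make the idea of a sampling distribution concrete, the sketch below simulates many samples and compares the spread of the sample proportion with the textbook standard error sqrt(p(1 - p) / n). The true proportion, sample size, and number of repetitions are arbitrary values chosen for illustration; this is not the code from the course's Rmarkdown file.

```r
set.seed(42)

p    <- 0.3     # hypothetical true proportion
n    <- 100     # size of each sample
reps <- 10000   # number of simulated samples

# Each value from rbinom is the number of successes in one sample of size n,
# so dividing by n gives one sample proportion per simulated sample
p_hat <- rbinom(reps, size = n, prob = p) / n

# Standard error estimated from the simulation
se_sim <- sd(p_hat)

# Standard error from the formula sqrt(p * (1 - p) / n)
se_theory <- sqrt(p * (1 - p) / n)

print(c(simulated = se_sim, theoretical = se_theory))
```

Calling hist(p_hat) shows the approximately normal shape discussed in Chapter 6 of IST.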

References

Chapter 1 of the online textbook Intro to Stat with Randomization and Simulation (ISRS)
Interactive demos:
- From the Seeing theory site:
- An interactive tutorial on sampling variability in polling
- Student t-distribution
Some notes on expected values and variance, with proofs of their properties
- Expected value, click through on "linearity of expectation" for proof
- Variance

Day 3

Hypothesis testing

Read Chapter 7 of IST on sampling distributions and do exercise 7.1
Read Chapter 9 of IST and do exercise 9.1
Read Chapter 10 of IST and do exercises 10.1 and 10.2
Review the "Hypothesis testing" section of the statistical inference Rmarkdown file (preview the output here)
Also check out this analysis of the color distribution of M&Ms that we discussed
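
The logic of a hypothesis test can be sketched by simulation: generate data under the null hypothesis many times and count how often the result is at least as extreme as what was observed. The numbers below (60 successes in 100 trials, null proportion 0.5) are hypothetical and chosen only for illustration.

```r
set.seed(1)

n        <- 100   # hypothetical number of trials
observed <- 60    # hypothetical observed number of successes
p_null   <- 0.5   # proportion assumed under the null hypothesis

# Simulate the number of successes under the null hypothesis
sims <- rbinom(10000, size = n, prob = p_null)

# Two-sided p-value: the share of simulated counts at least as far
# from the expected count as the observed count
p_value_sim <- mean(abs(sims - n * p_null) >= abs(observed - n * p_null))

# Exact binomial test from base R, for comparison
p_value_exact <- binom.test(observed, n, p = p_null)$p.value

print(c(simulated = p_value_sim, exact = p_value_exact))
```

The two p-values should agree closely, since the simulation approximates the same null distribution that binom.test computes exactly.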

References

See the relevant part of these lecture notes on statistics by simulation
Statistics for Hackers by VanderPlas (slides, video)
See section 4 of Mindless Statistics and this article for some warnings on misinterpretations of p-values

Day 4

Power, effect sizes, and the replication crisis

See this post and the related lecture notes on effect sizes and the replication crisis
See this notebook on statistical vs. practical significance
There's also an interactive version; play with it and see if you understand what's going on!
Read Chapter 2 of the online textbook Intro to Stat with Randomization and Simulation (ISRS) and do exercises 2.2 and 2.6
Read Sections 3.1 and 3.2 of ISRS
Do exercise 9.2 in IST
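
Power is the probability that a test rejects the null hypothesis when a real effect exists. A minimal sketch, assuming a two-sample t-test with equal variances and made-up values for the sample size and effect size, estimates power by simulation and compares it with base R's power.t.test:

```r
set.seed(7)

n     <- 50    # hypothetical sample size per group
d     <- 0.5   # hypothetical true difference in means, in standard deviations
alpha <- 0.05  # significance level

# Simulate many experiments with a real effect and record whether each rejects
reject <- replicate(2000, {
  x <- rnorm(n, mean = 0, sd = 1)
  y <- rnorm(n, mean = d, sd = 1)
  t.test(x, y, var.equal = TRUE)$p.value < alpha
})

# Estimated power: the fraction of simulated experiments that reject the null
power_sim <- mean(reject)

# Analytic power for the same two-sample design
power_theory <- power.t.test(n = n, delta = d, sd = 1, sig.level = alpha)$power

print(c(simulated = power_sim, analytic = power_theory))
```

Rerunning with a smaller n or d shows how quickly power drops, which is central to the replication-crisis readings above.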

References

Understanding Statistical Power and Significance Testing
Calculating the power of a test
The American Statistical Association's statement on p-values by Wasserstein & Lazar
Inference by eye by Cumming and Finch
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations by Greenland et al.
The Insignificance of Significance Testing by Johnson
The Insignificance of Null Hypothesis Significance Testing by Gill
Why Most Published Research Findings Are False
Felix Schönbrodt's blog post and shiny app on misconceptions about p-values and false discoveries
Interpreting Cohen's d effect size
The New Statistics: Why and How by Cumming
A guide on effect sizes and related blog post

Day 5

Regression

Review the slides we covered in class
See this shiny app on model fitting and this tool for visualizing least squares (Dan's version here is similar, but requires Flash)
Read Chapter 5 of Intro to Stat with Randomization and Simulation (ISRS) and do exercises 5.20 and 5.29
Read Section 3.1 of Intro to Statistical Learning, do Lab 3.6.2
See the notebook on linear models with the modelr package from the tidyverse and this one on model evaluation
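
As a small companion to these readings, the sketch below fits a simple linear regression with base R's lm, checks it against the closed-form least squares solution, and evaluates it on held-out rows. The data are simulated with an arbitrary intercept and slope chosen for illustration. It uses base R rather than modelr, so it is not the code from the course notebooks.

```r
set.seed(123)

n <- 200
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 1)   # hypothetical true intercept 2 and slope 3
df <- data.frame(x = x, y = y)

# Fit by ordinary least squares using R's formula syntax
fit <- lm(y ~ x, data = df)

# Closed-form solution for one predictor:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope     <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)

# Simple check of generalization: fit on the first 150 rows,
# then compute root mean squared error on the remaining 50
train     <- df[1:150, ]
test      <- df[151:200, ]
fit_train <- lm(y ~ x, data = train)
rmse      <- sqrt(mean((test$y - predict(fit_train, newdata = test))^2))

print(coef(fit))
print(rmse)
```

The held-out RMSE should be close to the noise standard deviation used in the simulation, since the fitted model has the same form as the process that generated the data.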

References

Detailed notes on derivations for ordinary least squares regression with multiple predictors
Chapter 14 of Introduction to Statistical Thinking
Formula syntax in R
The "Model Basics" and "Model Building" Chapters in R for Data Science (Chapters 18 and 19 in the print edition, Chapters 23 and 24 online)
The modelr and tidymodels packages in R
An animation of gradient descent and a related blog post
