This week covers:
- More wrangling and plotting in R
- Statistics
- Review visualization_with_ggplot2.ipynb for an introduction to data visualization with ggplot2
- Review combine_and_reshape_in_r.ipynb on joins with dplyr and reshaping with tidyr
- Read chapters 9 and 10 of R for Data Science on tidyr and joins
- Do the following exercises from R for Data Science:
- Exercise 2 on page 151
- Exercises 1 and 3 on page 156
- Do part 1 of DataCamp's Cleaning Data in R tutorial
- Additional references:
- The tidyr vignette on tidy data
- The dplyr vignette on two-table verbs for joins
- A visual guide to joins
- Read Chapter 21 of R for Data Science on Rmarkdown
- Do the following exercises:
- Exercises 1 and 2 on page 426 (try keyboard shortcuts: ctrl-shift-enter to run chunks, and ctrl-shift-k to knit the document)
- Exercise 3 on page 428, using this file
- Exercise 1 on page 434
- Use the download_movielens.sh script to download the MovieLens data
- Fill in code in the movielens.Rmd file to reproduce the plots from Wednesday's slides
- Sketch out (on paper) how to generate figure 2 from The Anatomy of the Long Tail
- Write code to do this in the last section of movielens.Rmd
- See the Statistical Inference & Hypothesis Testing slides
- Review the "Estimating a proportion" section of the statistical inference Rmarkdown file (preview the output here)
- Read Chapter 7 of Introduction to Statistical Thinking (With R, Without Calculus) (IST) for a recap of sampling distributions. Feel free to execute code in the book along the way.
- Do question 7.1
- Read Chapter 9 of IST
- Do questions 9.1 and 9.2
- For background:
- Chapter 4 has a good review of population distributions, expectations, and variance
- Chapter 5 has a recap of random variables
- Chapter 6 has more information on the normal distribution
- Chapters 1 and 2 of the online textbook Intro to Stat with Randomization and Simulation
- Statistics for Hackers by VanderPlas (slides, video)
- We talked about expected values, variance, standard deviation, and standard errors on the whiteboard
- Interactive demos:
- Read Chapters 9, 10, and 11 of IST
- Finish up yesterday's exercises from IST
- Dan gave a guest lecture on experiments
- Read Chapters 4 and 6 of IST
- Do the following exercises from IST:
- Questions 4.1 and 4.2
- Question 6.1
- Example 5 in Section 8.3.5
- Questions 10.1 and 10.2
- Questions 11.1 and 11.3
- Read Chapter 2 of Intro to Stat with Randomization and Simulation
- We talked about hypothesis testing via simulation on the whiteboard
- Review the "Hypothesis testing" section of the statistical inference Rmarkdown file (preview the output here)
- See notes here
- Read Chapter 2 of Intro to Stat with Randomization and Simulation (ISRS)
- Do the following exercises in Section 2.9 of ISRS: 2.2, 2.5, 2.21, 2.23
- Read Sections 3.1 and 3.2 of ISRS
- Do these two problems: