This repository contains my Mini Data Analysis project for the UBC STAT 545A course 💻
In milestone 1, the datasets in the datateachr package are explored. More specifically, their variables and the relationships between them are investigated.
Next, one final dataset (cancer_sample) is chosen, and four potential research questions are proposed.
- Data_Exploration.Rmd contains the R markdown file for milestone 1 of this data analysis project (can be run in RStudio).
- Data_Exploration.html contains the knitted html file for milestone 1 of this data analysis project.
- The html file for milestone 1 can be more easily previewed using the following link:
- Data_Exploration.md contains the knitted .md file for milestone 1 of this data analysis project.
In milestone 2, summaries and plots of the variables in the chosen dataset are used to address the research questions formulated in milestone 1. Next, the research questions are narrowed down to two. Finally, a more standardized version of the dataset is produced.
- Data_Analysis.Rmd contains the R markdown file for milestone 2 of this data analysis project (can be run in RStudio).
- Data_Analysis.html contains the knitted html file for milestone 2 of this data analysis project.
- The html file for milestone 2 can be more easily previewed using the following link:
- Data_Analysis.md contains the knitted .md file for milestone 2 of this data analysis project.
In milestone 3, more comprehensive plots are generated by manipulating some of the dataset variables. Next, a model is fitted to the dataset to obtain statistical results for one of the research questions. Finally, the fitted model and a summary table are saved.
- Data_Analysis_2.Rmd contains the R markdown file for milestone 3 of this data analysis project (can be run in RStudio).
- Data_Analysis_2.html contains the knitted html file for milestone 3 of this data analysis project.
- The html file for milestone 3 can be more easily previewed using the following link:
- Data_Analysis_2.md contains the knitted .md file for milestone 3 of this data analysis project.
This is the folder where some documents created in Milestone 3 are stored:
- summ_table.csv contains a summary table obtained as part of Milestone 2
- fitted_model.rds contains the model fitted on the features to predict the target variable
- Make a new branch
- Pull the repository (you need to have RStudio installed)
- Change any file
- Commit, and push the edited repository to GitHub
- Create a pull request
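
On the command line, the steps above could look roughly like the sketch below. It is only one possible way to do it, and it assumes that git is installed, that you are inside a local clone of this repository, and that the remote is named origin. The helper name push_change, the branch name, and the commit message are all made up for illustration.

```shell
# Hypothetical helper (not part of this repository): commit one edited file
# on a new branch and push that branch to the remote named "origin".
# Usage: push_change <branch-name> <file> <commit-message>
push_change() {
  branch="$1"
  file="$2"
  message="$3"

  # Make a new branch and switch to it
  git checkout -b "$branch" || return 1
  # Stage the file you changed
  git add "$file" || return 1
  # Commit the change
  git commit -m "$message" || return 1
  # Push the new branch to GitHub and set it as the upstream
  git push -u origin "$branch"
}

# Example (placeholder values throughout):
#   git clone <repository-url> && cd <repository-folder>
#   ...edit Data_Analysis.Rmd in RStudio...
#   push_change my-branch Data_Analysis.Rmd "Describe the change"
```

After the push, the pull request itself is created on GitHub, for example from the repository page in the browser.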