app-rating-predictor

SC1015 Data Science Project

Group Members: Aaron, Ivan, Yifei

Project Plan

Marking Rubrics

  1. 10% for coming up with your own problem definition based on a dataset

  2. 10% for data preparation and cleaning to suit the problem of your choice

  3. 20% for exploratory data analysis/visualization to gather relevant insights

  4. 20% for the use of machine learning techniques to solve a specific problem

  5. 20% for the presentation of data-driven insights and the recommendations

  6. 10% for the quality of your final team presentation and overall impressions

  7. 10% for learning something new and doing something beyond this course

Problem statement

  1. Main machine learning goal: predict an app's rating from its features.
  2. Which genre of apps has the highest rating?
  3. Which country makes the best apps?

Dev

  • Best developers and their top categories.
  • Developers that made the most apps.

Yifei

  • Does editor's choice affect ratings and installs?
  • Does size of the app affect total installs? (some people don't like to install large apps)
  • Do content rating, price (free / paid), ad support, and in-app purchases have an impact on the rating?
  • Do days since the last update affect rating?
  • Do days since release affect installs? (find apps that have few installs despite being released a long time ago)
  • How to get a "High" rating on the Play Store?

Aaron

  • App rating distribution
  • Top Singapore company apps?
  • Rating VS size/price/installs
  • Number of reviews VS Number of downloads
  • FANG: which company made the best apps?
  • How to price your app?

Data preparation and cleaning

Dataset link: google-playstore-apps

Scraper folder: ./google-play-scrapper

Exploratory data analysis

Machine learning techniques

Initial Goal: Predicting the rating of the app (Numerical)

Linear Regression
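
As a rough illustration of the regression idea, here is a minimal, dependency-free sketch of ordinary least squares with a single feature. It is not the project's actual model: the feature (log10 of review count), the toy numbers, and the function names `fit_simple_linear_regression` and `predict_rating` are all assumptions made for the example.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Fit y ≈ intercept + slope * x by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of the feature from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance; slope is undefined")
    # Sum of cross-deviations between feature and target.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_rating(intercept, slope, x):
    """Predict a rating for one feature value using the fitted line."""
    return intercept + slope * x


# Made-up toy data (assumption): log10(review count) against app rating.
log_reviews = [1.0, 2.0, 3.0, 4.0, 5.0]
ratings = [3.6, 3.8, 4.0, 4.2, 4.4]

intercept, slope = fit_simple_linear_regression(log_reviews, ratings)
print(f"rating ≈ {intercept:.2f} + {slope:.2f} * log10(reviews)")
```

With real Play Store data there would be several features, so a multi-feature fit from a library would likely replace the closed-form single-feature version shown here.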

Actual Goal: Predicting the market size of the app (Categorical)

Classification
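
The README does not say how "market size" is defined or which classifier is used. The sketch below assumes, purely for illustration, that market size is a bucketed install count and uses a nearest-centroid classifier as a stand-in technique. The thresholds, the features, the toy data, and the names `market_size_label`, `fit_centroids`, and `predict_label` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def market_size_label(installs):
    """Bucket an install count into a category (thresholds are assumptions)."""
    if installs < 10_000:
        return "small"
    if installs < 1_000_000:
        return "medium"
    return "large"


def fit_centroids(features, labels):
    """Compute the mean feature vector (centroid) of each class."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_label(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Made-up toy data (assumption): [log10(review count), rating] per app.
features = [[1.0, 3.9], [1.5, 4.1], [3.0, 4.0], [3.5, 4.2], [5.0, 4.3], [5.5, 4.4]]
installs = [500, 2_000, 50_000, 300_000, 5_000_000, 20_000_000]

labels = [market_size_label(n) for n in installs]
centroids = fit_centroids(features, labels)
print(predict_label(centroids, [3.2, 4.0]))
```

Turning installs into a categorical target is what makes this a classification problem rather than a regression one; any standard classifier could take the place of the nearest-centroid step.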

Presentation of data-driven insights

Streamlit link