limivann/app-performance-predictor


app-rating-predictor

SC1015 Data Science Project

Group Members: Aaron, Ivan, Yifei

Project Plan

Marking Rubrics

  1. 10% for coming up with your own problem definition based on a dataset

  2. 10% for data preparation and cleaning to suit the problem of your choice

  3. 20% for exploratory data analysis/visualization to gather relevant insights

  4. 20% for the use of machine learning techniques to solve specific problem

  5. 20% for the presentation of data-driven insights and the recommendations

  6. 10% for the quality of your final team presentation and overall impressions

  7. 10% for learning something new and doing something beyond this course

Problem statement

  1. Main machine learning goal: predict an app's rating from its features.
  2. Which genre of apps has the highest rating?
  3. Which country makes the best apps?

Dev

  1. Best developers and their top categories.
  2. Developers that made the most apps.

Yifei

  1. Does Editor's Choice status affect ratings and installs?
  2. Does the size of an app affect total installs? (Some people don't like to install large apps.)
  3. Do content rating, price (free/paid), and ad support have an impact on the rating?
  4. Do in-game purchases affect the rating?
  5. Do days since last update affect the rating?
  6. Do days since release affect installs? (Find apps that have few installs despite having been released for a long time.)
  7. How to get a "High" rating on the Play Store?

Aaron

  1. FANG: which company made the best apps?
  2. Top Singapore company apps?
  3. App rating distribution
  4. Rating vs. size / price band / free or paid / installs
  5. Pricing trend: how to price your app? (Swarmplot)
  6. Number of reviews vs. number of downloads

Data preparation and cleaning

Dataset link: google-playstore-apps

Scraper folder: ./google-play-scrapper

Exploratory data analysis

Machine learning techniques

Problem 1: Predicting the rating of the app (Numerical)

Linear Regression

Problem 2: Predicting the app price (free/paid)

Classification
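The README does not say which classifier is used, so the following is only a sketch of the simplest tree-style classifier, a one-split decision stump, applied to a free/paid label. The feature (app size in MB), the toy values, and the function names `fit_stump` and `predict` are all hypothetical and not taken from the project.

```python
def fit_stump(xs, labels):
    """Find the single threshold on one feature that maximises training accuracy.

    Returns (threshold, label_above): samples with x > threshold are given
    label_above, all other samples are given 1 - label_above.
    """
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        for label_above in (0, 1):
            preds = [label_above if x > t else 1 - label_above for x in xs]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if best is None or acc > best[0]:
                best = (acc, t, label_above)
    _, threshold, label_above = best
    return threshold, label_above


def predict(x, threshold, label_above):
    """Classify one sample with a fitted stump (1 = paid, 0 = free)."""
    return label_above if x > threshold else 1 - label_above


# Made-up toy data, NOT from the dataset: app size in MB and paid flag.
sizes_mb = [5, 8, 12, 20, 45, 60, 80, 120]
is_paid = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, label_above = fit_stump(sizes_mb, is_paid)
train_acc = sum(
    predict(x, threshold, label_above) == y for x, y in zip(sizes_mb, is_paid)
) / len(is_paid)
print(f"split at {threshold} MB, training accuracy {train_acc:.3f}")
```

A full decision tree repeats this kind of split recursively over many features. Because free apps usually far outnumber paid ones, accuracy alone can be misleading for this problem, so a confusion matrix is worth inspecting as well.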

Problem 3: Predicting the size of the app?

Presentation of data driven insights

Streamlit link