SC1015 Data Science Project
Group Members: Aaron, Ivan, Yifei
- 10% for coming up with your own problem definition based on a dataset
- 10% for data preparation and cleaning to suit the problem of your choice
- 20% for exploratory data analysis/visualization to gather relevant insights
- 20% for the use of machine learning techniques to solve specific problem
- 20% for the presentation of data-driven insights and the recommendations
- 10% for the quality of your final team presentation and overall impressions
- 10% for learning something new and doing something beyond this course
- Main machine learning goal: predict an app's rating from its features.
- Which genre of apps has the highest rating?
- Which country makes the best apps?
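A question like "which genre has the highest rating?" comes down to grouping apps and averaging their ratings. The sketch below shows that step in plain Python. The rows are toy values made up for illustration, and the field names `genre` and `rating` and the helper `mean_rating_by_group` are assumptions, not names from the actual dataset or notebook.

```python
from collections import defaultdict
from statistics import mean

# Toy rows for illustration only; these are NOT values from the real dataset.
# The keys "genre" and "rating" are assumed column names.
apps = [
    {"genre": "Tools", "rating": 4.0},
    {"genre": "Tools", "rating": 4.4},
    {"genre": "Education", "rating": 4.5},
    {"genre": "Education", "rating": 4.7},
    {"genre": "Puzzle", "rating": 3.9},
]


def mean_rating_by_group(rows, key):
    """Return {group: mean rating}, skipping rows that have no rating."""
    groups = defaultdict(list)
    for row in rows:
        if row.get("rating") is not None:
            groups[row[key]].append(row["rating"])
    return {group: mean(values) for group, values in groups.items()}


genre_means = mean_rating_by_group(apps, "genre")
best_genre = max(genre_means, key=genre_means.get)
print(best_genre, round(genre_means[best_genre], 2))
```

The same helper could be pointed at a country or developer field by changing `key`. On the real data it would probably be worth requiring a minimum number of apps per group, since a group with a single highly rated app would otherwise top the list.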
Dev
- Best developers and their top categories.
- Developers that made the most apps.
Yifei
- Does editor's choice affect ratings and installs?
- Does app size affect total installs? (Some people avoid installing large apps.)
- Do content rating, price (free/paid), ad support, and in-app purchases have an impact on the rating?
- Do days since the last update affect the rating?
- Do days since release affect installs? (Find apps that have few installs despite being released long ago.)
- How to get a "High" rating on the Play Store?
Aaron
- App rating distribution
- Top apps from Singapore companies?
- Rating vs. size/price/installs
- Number of reviews vs. number of downloads
- Among the FANG companies, which one makes the best apps?
- How to price your app?
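Pairwise questions such as "number of reviews vs. number of downloads" can be checked with a correlation coefficient. The sketch below computes Pearson's r by hand on log-scaled counts. The log scaling is an assumption made here because install counts span several orders of magnitude. The numbers are toy values for illustration, and `pearson` is a hypothetical helper name.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy counts for illustration only; NOT taken from the real dataset.
reviews = [12, 150, 900, 11_000, 250_000]
installs = [1_000, 10_000, 100_000, 1_000_000, 10_000_000]

# Compare on a log10 scale so the largest apps do not dominate the result.
log_reviews = [math.log10(v) for v in reviews]
log_installs = [math.log10(v) for v in installs]

r = pearson(log_reviews, log_installs)
print(f"Pearson r (log scale): {r:.3f}")
```

A value of r near 1 or -1 suggests a strong linear relationship on the log scale, while a value near 0 suggests little linear relationship. Apps with zero reviews or installs would need to be filtered or offset before taking the logarithm.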
Dataset link: google-playstore-apps
Scraper folder: ./google-play-scrapper
Initial Goal: Predicting the rating of the app (Numerical)
Linear Regression
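As a minimal sketch of the initial goal, the code below fits ordinary least squares with a single feature and uses the fitted line to predict a rating. The feature chosen (days since the last update), the toy values, and the helper name `fit_line` are all assumptions for illustration. A real model would likely use several features and a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy values for illustration only; NOT taken from the real dataset.
days_since_update = [10, 50, 100, 200, 400]
rating = [4.6, 4.5, 4.3, 4.1, 3.5]

slope, intercept = fit_line(days_since_update, rating)

# Predict the rating of a hypothetical app last updated 150 days ago.
predicted = slope * 150 + intercept
print(f"slope={slope:.5f}, intercept={intercept:.3f}, predicted={predicted:.2f}")
```

A negative slope here would mean that, in this toy sample, apps that have gone longer without an update tend to have lower ratings. It says nothing about the real dataset.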
Actual Goal: Predicting the market size of the app (Categorical)
Classification
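Treating market size as a categorical target requires turning a numeric quantity into classes. The sketch below assumes market size is derived from the install count, which these notes do not state explicitly. The class names, the thresholds, and the function name `market_size_label` are hypothetical choices for illustration.

```python
# Hypothetical thresholds; the real project may bin installs differently
# or derive market size from another column entirely.
SMALL_LIMIT = 10_000
MEDIUM_LIMIT = 1_000_000


def market_size_label(installs):
    """Map an install count to an assumed market-size class."""
    if installs < 0:
        raise ValueError("installs must be non-negative")
    if installs < SMALL_LIMIT:
        return "Small"
    if installs < MEDIUM_LIMIT:
        return "Medium"
    return "Large"


# Toy install counts for illustration only.
sample_installs = [500, 10_000, 999_999, 5_000_000]
labels = [market_size_label(n) for n in sample_installs]
print(labels)
```

The resulting labels would serve as the target column for a classifier. Checking how many apps fall into each class before training helps reveal whether the classes are badly imbalanced.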
Streamlit link