The purpose of this project is to create a Jupyter Notebook that uses Polars to generate descriptive statistics. It sets up an environment on GitHub Codespaces and uses GitHub Actions to run the Makefile targets make install, make test, make format, and make lint. It loads the Top 1000 Wealthiest People dataset from Kaggle and uses a custom library to run functions that generate a pie chart and summary statistics about the data.
- Makefile - needed for installation, formatting, testing, and linting
- Dockerfile - defines the environment, ensuring this program runs the same on all devices
- A base set of libraries - needed to run the functions
- A group of format, install, lint, and test .yml files to ensure continuous integration and continuous delivery (CI/CD)
- A brand new library made specifically for this project
- A Jupyter Notebook containing all the code, formatted separately for use
The data analyzed in this project comes from Kaggle.com and covers the top 1000 wealthiest people in the world (https://www.kaggle.com/datasets/muhammadehsan02/top-1000-wealthiest-people-in-the-world).
https://youtu.be/kKU0yW3wxLg?si=BNhBrNWFR40EI4DD
See the sample chart below, which shows the percentage of wealth held by each industry in the dataset.