Exoplanet Habitability

Classification of the exoplanets in the phl_hec_all_confirmed dataset by habitability

Problem Statement

The data for this model comes from the Planetary Habitability Laboratory’s expanded version of NASA’s catalogue of confirmed exoplanets. The task is to classify the exoplanets into their respective classes of habitability (hypopsychroplanet, psychroplanet, mesoplanet and thermoplanet). The dataset includes characteristics of each planet such as temperature, planet mass, planet radius, and the gravitational pull on the surface.

ML Concepts Employed

Data Pre-processing: Data pre-processing refers to the techniques for transforming noisy data, i.e. data with missing or inconsistent values, into clean data that a classification model can use efficiently. Our pre-processing removed attributes with 75% missing values, imputed the remaining missing values with mean, median and mode values, and encoded the categorical attributes in the dataset.
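The pre-processing steps described above can be sketched roughly as follows. This is a minimal illustration with pandas, not the notebook's actual code: the `preprocess` helper, the `max_missing` threshold semantics (drop at 75% or more missing), the choice of median for numeric columns, and the toy column names are all assumptions made for the example.

```python
import pandas as pd


def preprocess(df: pd.DataFrame, max_missing: float = 0.75) -> pd.DataFrame:
    """Drop sparse columns, impute the rest, and encode categoricals."""
    # Drop attributes whose fraction of missing values reaches the threshold.
    missing_fraction = df.isna().mean()
    df = df.loc[:, missing_fraction < max_missing].copy()

    for column in df.columns:
        if pd.api.types.is_numeric_dtype(df[column]):
            # Numeric attribute: fill gaps with the median (the mean is an alternative).
            df[column] = df[column].fillna(df[column].median())
        else:
            # Categorical attribute: fill gaps with the mode, then encode as integer codes.
            df[column] = df[column].fillna(df[column].mode().iloc[0])
            df[column] = df[column].astype("category").cat.codes
    return df


# Tiny made-up table standing in for phl_hec_all_confirmed.csv (hypothetical columns).
toy = pd.DataFrame(
    {
        "mass": [1.0, None, 3.0, 5.0],
        "zone": ["hot", "cold", None, "cold"],
        "mostly_empty": [None, None, None, 2.0],
    }
)
clean = preprocess(toy)
print(clean)
```

On the toy table, the `mostly_empty` column is dropped, the gap in `mass` is filled with the median, and `zone` is filled with its most frequent value before being converted to integer codes.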

Decision Tree: A decision tree is a classification tool that assigns each record to a class by following a sequence of splits on the attribute values. We employed a decision tree model to classify the exoplanets. Training a decision tree can be slower than fitting simpler models, but it classified this dataset with exceptional accuracy: we obtained an accuracy of 99%.
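As a rough illustration of how such a classifier is trained and evaluated, the sketch below uses scikit-learn's `DecisionTreeClassifier` on synthetic data. The features, the temperature cut-offs used to generate labels, and the split parameters are invented for the example and do not come from the dataset or the notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: one made-up "temperature" feature and one noise feature.
rng = np.random.default_rng(0)
temperature = rng.uniform(100.0, 400.0, size=400)
noise = rng.normal(size=400)
X = np.column_stack([temperature, noise])

# Made-up temperature cut-offs, used only to generate four labelled classes.
class_names = np.array(
    ["hypopsychroplanet", "psychroplanet", "mesoplanet", "thermoplanet"]
)
y = class_names[np.digitize(temperature, [175.0, 250.0, 325.0])]

# Hold out a quarter of the rows as the test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The tree learns threshold splits on the attributes from the training data.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out rows only.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.2f}")
```

Because the synthetic labels depend on a single feature through fixed thresholds, the tree separates them easily; accuracy on real data depends on the attributes and the split.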

Confusion Matrix: A confusion matrix tabulates, for each class, how many values the model predicted correctly and how many it predicted incorrectly. The values that are predicted correctly are labelled as true positives and true negatives, and they contribute towards the accuracy of the model.
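To make the relationship between the confusion matrix and accuracy concrete, here is a small self-contained sketch in plain Python. The helper names and the six example labels are made up for illustration; they are not results from the model.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Build a matrix with one row per actual class and one column per predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy_from_matrix(matrix):
    """Correct predictions sit on the diagonal; accuracy is their share of all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels for six planets (illustrative only).
labels = ["psychroplanet", "mesoplanet", "thermoplanet"]
actual = ["mesoplanet", "mesoplanet", "psychroplanet",
          "thermoplanet", "thermoplanet", "psychroplanet"]
predicted = ["mesoplanet", "psychroplanet", "psychroplanet",
             "thermoplanet", "thermoplanet", "psychroplanet"]

matrix = confusion_matrix(actual, predicted, labels)
acc = accuracy_from_matrix(matrix)

for label, row in zip(labels, matrix):
    print(f"{label:>15}: {row}")
print(f"accuracy: {acc:.3f}")
```

Here one mesoplanet is misclassified as a psychroplanet, so it appears off the diagonal and five of the six predictions count towards the accuracy.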

Results

Running our decision tree model on the cleaned dataset produced a set of predicted values for the test split with 99% accuracy.

Figure 1: Confusion Matrix

Figure 2: Area-Under-Curve

Analysis

After carefully analysing the values predicted by the model, we were able to make a few deductions about the dataset. Our decision tree allowed us to achieve a highly accurate classification of exoplanets.

adivar99/Exoplanet-Habitability
