Hotel Rating Prediction

This machine learning project predicts hotel ratings from hotel features and reviews. Several models are built and evaluated to see how accurately they predict the rating given by reviewers.

Project Overview

The project works from a dataset of hotel features and reviews. The dataset is preprocessed to handle missing values, encode categorical variables, and scale numerical features. Both classification and regression models are trained, so the rating is predicted as a class label and as a numeric value.

Data Preprocessing

Preprocessing covers four steps: handling outliers, encoding categorical variables, extracting relevant information from tags, and scaling numerical features. Outliers in certain columns are winsorized (extreme values are clipped to a percentile bound) to limit their impact on the models. Categorical variables are encoded, and tags are extracted and organized into categories. Numerical features are scaled with Min-Max scaling so that features with different ranges become comparable.
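
As a rough illustration of the winsorizing and Min-Max scaling steps described above, here is a minimal sketch. The 5th and 95th percentile bounds, the helper names `winsorize` and `min_max_scale`, and the sample values are assumptions made for the example; the project's actual thresholds and columns are not stated here.

```python
import numpy as np


def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Clip values to the given lower and upper percentiles (bounds are assumed)."""
    values = np.asarray(values, dtype=float)
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # A constant column carries no information; map it to all zeros.
        return np.zeros_like(values)
    return (values - values.min()) / span


# Hypothetical numeric column with one extreme value.
raw = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0])
clipped = winsorize(raw)       # the extreme value is pulled in to the upper bound
scaled = min_max_scale(clipped)
print(scaled.min(), scaled.max())
```

Winsorizing before scaling matters here: without it, a single extreme value would compress every other value into a narrow band near 0.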

Model Training and Evaluation

Logistic Regression

A logistic regression model is trained to predict hotel ratings. The top features with the highest correlation to the target variable are selected for training the model. The model's performance is evaluated using validation and test accuracy scores. The trained logistic regression model is saved for future use.
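
The workflow above (select the features most correlated with the target, train, score on validation and test sets, save the model) could look roughly like the following sketch using scikit-learn. The synthetic data, the column names, the choice of `k = 2`, the 60/20/20 split, and the file name are all assumptions for illustration, not the project's actual values.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "feat_a": rng.normal(size=n),
    "feat_b": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
df["rating"] = (df["feat_a"] + 0.5 * df["feat_b"] > 0).astype(int)

# Assumed 60/20/20 split into training, validation, and test sets.
train_df, tmp_df = train_test_split(df, test_size=0.4, random_state=0)
val_df, test_df = train_test_split(tmp_df, test_size=0.5, random_state=0)

# Rank features by absolute Pearson correlation with the target on the
# training set only, then keep the top k.
k = 2
corr = train_df.corr()["rating"].drop("rating").abs()
top_features = corr.sort_values(ascending=False).head(k).index.tolist()

model = LogisticRegression(max_iter=1000)
model.fit(train_df[top_features], train_df["rating"])

val_acc = accuracy_score(val_df["rating"], model.predict(val_df[top_features]))
test_acc = accuracy_score(test_df["rating"], model.predict(test_df[top_features]))
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")

# Save the trained model and load it back (a temporary directory is used here).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "logistic_regression.pkl"
    path.write_bytes(pickle.dumps(model))
    restored = pickle.loads(path.read_bytes())

same_predictions = bool(
    (restored.predict(test_df[top_features]) == model.predict(test_df[top_features])).all()
)
```

Computing the correlations on the training split alone keeps the validation and test scores free of information leaked through feature selection.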

Decision Tree Classifier

A decision tree classifier is trained to predict hotel ratings. The same top features selected for logistic regression are used as input features. The model's performance is evaluated using validation and test accuracy scores. The trained decision tree classifier is saved for future use.

Random Forest Classifier

A random forest classifier is trained to predict hotel ratings. The model uses 150 estimators, a maximum depth of 25, and a minimum of 75 samples per leaf. The model's performance is evaluated using validation and test accuracy scores. The trained random forest classifier is saved for future use.
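
The hyperparameters listed above map directly onto scikit-learn's `RandomForestClassifier` arguments. The sketch below assumes scikit-learn is the library in use and substitutes synthetic data and a 60/20/20 split for the real dataset, so the printed scores say nothing about the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed hotel features (assumption).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5, random_state=0
)

# Assumed 60/20/20 split into training, validation, and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Hyperparameters as stated in the text: 150 trees, depth up to 25,
# and at least 75 samples in every leaf.
forest = RandomForestClassifier(
    n_estimators=150, max_depth=25, min_samples_leaf=75, random_state=0
)
forest.fit(X_train, y_train)

val_acc = accuracy_score(y_val, forest.predict(X_val))
test_acc = accuracy_score(y_test, forest.predict(X_test))
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")
```

With 75 samples required per leaf, the trees stop splitting long before they reach depth 25 on a dataset of this size, so the leaf constraint is the setting that actually limits model complexity.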

K-Nearest Neighbors Classifier

A k-nearest neighbors classifier is trained to predict hotel ratings. The model uses 21 neighbors for classification. The model's performance is evaluated using validation and test accuracy scores. The trained k-nearest neighbors classifier is saved for future use.
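
A minimal sketch of a 21-neighbor classifier follows, assuming scikit-learn. Because k-nearest neighbors relies on distances, the example places Min-Max scaling in the same pipeline; whether the project chains the steps this way is an assumption, as are the synthetic data and the split ratio.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the preprocessed hotel features (assumption).
X, y = make_classification(
    n_samples=1500, n_features=8, n_informative=4, random_state=1
)

# Assumed 60/20/20 split into training, validation, and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=1)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=1)

# The scaler is fitted on the training data only; the classifier then votes
# among the 21 nearest training points for each prediction.
knn = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=21))
knn.fit(X_train, y_train)

val_acc = accuracy_score(y_val, knn.predict(X_val))
test_acc = accuracy_score(y_test, knn.predict(X_test))
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")
```

An odd neighbor count such as 21 avoids tied votes when there are two classes.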

Regression Models

Linear Regression

A linear regression model is trained to predict hotel ratings. The mean squared error (MSE) is calculated for both the training and validation sets to evaluate the model's performance.
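
To make the metric concrete, the sketch below fits a linear regression on synthetic data and computes the MSE on the training and validation sets, both with scikit-learn's `mean_squared_error` and by hand as the mean of squared residuals. The data, coefficients, and 80/20 split are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical target: a noisy linear function of three features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 3))
y = 2.5 + X @ np.array([4.0, 2.0, 1.0]) + rng.normal(scale=0.5, size=500)

# Assumed 80/20 split into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)

train_mse = mean_squared_error(y_train, model.predict(X_train))
val_mse = mean_squared_error(y_val, model.predict(X_val))

# The same validation MSE computed directly: average of squared residuals.
manual_val_mse = float(np.mean((y_val - model.predict(X_val)) ** 2))

print(f"train MSE: {train_mse:.3f}, validation MSE: {val_mse:.3f}")
```

A validation MSE far above the training MSE would suggest overfitting; similar values suggest the model generalizes about as well as it fits.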

K-Nearest Neighbors Regression

A k-nearest neighbors regression model is trained to predict hotel ratings. The mean squared error (MSE) is calculated for both the training and validation sets to evaluate the model's performance.

Random Forest Regression

A random forest regression model is trained to predict hotel ratings. The mean squared error (MSE) is calculated for both the training and validation sets to evaluate the model's performance.

Gradient Boosting Regression

A gradient boosting regression model is trained to predict hotel ratings. The mean squared error (MSE) is calculated for both the training and validation sets to evaluate the model's performance.
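
The k-nearest neighbors, random forest, and gradient boosting regressors described above can be evaluated with one loop. The sketch assumes scikit-learn, uses library default hyperparameters (the project's settings for these models are not stated), and runs on synthetic data, so the numbers it prints are illustrative only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the preprocessed hotel features (assumption).
X, y = make_regression(n_samples=600, n_features=8, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Default hyperparameters throughout; these are placeholders, not tuned values.
models = {
    "knn": KNeighborsRegressor(),
    "random_forest": RandomForestRegressor(random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    results[name] = (train_mse, val_mse)
    print(f"{name}: train MSE = {train_mse:.1f}, validation MSE = {val_mse:.1f}")
```

Reporting both numbers per model makes it easy to see which models fit the training data closely but lose accuracy on unseen data.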
