Head pose estimation is a machine learning project that predicts the orientation of a person's head. The project draws the three rotational axes of the head (pitch, yaw, and roll) by predicting the corresponding angle for each axis with machine learning models.
The project uses the AFLW2000 dataset, which consists of 2000 images and 2000 MATLAB (.mat) files containing the pitch, yaw, and roll angles of the head for each image.
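The angles for each image can be read from its .mat file with SciPy. A minimal sketch, assuming the angles are stored (in radians) as the first three entries of the `Pose_Para` array, as in the public AFLW2000-3D release:

```python
import numpy as np
from scipy.io import loadmat

def read_pose(mat_path):
    """Read pitch, yaw, roll (in degrees) from an AFLW2000-style .mat file.

    Assumes the angles are the first three entries of 'Pose_Para',
    stored in radians (an assumption about the dataset layout).
    """
    mat = loadmat(mat_path)
    pitch, yaw, roll = mat["Pose_Para"][0][:3]
    return np.degrees([pitch, yaw, roll])
```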
A demo of the final output is available in `test_output.mp4`.
The solution to the problem involved the following steps:
1 - Getting the data ready: downloading the dataset and unzipping it
2 - Extracting 2D landmarks: done with MediaPipe's Face Mesh model
3 - Extracting the pitch, yaw, and roll labels for each image from its .mat file
4 - Normalization: subtracting the nose point from each detected landmark and dividing by the distance between the forehead and the chin. The normalized landmarks are our features.
5 - Model training: several models were trained by feeding them the features and labels
6 - Model optimization: fine-tuning each model to minimize MSE and RMSE
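The normalization in step 4 can be sketched as a small NumPy function. The landmark indices below (nose tip, forehead, chin) are assumptions based on common MediaPipe Face Mesh conventions, not values taken from the project itself:

```python
import numpy as np

# Assumed MediaPipe Face Mesh indices (hypothetical choices for illustration):
NOSE, FOREHEAD, CHIN = 1, 10, 152

def normalize_landmarks(points):
    """Center landmarks on the nose point and scale by the
    forehead-to-chin distance, as described in step 4.

    `points` is an (N, 2) array of x, y landmark coordinates.
    """
    centered = points - points[NOSE]
    scale = np.linalg.norm(points[FOREHEAD] - points[CHIN])
    return centered / scale
```

Centering removes the head's position in the frame, and scaling removes its apparent size, so the features depend only on the head's orientation.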
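Steps 5 and 6 can be sketched with scikit-learn. The regressor, hyperparameters, and synthetic data shapes here are illustrative assumptions, not the project's actual models or features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the real data (hypothetical shapes):
# X = flattened normalized landmarks, y = [pitch, yaw, roll] per image.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = rng.normal(size=(200, 3))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# One multi-output regressor predicting all three angles at once.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
mse = mean_squared_error(y_te, pred)
rmse = np.sqrt(mse)
```

In practice, step 6 would wrap this in a hyperparameter search (e.g. over tree depth or number of estimators) and keep the configuration with the lowest RMSE.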