Projects Overview

This repository contains two projects: a Text-to-Speech (TTS) project using Microsoft's SpeechT5 model and a YOLO Object Detector project using the YOLOv5 model.

Text-to-Speech Project

Description

This project demonstrates the use of the Microsoft SpeechT5 model for text-to-speech synthesis. It provides examples of generating speech from text using pre-trained models and speaker embeddings.

Requirements

torch
transformers
datasets
soundfile

Install the required packages using:

pip install torch transformers datasets soundfile

Usage

1. Initialize the TTS pipeline:
   - Use the pipeline function from transformers to create a text-to-speech pipeline with the microsoft/speecht5_tts model.
2. Load speaker embeddings:
   - Load speaker embeddings from the Matthijs/cmu-arctic-xvectors dataset.
3. Generate and save speech:
   - Generate speech from text and save it as an audio file.

Example Commands

Initialize the pipeline and synthesizer.
Load the embeddings dataset and extract a specific embedding.
Generate speech and save it to a file.
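
The three steps above can be sketched roughly as follows. This is a minimal illustration and not necessarily the notebook's exact code: the function name synthesize_to_file, the output file name speech.wav, and the speaker index 7306 are assumptions chosen for the example, and the validation split is assumed to be the one that holds the x-vectors.

```python
def synthesize_to_file(text, out_path="speech.wav", speaker_index=7306):
    """Generate speech for `text` with SpeechT5 and write it to `out_path`.

    The function name, default file name, and speaker index are
    illustrative assumptions, not values taken from the project.
    """
    if not text or not text.strip():
        # Fail fast before any model or dataset download is triggered.
        raise ValueError("text must be a non-empty string")

    # Heavy dependencies are imported lazily so the check above stays cheap.
    import numpy as np
    import soundfile as sf
    import torch
    from datasets import load_dataset
    from transformers import pipeline

    # Step 1: text-to-speech pipeline backed by microsoft/speecht5_tts.
    synthesiser = pipeline("text-to-speech", model="microsoft/speecht5_tts")

    # Step 2: load the x-vector dataset and pick one speaker embedding.
    embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
    speaker_embedding = torch.tensor(embeddings[speaker_index]["xvector"]).unsqueeze(0)

    # Step 3: synthesize the waveform and save it as an audio file.
    speech = synthesiser(text, forward_params={"speaker_embeddings": speaker_embedding})
    audio = np.asarray(speech["audio"]).squeeze()  # make sure the array is 1-D
    sf.write(out_path, audio, samplerate=speech["sampling_rate"])
    return out_path
```

Calling synthesize_to_file("Hello from SpeechT5.") downloads the model and the embeddings on first use, so the first run needs network access and some disk space. A different speaker_index selects a different voice from the dataset.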

License

This project is licensed under the MIT License.

YOLO Object Detector Project

Description

This project demonstrates the implementation of the YOLOv5 object detection model. It provides examples of loading the model, preprocessing images, performing object detection, and visualizing the results.

Requirements

torch
opencv-python
matplotlib
yolov5 (from the official YOLOv5 repository)

Install the required packages using:

pip install torch opencv-python matplotlib git+https://github.com/ultralytics/yolov5.git

Usage

1. Load the YOLOv5 model:
   - Load the pre-trained YOLOv5 model using the torch.hub.load method.
2. Preprocess images:
   - Preprocess input images to the required format for YOLOv5.
3. Perform object detection:
   - Use the model to detect objects in the preprocessed images.
4. Draw bounding boxes and visualize results:
   - Draw bounding boxes around detected objects and display/save the result.
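
As a rough sketch of the loading and detection steps, the snippet below uses torch.hub.load and reads the detections back as plain Python dictionaries. The yolov5s variant, the 0.25 confidence threshold, and the helper names detect_objects and rows_to_detections are assumptions made for illustration. The model returned by torch.hub resizes the image and applies non-max suppression internally, so this sketch does not preprocess by hand, which may differ from what the notebook does.

```python
def rows_to_detections(rows, names):
    """Convert rows of (x1, y1, x2, y2, confidence, class_id) into dicts."""
    detections = []
    for x1, y1, x2, y2, conf, cls in rows:
        detections.append({
            "box": (round(x1), round(y1), round(x2), round(y2)),
            "confidence": float(conf),
            "label": names[int(cls)],
        })
    return detections


def detect_objects(image_path, conf_threshold=0.25):
    """Run a pre-trained YOLOv5 model on one image (hypothetical helper)."""
    # Imported lazily so rows_to_detections can be used without torch.
    import torch

    # "yolov5s" is an assumed variant; the project only says "pre-trained".
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
    model.conf = conf_threshold  # discard low-confidence detections

    results = model(image_path)      # accepts a file path, URL, or image array
    rows = results.xyxy[0].tolist()  # one row per detection for the first image
    return rows_to_detections(rows, results.names)
```

For example, detect_objects("street.jpg") would return a list such as one dictionary per detected object, each with a pixel box, a confidence score, and a class label; the file name here is only a placeholder.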

Example Commands

Load and preprocess an image.
Perform object detection and apply non-max suppression.
Draw bounding boxes on detected objects and display/save the resulting image.
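
To make the non-max suppression and drawing steps concrete, here is a small sketch under stated assumptions. The greedy NMS is written in plain Python for readability and is not the routine YOLOv5 ships with; the 0.45 IoU threshold, the function names, and the detection format (a dict with box, label, and confidence keys) are all hypothetical choices for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def draw_detections(image_bgr, detections, out_path=None):
    """Draw labelled boxes on a BGR image array, optionally saving it."""
    import cv2  # imported lazily; only needed for drawing

    green = (0, 255, 0)
    for det in detections:
        x1, y1, x2, y2 = det["box"]
        caption = f'{det["label"]} {det["confidence"]:.2f}'
        cv2.rectangle(image_bgr, (x1, y1), (x2, y2), green, 2)
        # Keep the caption inside the image when the box touches the top edge.
        cv2.putText(image_bgr, caption, (x1, max(y1 - 5, 15)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, green, 1)
    if out_path is not None:
        cv2.imwrite(out_path, image_bgr)
    return image_bgr
```

To display the result with matplotlib, convert the array from BGR to RGB first, for instance with cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB), because OpenCV and matplotlib use different channel orders.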

Contributing

George Youhana
Mostafa Magdy
Abdallah Alkhouly
