Stars
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Program to apply a psychoacoustic model onto an input WAV file.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Neural Network Video Interpolation / Super Resolution Filter for VapourSynth
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
📘 Companion source code for the book 《OpenCV3编程入门》 (Introduction to OpenCV3 Programming)
Datasets and Code for "Beyond Scalar Neuron: Adopting Vector-Neuron Capsules for Long-Term Person Re-Identification" and "Celebrities-ReID: A Benchmark for Clothes Variation in Long-Term Person Re-…
[ICCV 2021] Code for approximated exponential maximum pooling
🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolution modules, helpful for understanding the papers in more depth. ⭐⭐⭐
A latent text-to-image diffusion model
Random Erasing Data Augmentation. Experiments on CIFAR10, CIFAR100 and Fashion-MNIST
An out-of-box human parsing representation extractor.
Horizontal Pyramid Matching for Person Re-identification (AAAI 2019)
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification
Pytorch implementation of 'Clothes-Changing Person Re-identification with RGB Modality Only. In CVPR, 2022.'
A grade assistant and GPA calculator for the new academic administration system of Wuhan University (WHU).