athitten/DeepLearningExamples

 
 

NVIDIA Deep Learning Examples for Tensor Cores

Introduction

This repository provides state-of-the-art deep learning examples that are easy to train and deploy, achieving the best reproducible convergence and performance with the NVIDIA CUDA-X software stack running on NVIDIA Volta, Turing, and Ampere GPUs.

NVIDIA GPU Cloud (NGC) Container Registry

These examples, along with the NVIDIA deep learning software stack, are provided in Docker containers that are updated monthly on the NGC container registry (https://ngc.nvidia.com). These containers include:

  • The latest NVIDIA examples from this repository
  • The latest NVIDIA contributions shared upstream to the respective framework
  • The latest NVIDIA deep learning software libraries, such as cuDNN, NCCL, and cuBLAS, all of which have been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance
  • Monthly release notes for each of the NVIDIA optimized containers
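
As a rough illustration of how a monthly container could be selected, the sketch below builds an image reference and the matching docker pull command. The registry path nvcr.io/nvidia, the YY.MM release format, and the per-framework tag suffixes are assumptions that do not come from this README, and ngc_image and docker_pull_command are hypothetical helper names; check the NGC catalog for the exact image names and tags before relying on them.

```python
import re
from typing import List

# Assumed registry path and tag suffixes (not stated in this README).
NGC_REGISTRY = "nvcr.io/nvidia"
TAG_SUFFIX = {
    "pytorch": "py3",
    "tensorflow": "tf1-py3",
    "mxnet": "py3",
}


def ngc_image(framework: str, release: str) -> str:
    """Build an image reference from a framework name and a YY.MM release.

    Hypothetical helper: the naming scheme is an assumption.
    """
    framework = framework.lower()
    if framework not in TAG_SUFFIX:
        raise ValueError(f"unknown framework: {framework!r}")
    # Accept only two-digit year, dot, two-digit month (01 to 12).
    if not re.fullmatch(r"\d{2}\.(0[1-9]|1[0-2])", release):
        raise ValueError(f"release must look like YY.MM, got {release!r}")
    return f"{NGC_REGISTRY}/{framework}:{release}-{TAG_SUFFIX[framework]}"


def docker_pull_command(framework: str, release: str) -> List[str]:
    """Return the argument list for `docker pull` without running it."""
    return ["docker", "pull", ngc_image(framework, release)]


if __name__ == "__main__":
    # Print the command so it can be reviewed before it is executed.
    print(" ".join(docker_pull_command("pytorch", "20.06")))
```

Pinning a specific monthly release instead of a floating tag keeps a training run tied to the software stack it was validated against.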

Computer Vision

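| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT | Notebook |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 v1.5 | PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
| ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
| SE-ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
| Mask R-CNN | PyTorch | N/A | Yes | Yes | - | - | - | - | - | Yes |
| SSD300 v1.1 | PyTorch | Yes | Yes | Yes | - | - | - | - | - | Yes |
| ResNet-50 v1.5 | TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
| ResNeXt101-32x4d | TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
| SE-ResNeXt101-32x4d | TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
| Mask R-CNN | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |
| SSD320 v1.2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | Yes |
| U-Net Industrial | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes | Yes |
| U-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes | - |
| V-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | Yes | - | Yes | - |
| U-Net Medical | TensorFlow-2 | N/A | Yes | Yes | - | Yes | - | - | Yes | - |
| Mask R-CNN | TensorFlow-2 | N/A | Yes | Yes | - | - | - | - | - | - |
| ResNet50 v1.5 | MXNet | Yes | Yes | Yes | - | - | - | - | - | - |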

Natural Language Processing

| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT | Notebook |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BERT | PyTorch | N/A | Yes | Yes | Yes | - | - | Yes | - | - |
| Transformer-XL | PyTorch | N/A | Yes | Yes | Yes | - | - | - | - | - |
| GNMT v2 | PyTorch | N/A | Yes | Yes | - | - | - | - | - | - |
| Transformer | PyTorch | N/A | Yes | Yes | - | - | - | - | - | - |
| BERT | TensorFlow | N/A | Yes | Yes | Yes | Yes | - | Yes | Yes | Yes |
| BioBERT | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | Yes |
| Transformer-XL | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |
| GNMT v2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |
| Faster Transformer | TensorFlow | N/A | - | - | - | Yes | - | - | - | - |

Recommender Systems

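| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT | Notebook |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DLRM | PyTorch | N/A | Yes | Yes | - | - | Yes | Yes | - | Yes |
| Neural Collaborative Filtering | PyTorch | N/A | Yes | Yes | - | - | - | - | - | - |
| Wide and Deep | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |
| Neural Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |
| Variational Autoencoder Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - | - |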

Speech to Text

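| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT | Notebook |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jasper | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - | Yes |
| HMM | Kaldi | N/A | - | Yes | - | - | - | Yes | - | - |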

Text to Speech

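| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT | Notebook |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Tacotron 2 and WaveGlow | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - | - |
| FastPitch | PyTorch | N/A | Yes | Yes | - | - | - | - | - | - |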

NVIDIA support

In each of the network READMEs, we indicate the level of support that will be provided. Support ranges from ongoing updates and improvements to a point-in-time release for thought leadership.

Feedback / Contributions

We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!

Known issues

In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.
