
Commit

Updates for 2023 course edition and update Lisa guide to Snellius (phlippe#122)

* Updates for 2023 course edition and update Lisa guide to Snellius

* Add note about warning when loading Anaconda module on Snellius

* Incorporate Phillip's comments

* Resetting Tutorial 2 and making manual changes

* Tutorial 2 - Fixing other PyTorch versions

* Updating package versions in environment files

---------

Co-authored-by: Phillip Lippe <[email protected]>
ddgoede and phlippe committed Oct 30, 2023
1 parent 662890a commit 1869d52
Showing 11 changed files with 156 additions and 169 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -28,11 +28,11 @@ How to run the notebooks

On this website, you will find the notebooks exported into an HTML format so that you can read them from whatever device you prefer. However, we suggest that you also give them a try and run them yourself. There are three main ways we recommend for running the notebooks:

- **Locally on CPU**: All notebooks are stored on the GitHub repository that also builds this website. You can find them here: https://github.com/phlippe/uvadlc_notebooks/tree/master/docs/tutorial_notebooks. The notebooks are designed so that you can execute them on common laptops without requiring a GPU. We provide pretrained models that are automatically downloaded when running the notebooks, or can be downloaded manually from this [Google Drive](https://drive.google.com/drive/folders/1SevzqrkhHPAifKEHo-gi7J-dVxifvs4c?usp=sharing). The required disk space for the pretrained models and datasets is less than 1GB. To ensure that you have all the right Python packages installed, we provide a conda environment in the [same repository](https://github.com/phlippe/uvadlc_notebooks/blob/master/) (choose the CPU or GPU version depending on your system).

- **Google Colab**: If you prefer to run the notebooks on a different platform than your own computer, or want to experiment with GPU support, we recommend using [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb#recent=true). Each notebook on this documentation website has a badge with a link to open it on Google Colab. Remember to enable GPU support before running the notebook (`Runtime -> Change runtime type`). Each notebook can be executed independently, and doesn't require you to connect your Google Drive or similar. However, when closing the session, changes might be lost if you don't save the notebook to your local computer or haven't copied it to your Google Drive beforehand.

- **Lisa cluster**: If you want to train your own (larger) neural networks based on the notebooks, you can make use of the Lisa cluster. However, this is only suggested if you really want to train a new model, and use the other two options to go through the discussion and analysis of the models. Lisa might not allow you with your student account to run Jupyter notebooks directly on the gpu_shared partition. Instead, you can first convert the notebooks to a script using `jupyter nbconvert --to script ...ipynb`, and then start a job on Lisa for running the script. A few advices when running on Lisa:
- **Snellius cluster**: If you want to train your own (larger) neural networks based on the notebooks, you can make use of the Snellius cluster. However, we only suggest this if you really want to train a new model; use the other two options to go through the discussion and analysis of the models. With a student account, Snellius might not allow you to run Jupyter notebooks directly on the gpu partition. Instead, you can first convert a notebook to a script using `jupyter nbconvert --to script ...ipynb` and then start a job on Snellius to run the script (a short example follows this list). A few tips when running on Snellius:
- Disable the tqdm statements in the notebook; otherwise your SLURM output file might overflow and grow to several MB. In PyTorch Lightning, you can do this by setting `enable_progress_bar=False` in the trainer.
  - Comment out the matplotlib plotting statements, or change `plt.show()` to `plt.savefig(...)`.
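For example, a minimal sketch of this workflow (the notebook and job-file names are hypothetical):

```bash
# Convert a notebook to a plain Python script
jupyter nbconvert --to script "Introduction to PyTorch.ipynb"

# Submit the converted script as a batch job; my_job.sh is a job file you write yourself
sbatch my_job.sh
```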

@@ -41,7 +41,7 @@ Tutorial-Lecture alignment

We will discuss 7 of the tutorials in the course, spread across lectures to cover something from every area. You can align the tutorials with the lectures based on their topics. The list of tutorials is:

- Guide 1: Working with the Lisa cluster
- Guide 1: Working with the Snellius cluster
- Tutorial 2: Introduction to PyTorch
- Tutorial 3: Activation functions
- Tutorial 4: Optimization and Initialization
@@ -50,7 +50,7 @@ We will discuss 7 of the tutorials in the course, spread across lectures to cove
- Tutorial 7: Graph Neural Networks
- Tutorial 8: Deep Energy Models
- Tutorial 9: Autoencoders
- Tutorial 10: Adversarial attacks
- Tutorial 11: Normalizing Flows on image modeling
- Tutorial 12: Autoregressive Image Modeling
- Tutorial 15: Vision Transformers
16 changes: 8 additions & 8 deletions dl2022_cpu.yml → dl2023_cpu.yml
@@ -1,18 +1,18 @@
name: dl2021
name: dl2023
channels:
- pytorch
- conda-forge
- defaults
dependencies:
- python=3.10.6
- pip=22.2.2
- python=3.11.5
- pip=23.3.1
- cpuonly=2.0
- pytorch=1.13.0
- torchvision=0.14.0
- torchaudio=0.13.0
- pytorch=2.1.0
- torchvision=0.16.0
- torchaudio=2.1.0
- pip:
- pytorch-lightning==1.7.7
- tensorboard==2.10.1
- pytorch-lightning==2.1.0
- tensorboard==2.14.1
- tabulate>=0.8.9
- tqdm>=4.62.3
- pillow>=8.0.1
18 changes: 9 additions & 9 deletions dl2022_gpu.yml → dl2023_gpu.yml
@@ -1,19 +1,19 @@
name: dl2022
name: dl2023
channels:
- pytorch
- nvidia
- conda-forge
- defaults
dependencies:
- python=3.10.6
- pip=22.2.2
- pytorch-cuda=11.7
- pytorch=1.13.0
- torchvision=0.14.0
- torchaudio=0.13.0
- python=3.11.5
- pip=23.3.1
- pytorch-cuda=11.8
- pytorch=2.1.0
- torchvision=0.16.0
- torchaudio=2.1.0
- pip:
- pytorch-lightning==1.7.7
- tensorboard==2.10.1
- pytorch-lightning==2.1.0
- tensorboard==2.14.1
- tabulate>=0.8.9
- tqdm>=4.62.3
- pillow>=8.0.1
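As a usage sketch (assuming conda is installed), both environment files create an environment named `dl2023`, which you can set up like this:

```bash
# Pick the file matching your hardware
conda env create -f dl2023_gpu.yml    # on machines with a CUDA GPU
# conda env create -f dl2023_cpu.yml  # on CPU-only machines
conda activate dl2023
```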
2 changes: 1 addition & 1 deletion dl2022_jax_gpu.yml → dl2023_jax_gpu.yml
@@ -1,4 +1,4 @@
name: dl2022_jax
name: dl2023_jax
channels:
- pytorch
- nvidia
26 changes: 13 additions & 13 deletions docs/index.rst
@@ -7,7 +7,7 @@ Welcome to the UvA Deep Learning Tutorials!
===========================================

| *Course website*: https://uvadlc.github.io/
| *Course edition*: DL1 - Fall 2022, DL2 - Spring 2022, Being kept up to date
| *Course edition*: DL1 - Fall 2023, DL2 - Spring 2023, Being kept up to date
| *Repository*: https://github.com/phlippe/uvadlc_notebooks
| *Recordings*: `YouTube Playlist <https://www.youtube.com/playlist?list=PLdlPlO1QhMiAkedeu0aJixfkknLRxk1nA>`_
| *Author*: Phillip Lippe
@@ -31,25 +31,25 @@ Further, the content presented will be relevant for the graded assignment and ex
The tutorials have been integrated as official tutorials of PyTorch Lightning.
Thus, you can also view them in `their documentation <https://pytorch-lightning.readthedocs.io/en/latest/>`_.

Schedule (Deep Learning 1, edition 2022)
Schedule (Deep Learning 1, edition 2023)
----------------------------------------

+------------------------------------------+---------------------------------------------------------------+
| **Date** | **Notebook** |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 1. November 2022, 17.00-18.00 | Tutorial 2: Introduction to PyTorch |
| Tuesday, 31. October 2023, 15:00-16:00 | Tutorial 2: Introduction to PyTorch |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 8. November 2022, 17.00-18.00 | Tutorial 3: Activation functions |
| Tuesday, 7. November 2023, 15:00-16:00 | Tutorial 3: Activation functions |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 15. November 2022, 17.00-18.00 | Tutorial 4: Optimization and Initialization |
| Tuesday, 14. November 2023, 15:00-16:00 | Tutorial 4: Optimization and Initialization |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 22. November 2022, 17.00-18.00 | Tutorial 5: Inception, ResNet and DenseNet |
| Tuesday, 21. November 2023, 15:00-16:00 | Tutorial 5: Inception, ResNet and DenseNet |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 29. November 2022, 17.00-18.00 | Tutorial 6: Transformers and Multi-Head Attention |
| Tuesday, 28. November 2023, 15:00-16:00 | Tutorial 6: Transformers and Multi-Head Attention |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 6. December 2022, 17.00-18.00 | Tutorial 7: Graph Neural Networks |
| Tuesday, 5. December 2023, 15:00-16:00 | Tutorial 7: Graph Neural Networks |
+------------------------------------------+---------------------------------------------------------------+
| Tuesday, 13. December 2022, 17.00-18.00 | Tutorial 17: Self-Supervised Contrastive Learning with SimCLR |
| Tuesday, 12. December 2023, 15:00-16:00 | Tutorial 17: Self-Supervised Contrastive Learning with SimCLR |
+------------------------------------------+---------------------------------------------------------------+

How to run the notebooks
@@ -62,7 +62,7 @@ However, we suggest that you also give them a try and run them yourself. There a

- **Google Colab**: If you prefer to run the notebooks on a different platform than your own computer, or want to experiment with GPU support, we recommend using `Google Colab <https://colab.research.google.com/notebooks/intro.ipynb#recent=true>`_. Each notebook on this documentation website has a badge with a link to open it on Google Colab. Remember to enable GPU support before running the notebook (:code:`Runtime -> Change runtime type`). Each notebook can be executed independently, and doesn't require you to connect your Google Drive or similar. However, when closing the session, changes might be lost if you don't save the notebook to your local computer or haven't copied it to your Google Drive beforehand.

- **Lisa cluster**: If you want to train your own (larger) neural networks based on the notebooks, you can make use of the Lisa cluster. However, this is only suggested if you really want to train a new model, and use the other two options to go through the discussion and analysis of the models. Lisa might not allow you with your student account to run Jupyter notebooks directly on the gpu_shared partition. Instead, you can first convert the notebooks to a script using :code:`jupyter nbconvert --to script ...ipynb`, and then start a job on Lisa for running the script. A few advices when running on Lisa:
- **Snellius cluster**: If you want to train your own (larger) neural networks based on the notebooks, you can make use of the Snellius cluster. However, we only suggest this if you really want to train a new model; use the other two options to go through the discussion and analysis of the models. With a student account, Snellius might not allow you to run Jupyter notebooks directly on the gpu partition. Instead, you can first convert a notebook to a script using :code:`jupyter nbconvert --to script ...ipynb` and then start a job on Snellius to run the script (a minimal job-file sketch follows this list). A few tips when running on Snellius:

- Disable the tqdm statements in the notebook. Otherwise your slurm output file might overflow and be several MB large. In PyTorch Lightning, you can do this by setting :code:`enable_progress_bar=False` in the trainer.
- Comment out the matplotlib plotting statements, or change :code:`plt.show()` to :code:`plt.savefig(...)`.
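A minimal job-file sketch for this workflow (the partition, module name, and file names are assumptions; check the Snellius documentation and `module avail` for the current values):

```bash
#!/bin/bash
#SBATCH --partition=gpu        # assumed partition name; verify on Snellius
#SBATCH --gpus=1
#SBATCH --time=04:00:00
#SBATCH --output=slurm_%j.out

# Load Anaconda and activate the course environment (exact module name may differ)
module load Anaconda3
source activate dl2023

# Run the converted notebook script (hypothetical name)
python Introduction_to_PyTorch.py
```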
@@ -72,7 +72,7 @@ Tutorial-Lecture alignment

We will discuss 7 of the tutorials in the course, spread across lectures to cover something from every area. You can align the tutorials with the lectures based on their topics. The list of tutorials in the Deep Learning 1 course is:

- Guide 1: Working with the Lisa cluster
- Guide 1: Working with the Snellius cluster
- Tutorial 2: Introduction to PyTorch
- Tutorial 3: Activation functions
- Tutorial 4: Optimization and Initialization
@@ -96,10 +96,10 @@ This is the first time we present these tutorials during the Deep Learning cours

If you find the tutorials helpful and would like to cite them, you can use the following bibtex::

@misc{lippe2022uvadlc,
@misc{lippe2023uvadlc,
title = {{UvA Deep Learning Tutorials}},
author = {Phillip Lippe},
year = 2022,
year = 2023,
howpublished = {\url{https://uvadlc-notebooks.readthedocs.io/en/latest/}}
}

6 changes: 3 additions & 3 deletions docs/tutorial_notebooks/guide2/Research_Projects.ipynb
@@ -70,9 +70,9 @@
"\n",
"### Reproducibility\n",
"\n",
"* Everything is about reproducibility. Make sure you can reproduce any training you do with the same random values, batches, etc. You will come to a point where you have tried a lot of different approaches, but none were able to improve upon one of your previous runs. When you try to run the model again with the best hyperparameters, you don't want to have a bad surprise (believe me, enough people have this issue, and it might also happen to you). Hence, before starting any grid search, make sure you are able to reproduce runs. Run two jobs in parallel on Lisa with the same hyperparams, seeds, etc., and if you don't get the exact same results, stop and try to fix it before anything else.\n",
"* Everything is about reproducibility. Make sure you can reproduce any training you do with the same random values, batches, etc. You will come to a point where you have tried a lot of different approaches, but none were able to improve upon one of your previous runs. When you try to run the model again with the best hyperparameters, you don't want to have a bad surprise (believe me, enough people have this issue, and it might also happen to you). Hence, before starting any grid search, make sure you are able to reproduce runs. Run two jobs in parallel on Snellius with the same hyperparams, seeds, etc., and if you don't get the exact same results, stop and try to fix it before anything else.\n",
"* Another fact about reproducibility is that saving and loading a model works without any problems. Make sure before a long training that you are able to load a saved model from the disk, and achieve the exact same test score as you had during training.\n",
"* Print your hyperparameters into the SLURM output file (simple print statement in python). This will help you identifying the runs, and you can easily check whether Lisa executes the job you intended to. Further, hyperparameters should be stored in a separate file in your checkpoint directory, whether saved by PyTorch Lightning or yourself.\n",
"* Snellius executes the job you intended to. Further, hyperparameters should be stored in a separate file in your checkpoint directory, whether saved by PyTorch Lightning or yourself.\n",
"* When running a job, copy the job file automatically to your checkpoint folder. This improves reproducibility by ensuring you have the exact running comment ready.\n",
"* Besides the slurm output file, create a output file in which you store the best training, validation and test score. This helps you when you want to quickly compare multiple models or create statistics of your results.\n",
"* If you want to be on the safe side and use git, you can even print/save the hash of the git commit you are currently on, and any changes you had made to the files. An example of how to do this can be found [here](https://github.com/Nithin-Holla/meme_challenge/blob/f4dc2079acb78ae30caaa31e112c4c210f93bf27/utils/save.py#L26).\n",
@@ -117,7 +117,7 @@
"\n",
"### Grid search with SLURM \n",
"\n",
"* SLURM supports you to do a grid search with [job arrays](https://help.rc.ufl.edu/doc/SLURM_Job_Arrays). We have discussed job arrays in the [Lisa guide](https://uvadlc-notebooks.readthedocs.io/en/latest/common/tutorial_notebooks/tutorial1/Lisa_Cluster.html#Job-Arrays).\n",
"* SLURM supports you to do a grid search with [job arrays](https://help.rc.ufl.edu/doc/SLURM_Job_Arrays). We have discussed job arrays in the [Snellius guide](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial1/Lisa_Cluster.html#Job-Arrays).\n",
"* Job arrays allow you to start N jobs in parallel, each running with slightly different settings.\n",
"* It is effectively the same as creating N job files and calling N times `sbatch ...`, but this can become annoying and is messy at some point."
]
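A job-array sketch of this idea (the learning rates and script name are hypothetical):

```bash
#!/bin/bash
#SBATCH --array=0-3               # starts 4 tasks with IDs 0..3
#SBATCH --output=slurm_%A_%a.out  # %A = array job ID, %a = task ID

# Each task picks a different hyperparameter based on its ID
LRS=(0.1 0.01 0.001 0.0001)
LR=${LRS[$SLURM_ARRAY_TASK_ID]}

python train.py --lr "$LR"
```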