Welcome to the ASCI Computer Vision by Learning course!
UPDATE (May 11): To reduce the workload, we decided that each group only has to do Practicals 1-3 and choose between Practical 4 or Practical 5, but is not required to do both. Each group can individually pick one of the two advanced practicals based on their interests and may skip the other.
This website provides access to the content that we will use for the practical sessions of the ASCI course “Computer Vision by Learning”. In the course, you will work on 5 practicals covering different topics in the domain of machine and deep learning for Computer Vision, increasing in complexity toward recent research topics in the field. Throughout the course, you will be guided by Teaching Assistants who can help you with any questions you may have. The teaching team consists of: Adeel Pervez, David Knigge, David Zhang, Tao Hu, Yunhua Zhang, Zenglin Shi, and Phillip Lippe.
The practicals require familiarity with the Deep Learning framework “PyTorch” and common scientific computing packages in Python such as NumPy. If you are not familiar with PyTorch or would like a refresher, please check out our tutorial Introduction to PyTorch before the course. Additionally, the practicals will require you to train several deep learning models. While we tried to reduce the computational cost of each training run as much as possible, you will need access to a GPU to finish the practicals. For more details, please check the instructions in the section How to run the notebooks, and make sure to have your preferred solution set up and ready to go at the start of the course.
The practicals are intended to be solved in groups of 2 students. You can find teammates before the course starts if you know fellow students, or form groups during the first practical session. Once you have formed a group, please sign it up in this Google spreadsheet.
For any remaining questions regarding the practicals, please contact us at email@example.com.
Monday, 9 May 2022
Tuesday, 10 May 2022
Practical 1: Multi-Layer Perceptrons
Practical 2: Convolutional Neural Networks
Practical 3: Vision Transformers
Wednesday, 11 May 2022
Practical 4: Regular Group Convolutions
Thursday, 12 May 2022
Practical 5: Self-Supervised Contrastive Learning
The 5 practicals are aligned with the lectures of each day. For the first two days, you are expected to finish the first three practicals. Ideally, you would finish Practical 1 and start Practical 2 on Monday, then finish Practical 2 as well as Practical 3 on Tuesday. For the remaining two days, one practical per day is scheduled, aligning with the first lecture of each day. On each day, we have a practical session in the Hotel Casa from 13.30-17.00, where TAs will be in the room to answer your questions.
How to run the notebooks
On this website, you will find the notebooks exported to HTML so that you can read them on whatever device you prefer. Your task is to fill in the notebooks to solve the practicals. There are three main ways of running the notebooks that we recommend:
Locally on GPU: If you have a laptop with a built-in NVIDIA GPU, we recommend running the practicals on your own machine. All notebooks are stored in the GitHub repository that also builds this website. You can find them here: https://github.com/phlippe/asci_cbl_practicals/. While Practical 1 can be executed on common laptops without a GPU, the later practicals require access to a GPU to keep the training times in a reasonable range. Nonetheless, if you prefer, you can write and test most of your code on a CPU-only system, i.e. your own laptop, and once your code is tested and ready, use one of the remaining options to train the model. To ensure that you have all the right Python packages installed, we provide a conda environment in the same repository.
Google Colab: If you do not have access to a GPU on your local machine, you can make use of Google Colab. Google Colab provides free access to GPUs; you can activate GPU support via Runtime -> Change runtime type -> Hardware accelerator: GPU. Each notebook on this documentation website has a badge with a link to open it directly on Google Colab. We highly recommend copying the notebook to your own Google Drive before starting: when a session closes, your changes are lost unless you have saved the notebook to your local computer or copied it to your Google Drive beforehand. Also note that with a free account, Google Colab is limited to one session at a time, and each session has a time limit.
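Whichever option you use, it is worth verifying that your session actually sees a GPU before launching a long training run. The sketch below is a minimal check, assuming PyTorch is installed (it falls back to CPU otherwise); the function name `pick_device` is just an example:

```python
import importlib.util


def pick_device():
    """Return the best available torch device string, falling back to CPU."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed at all
    import torch
    if torch.cuda.is_available():
        return "cuda"  # GPU runtime is active and visible to PyTorch
    return "cpu"


device = pick_device()
print(f"Training will run on: {device}")
```

If this prints `cpu` on Colab, double-check that you selected the GPU hardware accelerator and restarted the runtime.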
Compute cluster: If you have access to a compute cluster, we recommend using it for your final training runs. Depending on your preference, you can implement the practicals either locally or on Google Colab. Once your notebook is ready, you can first convert it to a script using jupyter nbconvert --to script ...ipynb, and then start a job on the cluster to run the script. A few pieces of advice when running on clusters:
Disable the tqdm statements in the notebook. Otherwise your slurm output file might overflow and grow to several MB.
Comment out the matplotlib plotting statements, or change them to save the figures to disk instead of displaying them.
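Both adjustments can be made conditional so the same notebook code works locally and on the cluster. The sketch below is one possible way to do this (the slurm detection via `SLURM_JOB_ID` and the tqdm fallback are assumptions, not part of the practicals):

```python
import os

# Slurm sets SLURM_JOB_ID inside a job, so we can use it to detect the cluster.
ON_CLUSTER = "SLURM_JOB_ID" in os.environ

try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the snippet also runs where tqdm is not installed.
    def tqdm(iterable, **kwargs):
        return iterable

total = 0
# disable=True suppresses the progress bar, keeping the slurm log small.
for step in tqdm(range(100), disable=ON_CLUSTER):
    total += step

# For matplotlib, the analogous pattern would be (sketch, not run here):
#   plt.savefig("curve.png") if ON_CLUSTER else plt.show()
print(f"Finished with total={total}")
```

Passing `disable=ON_CLUSTER` keeps the progress bars when you develop locally while silencing them inside slurm jobs.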
At the end of the course, you are expected to submit a report on your findings from the practicals. The report should be prepared as a PDF using the LaTeX template provided in the repository at report/. In the report, you are expected to answer the questions in the practicals and include any figures requested there (e.g. the training and/or evaluation curves of a model). Your report should have roughly 1 page per practical, with a maximum of 8 pages. For submission, bundle your report together with your filled notebooks into a zip file and send it to firstname.lastname@example.org. Please make sure to clear all outputs of the notebooks before submitting, and exclude any datasets or trained models from your submission.
Please remember that we do not allow any form of plagiarism. Any plagiarism will lead to all team members failing the course.
- Tutorial: Introduction to PyTorch
- Practical 1: Multi-Layer Perceptrons
- Practical 2: Convolutional Neural Networks
- Practical 3: Vision Transformers
- Practical 4: Regular Group Convolutions
- Practical 5: Self-Supervised Contrastive Learning with SimCLR