
Accessing GPUs on the Daisi platform

Daisi offers access to GPUs (NVIDIA A100 PCIe 80GB as of September 2022).

Making your code use the GPUs is straightforward: simply write it as if you were using local GPUs.

In addition, the following two steps are required:

1. Make the GPUs visible to your Daisi

By default, the GPUs are not visible to your code. You need to override this setting with the "CUDA_VISIBLE_DEVICES" environment variable, as follows:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

Note: each node of the Daisi cluster features two A100 PCIe GPUs with 80GB of memory each.
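A minimal sketch of this step: the environment variable must be set before any framework (torch, TensorFlow, etc.) initializes CUDA, so place it at the very top of your script. Setting it to "0" instead of "0,1" would expose only the first GPU.

```python
import os

# Must run BEFORE any library initializes CUDA, so keep it at the
# top of your script. "0,1" exposes both A100s on the node.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# CUDA-aware libraries imported after this point will only see
# the devices listed here.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```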

2. For torch users, update your requirements file

If your code uses torch, the first two lines of your requirements file should be:

--find-links https://download.pytorch.org/whl/torch_stable.html
torch==1.12.0+cu116
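Once deployed, you can sanity-check that the CUDA build of torch is active. Below is a small hedged helper (`cuda_report` is a hypothetical name, not a Daisi API) that reports what it finds and degrades gracefully if torch is absent:

```python
def cuda_report():
    """Return a small dict describing the visible CUDA setup.

    Hypothetical helper for illustration; assumes the torch==1.12.0+cu116
    wheel from the requirements file above when torch is installed.
    """
    try:
        import torch
    except ImportError:
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }

print(cuda_report())
```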

And that's all! Check the deployment of Stable Diffusion on Daisi for a good example of accessing GPUs on the Daisi platform.

Cold vs warm start

The first execution of a Daisi requires starting the corresponding web service. In the case of a large ML model running on GPUs, loading the model into memory can take some time, up to tens of seconds, so be patient. Once the model is loaded, subsequent executions will run as fast as your code can run on the GPUs.

If the service is not used for 60 minutes, it is shut down, and the next execution will again be a cold start.
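The cold-vs-warm behavior comes from paying the model-load cost once and reusing the loaded copy afterwards. A generic sketch of the pattern (not Daisi internals; the sleep stands in for a real model load, which can take tens of seconds):

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1)
def load_model():
    """Stand-in for an expensive model load (hypothetical)."""
    time.sleep(0.2)  # simulate load time
    return {"weights": "..."}

def predict(x):
    model = load_model()  # first call pays the load cost; later calls reuse it
    return f"prediction for {x!r} with {len(model)} tensor group(s)"

t0 = time.perf_counter(); predict("a"); cold = time.perf_counter() - t0
t0 = time.perf_counter(); predict("b"); warm = time.perf_counter() - t0
print(f"cold={cold:.3f}s warm={warm:.3f}s")  # warm call skips the load entirely
```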