# prefect-community
k
Hi, I am looking into building pipelines that involve GPU work on such an irregular basis that autoscaling is a must. In my case, dask is not really an option: it interferes with the multiprocessing of PyTorch's DataLoader and was otherwise rather unstable for my GPU workload. To keep it simple, I would like to start a flow that I know needs a GPU on its own instance and do the processing there. The nodes would need to spin up or down depending on current demand. Any pointers on how best to achieve this? Currently I see a few possibilities, but I'm not sure which one is best:
• Spin up a new agent on a GPU node before the flow is scheduled, then route the flow to it using tags (sketched below)
• Somehow use KubernetesRun to request a GPU and let Kubernetes handle the up- and downscaling
• Only use prefect to trigger an ECS job
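(For reference, the first option above, a dedicated agent plus flow labels, which is what Prefect 1.x calls these tags, would look roughly like this; the "gpu" label and the agent command are only illustrative:)

from prefect import Flow
from prefect.run_configs import UniversalRun

with Flow("gpu-flow") as flow:
    ...  # GPU tasks go here

# Only an agent started with a matching label will pick this flow up,
# e.g. run "prefect agent local start --label gpu" on the GPU node.
flow.run_config = UniversalRun(labels=["gpu"])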
s
I've used dask with the DataLoader for multiprocessing - what trouble did you have there?
k
The issue was with daemonic processes not being able to spawn children.
s
Ah yeah, I know that issue - for me, setting this environment variable fixed it:
DASK_DISTRIBUTED__WORKER__DAEMON=False
❤️ 2
That was a gnarly problem, very frustrating!
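(If it's easier to set that from Python than in the deployment environment, the same switch can be flipped through dask's config API in the process that launches the workers; this is just the dask-config spelling of the same setting:)

import dask

# Same effect as DASK_DISTRIBUTED__WORKER__DAEMON=False, as long as it
# runs before the dask cluster / workers are created in this process.
dask.config.set({"distributed.worker.daemon": False})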
h
one other thing is that after you do that, you should also pass a multiprocessing context into the DataLoader (because the dask multiprocessing context is different from what PyTorch typically uses)
Something like this
import multiprocessing as mp
import torch

train_loader = torch.utils.data.DataLoader(
    whole_dataset, sampler=train_sampler, batch_size=batch_size, num_workers=num_workers, multiprocessing_context=mp.get_context('fork')
)
❤️ 1
k
Thanks! I will try this for sure! Any hints for scaling up without Dask? 🙂
s
Sorry, we're the dask people! 🙂 We also helped build a library that made parallel training with dask + pytorch a lot easier, if it's any help - dask-pytorch-ddp (it's on PyPI)
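(Very roughly, the idea is that you hand an ordinary PyTorch training function to every dask worker and the library handles the DDP coordination; the sketch below is from memory, so treat the import path, dispatch.run, and the backend choice as assumptions and check the project's README:)

from dask.distributed import Client
from dask_pytorch_ddp import dispatch  # assumed import path
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train():
    # dask-pytorch-ddp coordinates the workers so each one can join the
    # same process group; the backend and model here are placeholders.
    dist.init_process_group(backend="nccl")
    model = DDP(torch.nn.Linear(10, 1).cuda())
    # ... usual training loop goes here ...

client = Client()                      # connect to / create a dask cluster
futures = dispatch.run(client, train)  # run train() on every worker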
h
we being @Stephanie Kirmer and me, but other folks here may have other thoughts
@Kilian are you talking about training or inference? Or something different altogether?
k
thanks for the help anyway, maybe going this route will also help achieve the end goal 🙂
h
if you only need a single GPU instance (i.e. you don't need multiple GPU instances that communicate with each other), then prefect + ECS or k8s will probably do just fine (without dask)
if you're trying to do parallel training with DDP, then your PyTorch machines would need some way to coordinate (you need to figure out which one is the master and pass that information around)
k
In the end both, but either one would already help. Yes, only a single GPU instance is necessary, no parallel training. So for that I would use KubernetesRun, specify the need for a GPU there, and let k8s handle the up- and downscaling? Thank you for your help!
upvote 1
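(For reference, a minimal sketch of what that KubernetesRun route could look like in Prefect 1.x; the trimmed-down job template below is an assumption, so compare it against the Kubernetes agent's default template. The nvidia.com/gpu limit keeps the pod Pending until the cluster autoscaler brings up a GPU node, and the autoscaler can remove the node again once it sits idle:)

from prefect import Flow
from prefect.run_configs import KubernetesRun

# Assumed, trimmed-down job template: the only addition over the default
# is the nvidia.com/gpu resource limit on the flow's container.
gpu_job_template = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "flow",
                        "resources": {"limits": {"nvidia.com/gpu": 1}},
                    }
                ]
            }
        }
    },
}

with Flow("gpu-flow") as flow:
    ...  # GPU tasks go here

flow.run_config = KubernetesRun(
    job_template=gpu_job_template,
    labels=["gpu"],  # picked up only by an agent started with this label
)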