shazard

04/09/2020, 2:28 PM
Hello there! I am an engineer working for a small French company looking to improve - actually create - our data science workflows. I have been looking at Prefect for a couple of days now and it looks really good, especially with the open-sourced Core server. One feature we really need, and that I have not been able to find in the documentation or GitHub issues, is running flows (or even individual tasks) on GPU instances such as EC2 ones. As Fargate does not support GPU instances at this time, I guess the Fargate Task Environment is not an option. As a newbie it is entirely possible that I am misunderstanding how Agents and Environments work, but I was envisioning a custom Environment that requests an instance from EC2 during setup and then runs the flow on it. Is that feasible? Or is there an easier way to do it? Anyways, thank you for your work on this project!

josh

04/09/2020, 3:22 PM
Hey @shazard great question! While I do not have much experience in the GPU execution world, I do know that you can absolutely create a custom environment to accomplish this. What would happen in your custom environment is (following the same interface we currently use) the `setup` function would either find or create your EC2 instance with the amount of GPU resources you need, and then the `execute` function would execute your flow using those resources. We welcome any contributions back to the engine if you happen to develop a GPU-based execution environment that others could also use 🙂
To possibly serve as an example, you can see in the `FargateTaskEnvironment` that the setup step creates the task definition and the execute step runs the task (in your case it may be an EC2 instance). https://github.com/PrefectHQ/prefect/blob/master/src/prefect/environments/execution/fargate/fargate_task.py
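A minimal sketch of the setup/execute split described above, for readers who want to see its shape. It does not use Prefect or AWS: `GpuInstanceEnvironment`, `InstanceProvider`, and `FakeProvider` are hypothetical names invented for illustration, the method signatures are simplified and are not Prefect's actual `Environment` signatures, and a real provider would presumably wrap an AWS client instead of returning canned values.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol, Tuple


class InstanceProvider(Protocol):
    """Hypothetical interface; a real one might wrap an AWS EC2 client."""

    def find_or_create(self, instance_type: str) -> str: ...

    def run_container(self, instance_id: str, image: str, command: List[str]) -> int: ...


@dataclass
class GpuInstanceEnvironment:
    """Illustrative only: mirrors the setup-then-execute idea, not Prefect's API."""

    provider: InstanceProvider
    instance_type: str = "p3.2xlarge"  # assumption: any GPU instance type would do
    instance_id: Optional[str] = None

    def setup(self) -> None:
        # Find or create the GPU instance before the flow runs.
        self.instance_id = self.provider.find_or_create(self.instance_type)

    def execute(self, image: str, command: List[str]) -> int:
        # Run the flow's container on the instance prepared by setup().
        if self.instance_id is None:
            raise RuntimeError("setup() must be called before execute()")
        return self.provider.run_container(self.instance_id, image, command)


@dataclass
class FakeProvider:
    """Stand-in provider so the sketch runs without any cloud access."""

    calls: List[Tuple[str, str, Tuple[str, ...]]] = field(default_factory=list)

    def find_or_create(self, instance_type: str) -> str:
        return "i-fake-0001"

    def run_container(self, instance_id: str, image: str, command: List[str]) -> int:
        self.calls.append((instance_id, image, tuple(command)))
        return 0  # pretend the container exited successfully


env = GpuInstanceEnvironment(provider=FakeProvider())
env.setup()
exit_code = env.execute("my-flow-image", ["python", "run_flow.py"])
```

Injecting the provider keeps the cloud-specific part (EC2, SageMaker, or something else) separate from the setup/execute flow, so it can be swapped or faked in tests.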

shazard

04/09/2020, 3:29 PM
Thanks for the answer! I'll have to check what the EC2 API allows as far as running things like Docker containers from Python code goes. I might have to turn to SageMaker directly, actually. I'll definitely try to contribute this kind of environment to the engine if I can manage to code something clean! Thanks again!

Joe Schmid

04/09/2020, 9:47 PM
Hi @shazard, we do exactly this: we use GPUs on AWS with Prefect Flows, where certain tasks are tagged with Dask resources so that they run specifically on GPU instances. We use AWS EKS (their Kubernetes implementation) and the Dask Kubernetes project: https://docs.dask.org/en/latest/setup/kubernetes.html
Here's a thread with some additional info on using resource tags with Prefect Tasks: https://prefect-community.slack.com/archives/CL09KU1K7/p1579531435018200?thread_ts=1579527638.015400&cid=CL09KU1K7
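To illustrate the resource-tagging idea, here is a small sketch of how task tags could be turned into a Dask-style resources mapping. It assumes the tag convention `dask-resource:KEY=NUM` (for example `dask-resource:GPU=1`), which is how Prefect's Dask executor handled such tags at the time as far as I recall; treat the exact format as an assumption and check the linked thread. `resources_from_tags` is a hypothetical helper written for this example, not a Prefect or Dask function.

```python
from typing import Dict, Iterable

# Assumed tag prefix; verify against the Prefect version you are running.
PREFIX = "dask-resource:"


def resources_from_tags(tags: Iterable[str]) -> Dict[str, float]:
    """Collect resource requirements from tags shaped like 'dask-resource:GPU=1'."""
    resources: Dict[str, float] = {}
    for tag in tags:
        if not tag.startswith(PREFIX):
            continue  # ordinary tags carry no resource information
        key, sep, amount = tag[len(PREFIX):].partition("=")
        if not sep or not key:
            raise ValueError(f"malformed resource tag: {tag!r}")
        resources[key] = float(amount)
    return resources


# A task tagged for GPU work, alongside an unrelated label.
gpu_task_tags = {"dask-resource:GPU=1", "training"}
needed = resources_from_tags(gpu_task_tags)
```

For the scheduler to honor such a requirement, the GPU workers themselves have to advertise the matching resource, for instance by starting them with `dask-worker ... --resources "GPU=1"`. Tasks that request `GPU` are then only placed on those workers.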

shazard

04/10/2020, 6:06 AM
Thanks @Joe Schmid! Will definitely check this out.