dex
07/11/2021, 8:26 AM
flows.py has my flow definition, and utils.py hosts a number of helper functions. I'm using GitHub storage for the flow. Since GitHub storage only specifies the path of flows.py, I got a module-not-found error during execution. I wonder if GitHub storage does not support modules? I can't seem to find a good description in the documentation. Thanks in advance if anyone can give me a pointer.
emre
07/11/2021, 10:23 AM
run_config. Things like picking DockerRun with a Docker image that has all the necessary modules present. Explanation below:
emre
07/11/2021, 10:32 AM
utils.some_fun() within a task in your flow.
• GitHub storage serializes your flow (and tasks) via cloudpickle
• cloudpickle stores utils.some_fun() as follows: in a module named utils, there should be a function named some_fun; call that function.
• The serialized flow is pushed to GitHub (probably; I use other storages)
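A minimal sketch of the by-reference behaviour described in the bullets above. It uses the standard library's pickle as a stand-in, on the assumption that cloudpickle treats functions from importable modules the same way. The module name demo_utils is hypothetical and stands in for utils.py; nothing here touches Prefect or GitHub.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo():
    """Pickle a helper by reference, then load it where the module is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical helper module standing in for utils.py.
        Path(tmp, "demo_utils.py").write_text(
            "def some_fun():\n    return 'hello'\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("demo_utils")

        # Serializing the function records only where to find it.
        payload = pickle.dumps(module.some_fun)

        # Simulate an execution environment that lacks the helper module.
        sys.path.remove(tmp)
        del sys.modules["demo_utils"]

    stores_names = b"demo_utils" in payload and b"some_fun" in payload
    stores_body = b"hello" in payload  # the function body is not in the payload

    try:
        pickle.loads(payload)
        error = None
    except ModuleNotFoundError as exc:
        error = exc
    return stores_names, stores_body, error


if __name__ == "__main__":
    print(demo())
```

The payload holds the module and function names but not the function body, so loading it fails with ModuleNotFoundError wherever the helper module cannot be imported.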
So the flow doesn't store external modules. It doesn't even store the prefect module; it depends on the modules present in the executing environment.
dex
07/11/2021, 10:37 AM
Kevin Kho
Michael Duncan
07/14/2021, 4:40 PM
flows.py need to be in that same script? I was under the impression that the whole repo is cloned and we could import from modules defined in the repo (such as utils.py).
Kevin Kho