dex
07/11/2021, 8:26 AMflows.py
has my flow definition, and utils.py
hosts some number of helper functions. And I'm using Github storage for the flow. Since Github storage only specify the path of the flows.py
, it got module not found during execution. I wonder if Github storage does not support module? I can't seem to find a good descrption in the documentation. Thanks in advance if anyone can give me a pointer.emre
07/11/2021, 10:23 AMrun_config
. Things like picking DockerRun
with a docker image that has all the necessary modules present. Explanation below:emre
07/11/2021, 10:32 AMutils.sonme_fun()
within a task in your flow.
• Github Storage serializes your flow (and tasks) via cloudpickle
• cloudpickle stores utils.some_fun()
as, follows: In a module named utils
, there should be a function named some_fun
, call that function.
• Serialized flow is pushed to github (probably, I use other storages)
So, flow doesn't store external modules, It doesn't even store the prefect module, it depends on the modules present in the executing environmentdex
07/11/2021, 10:37 AMKevin Kho
Michael Duncan
07/14/2021, 4:40 PMflows.py
need to be in that same script? I was under the impression that the whole repo is cloned and we could import from modules defined in the repo (such as utils.py
).Kevin Kho
Bring your towel and join one of the fastest growing data communities. Welcome to our second-generation open source orchestration platform, a completely rethought approach to dataflow automation.
Powered by