Sean Talia
10/13/2021, 1:51 AM
numpy, pandas, snowflake-connector-python). Then our users will go and write a handful of their own custom Python classes and modules that they need for their flow. In order to make these custom modules available for use in their flows, people have been creating slight variations of the same Docker image that has that same set of Python packages installed in it, and then just COPY their project's code into the image – at that point, their RunConfig image has everything they need in it to run their flow.
One of the issues I'm foreseeing with this approach is that it's going to lead to a lot of image bloat in terms of the number of images we'll have in use across our flows – images whose Dockerfiles might be found across several different repositories – so we'll be maintaining a lot of images that hardly differ from one another save for a handful of custom Python modules that people copy into them. I'm trying to see if there's an approach that avoids this – or at least avoids it in a way that has a favorable tradeoff. Maybe instead of these custom modules needing to be available at registration/build time, they can simply be retrieved at runtime from S3, for example? If that were possible, the management overhead now moves to S3 rather than our image repository, but I think that's easier to deal with; plus many of our users who need/want to build these flows don't necessarily want to be in the business of building and managing Docker images.
Sam Cook
10/13/2021, 4:19 AM
Sam Cook
10/13/2021, 4:21 AM
Kevin Kho
Sean Talia
10/13/2021, 2:23 PM
from prefect import Flow
from prefect.storage import S3
flow = Flow("s3-flow", storage=S3(bucket="<my-bucket>"))
flow.storage.build()
whenever we define our flows, we always define the flow body as well, and it's the flow body that's going to need to use these lightweight custom modules that users are authoring.
Chris L.
10/14/2021, 1:48 AM
job_template.yaml can clarify the approach outlined above? https://prefect-community.slack.com/archives/CL09KU1K7/p1634175625338700?thread_ts=1634016930.185300&cid=CL09KU1K7
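
For reference, here is a rough sketch of the idea raised earlier in the thread: pulling the lightweight custom modules from S3 at runtime instead of baking them into an image. This is not something Prefect provides out of the box as far as this thread establishes. The function names, the zip-archive layout, and the bucket/key arguments are all assumptions made for illustration. It assumes the custom modules are zipped and uploaded to S3 by some separate step, and that boto3 is installed in the shared base image.

```python
import importlib
import io
import sys
import tempfile
import zipfile


def fetch_archive_from_s3(bucket: str, key: str) -> bytes:
    """Download a zip archive of custom modules from S3 (hypothetical helper)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


def install_modules(archive: bytes) -> str:
    """Unpack a zip archive into a temp directory and put it on sys.path."""
    target = tempfile.mkdtemp(prefix="flow_modules_")
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        zf.extractall(target)
    if target not in sys.path:
        sys.path.insert(0, target)
    # Make sure the import system notices the newly added files.
    importlib.invalidate_caches()
    return target
```

With something like this, a flow could call install_modules(fetch_archive_from_s3("<my-bucket>", "<key>")) at the start of a run and only then import the custom modules, so the import would have to happen after that call rather than at the top of the flow file. Whether that ordering fits how the flow body is loaded under a given storage option would need to be checked.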