# prefect-community
k
Hello everyone, I have a couple of questions regarding deployment. I am considering using Docker or GCS Storage for my flows and trying to understand what makes the most sense. 1. When I use either, do I still run the agent in a virtual environment that includes the imports used by my tasks? Perhaps one is different than the other in that aspect? 2. What is the usual way to register a flow? Do people just use an ad hoc script whenever they have a new flow or perhaps do some kind of diffing & registering what isn't there as part of a release process? When is re-registering a flow ever required? Perhaps when changing the schedule?
👀 1
n
Hi @Kostas Chalikias 1. If you're building an image with Docker, any dependencies that can't be installed with pip should be included in the `base_image` you use to build the Flow. With that, your agent won't need access to those dependencies, just to the registry you use to store the Docker image. GCS, however, stores your Flow as bytes in your bucket of choice but does not serialize your dependencies; in this case, your agent would need access (whether through a virtual env or otherwise) to the dependencies your Flow uses. 2. I'll let the community chime in here, but generally any flows that don't use file-based storage will need to be re-registered when any changes are made. Doing so will re-serialize the Flow and bump the version number. Some users have more complicated CI/CD pipelines that might be helpful, particularly if you have some specific questions.
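For the GCS case described above, a quick check run inside the agent's environment can show which imports are missing before a flow run fails. This is only a minimal sketch using the Python standard library; `missing_modules` and the module names passed to it are hypothetical and not part of Prefect.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that cannot be imported in the current environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError: a parent package of a dotted name is absent.
            # ValueError: the module is loaded but carries no import spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list of the top-level imports used by a flow's tasks.
    print(missing_modules(["json", "pandas", "google.cloud.storage"]))
```

Run it with the same interpreter the agent uses; an empty list means every listed module can be found there. With Docker storage the same check would belong inside the image rather than on the agent.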
k
Thanks @nicholas, 1. makes a lot of sense, it's more or less what I suspected. On 2. - I find the situation somewhat confusing. Re-registering all the flows every time means bumping the version and hence losing the run history and re-enabling any flow schedules I had intentionally paused, which is super annoying and can be dangerous. How can I know which flows I need to re-register, i.e. the ones which changed? I am not sure I get what a flow changing even means in this context. If I build a new container but none of the flow metadata changed, I would hope that I could just use a tag (e.g. latest) to get the new code to run without any re-registration.
n
@Kostas Chalikias you shouldn't need to re-register all your flows every time you make a change, only the flow that has changed needs to be re-registered. In addition, run history won't be lost when re-registering the flow under the same name. If you're using Prefect Cloud, the run history will be fully aggregated on the flow page; this feature isn't yet available in Prefect Server, but previous versions of a flow are always available from the `Versions` tab of the Flow page.
k
@nicholas just curious what you count as a flow having changed. Is it the structure of the flow (# of tasks, dependencies etc) or the code it's running as well?
n
@Kostas Chalikias - if you're on Prefect version `0.12.1` or later, you won't need to re-register the flow if the metadata of the flow hasn't changed, so code changes within tasks don't require re-registration. This is not the case with prior versions.
👍 1
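One way to approach the "diffing & registering" idea from the original question is to fingerprint each flow's metadata in a release script and re-register only the flows whose fingerprint differs from the last release. The sketch below is a hedged illustration, not Prefect's own mechanism: the thread does not spell out exactly what counts as flow metadata, so the dict shape (`tasks`, `edges`, `schedule`) and the helper names `flow_fingerprint` and `flows_to_register` are assumptions.

```python
import hashlib
import json


def flow_fingerprint(metadata):
    """Return a stable SHA-256 hex digest of a flow's metadata dict."""
    # sort_keys makes the digest independent of dict insertion order.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def flows_to_register(current, previous):
    """Return sorted names of flows that are new or whose metadata changed.

    current:  flow name -> metadata dict (assumed shape, see lead-in)
    previous: flow name -> fingerprint saved by the last release
    """
    return sorted(
        name
        for name, metadata in current.items()
        if previous.get(name) != flow_fingerprint(metadata)
    )


if __name__ == "__main__":
    etl = {"tasks": ["extract", "load"], "edges": [["extract", "load"]], "schedule": None}
    saved = {"etl": flow_fingerprint(etl)}  # state left by an earlier release
    # Only the schedule changed here, so "etl" is flagged for re-registration.
    print(flows_to_register({"etl": dict(etl, schedule="0 * * * *")}, saved))
```

A release script could call the real registration step only for the returned names and then save the new fingerprints, which would leave untouched flows (and their paused schedules) alone.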