Hi @Adam, the idempotency key is only used for registration with the Prefect backend; your storage will still be built by default.
Zanie
02/25/2021, 3:55 PM
This is actually a desired behavior for some people, because the serialized_hash is a hash of the serialized flow, which is just metadata. The code of one of your tasks could change slightly and the hash would not change. In this case, your storage would be updated but your flow version in Cloud would not be updated.
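To illustrate the point about metadata hashes, here is a minimal sketch using only the standard library. It is not Prefect's actual serialization or hashing code; the `metadata_hash` and `source_hash` helpers and the example flow dictionary are hypothetical and only show why a hash over structure alone can stay constant while task code changes.

```python
import hashlib
import json


def metadata_hash(flow_metadata: dict) -> str:
    """Hash only the serialized structure (names, edges), not task source code."""
    payload = json.dumps(flow_metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def source_hash(source_code: str) -> str:
    """Hash the source text itself, so any code edit changes the result."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()


# Hypothetical flow: two tasks and one edge between them.
metadata = {"name": "etl", "tasks": ["extract", "load"], "edges": [["extract", "load"]]}

source_v1 = "def extract():\n    return [1, 2, 3]\n"
source_v2 = "def extract():\n    return [1, 2, 3, 4]\n"  # task body changed slightly

# The structure is identical for both versions, so the metadata hash is too.
same_metadata = metadata_hash(metadata) == metadata_hash(dict(metadata))
# The source differs, so a source-level hash detects the change.
source_changed = source_hash(source_v1) != source_hash(source_v2)
```

Under these assumptions, a key derived from the metadata alone would skip a new registration even though the uploaded code differs.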
Adam
02/25/2021, 3:56 PM
Thanks @Zanie - is there any way to build storage only if the flow has changed? Our CI process loops through all our flows and calls the above code. We currently have about 25 flows and it’s growing quite fast, so we want to avoid having to reupload every flow every time
Zanie
02/25/2021, 3:57 PM
I agree that it'd be useful if we could reduce expensive storage rebuilds using a hash/key check. I'll check with the rest of the team to see if there's a better pattern.
Adam
02/25/2021, 3:57 PM
Thanks!
Zanie
02/25/2021, 3:57 PM
I know a lot of people will use git to check if there are changes to any of their flow files
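A rough sketch of the git-based check, assuming flows live in a top-level `flows/` directory and that CI knows a base commit to compare against (for example, the last commit it deployed). The function names, the directory layout, and the choice of base ref are assumptions for illustration; only `git diff --name-only` is a real git command here.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List


def filter_flow_files(diff_output: str, flows_dir: str = "flows") -> List[str]:
    """Keep only the .py files under flows_dir from `git diff --name-only` output."""
    changed = []
    for line in diff_output.splitlines():
        path = PurePosixPath(line.strip())
        # Blank lines have no suffix and are dropped by the first check.
        if path.suffix == ".py" and path.parts and path.parts[0] == flows_dir:
            changed.append(str(path))
    return changed


def changed_flow_files(base_ref: str, head_ref: str = "HEAD",
                       flows_dir: str = "flows") -> List[str]:
    """Ask git which files differ between two commits, then filter to flow files."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base_ref, head_ref],
        capture_output=True, text=True, check=True,
    )
    return filter_flow_files(result.stdout, flows_dir)
```

CI could then register only the flows returned by `changed_flow_files(base_ref)`. Note that this does not catch changes in shared code imported by a flow.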
Adam
02/25/2021, 3:59 PM
Indeed. But hoping to avoid that as things get a bit complicated when everything is already committed to master. Have to start comparing commits etc
Zanie
02/25/2021, 4:02 PM
While I wait for someone to get back to me: we don't really think docker/gcs storage is a great pattern because of the build time here. Generally we'd recommend using a DockerRun (or in your case KubernetesRun) with a base image that has your shared code, and storing your flows using a lighter storage (i.e. S3)
Adam
02/26/2021, 11:21 AM
Thanks @Zanie - that’s exactly what we’re doing though. A base Docker image with all the deps and shared code + KubernetesRun + GCS for flows (equivalent to S3)
Adam
02/26/2021, 11:22 AM
Ah wait, I see the confusion. By GCS I meant Google Cloud Storage (i.e. S3 on google) rather than Google Container Registry 😛
Zanie
02/26/2021, 2:41 PM
Oh I'm sorry! I forget the GCloud acronyms sometimes 🤦♂️
Zanie
02/26/2021, 2:41 PM
Is the upload slow enough to be concerning?
Adam
03/01/2021, 8:18 PM
It’s okay for now, about 3 seconds per flow. I think I’ll write some code to detect what changed - will need that anyway to conditionally trigger Docker rebuilds
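One possible shape for such change detection, without relying on git history, is a content-hash manifest: hash every flow file, compare against the hashes saved by the previous CI run, and upload only the files that differ. This is a sketch under assumptions, not anything from Prefect; `detect_changed`, the manifest file, and the directory layout are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def detect_changed(flows_dir: Path, manifest_path: Path) -> List[str]:
    """Return flow files whose content differs from the saved manifest, then update it."""
    old: Dict[str, str] = {}
    if manifest_path.exists():
        old = json.loads(manifest_path.read_text())

    # Map each flow file (relative path) to its current content hash.
    new = {
        p.relative_to(flows_dir).as_posix(): file_digest(p)
        for p in sorted(flows_dir.rglob("*.py"))
    }

    # New files and files with a different hash both count as changed.
    changed = [name for name, digest in new.items() if old.get(name) != digest]

    manifest_path.write_text(json.dumps(new, indent=2, sort_keys=True))
    return changed
```

In a real pipeline it would be safer to write the manifest only after the uploads succeed, and to store it somewhere that persists between CI runs. The same comparison could be applied to the shared-code directory to decide when to rebuild the Docker base image.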