# prefect-cloud
k
Hi, I have a specific problem and I can't find a simple solution for it. In Prefect Cloud 2 I have a Kubernetes Job block, and it mounts X volumes. From time to time I have a problem with some of them (e.g. 1 is faulty, the rest are fine), since they are external resources from a customer and I can't guarantee they work 100% of the time. I would like to add an additional flow that does some simple network checks against the X servers before enabling the main process that mounts all (or part) of the volumes. Is it possible to dynamically modify a Kubernetes Job block? I would like to be able to run the main flow with X volumes mounted, but also with X-1, X-2, etc. From what I know this won't be possible with one flow, as it needs to have the same infrastructure defined, so modifying it is not an option (or maybe there is a way?). But is it possible with multiple flows, where the first flow mounts nothing and just checks what is available, and based on that triggers another flow with only the valid volumes passed?
n
hi @Karol Wolski - are you familiar with `run_deployment`? From what I can gather from your situation, it seems like you need to do something like:
• check the state of the world somehow
• update a `KubernetesJob` block accordingly
• run deployments using that job block
Is that accurate? If so, it would seem possible to have some dispatcher / parent flow that does some
> simple network checks
to decide how to edit the k8s job block (which you're free to `.load`, modify and `.save` as needed, like any other block) and then call `run_deployment` as needed, where those deployments will reference the same (edited) Kubernetes job block. There could be some trickiness in there if you have many concurrent runs all using that job block, but let me know if you think something like that could work
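The three steps above can be sketched roughly as follows. This is a hedged sketch, not a definitive implementation: the block name `my-k8s-job`, the deployment name `main-flow/main-deployment`, and the shape of the server records are all invented for illustration, and the pure helpers are kept Prefect-free so the Prefect calls sit only at the end.

```python
import copy
import socket


def reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """TCP connect check: can this volume server be reached at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def patch_job_manifest(job: dict, volumes: list, mounts: list) -> dict:
    """Return a copy of a k8s Job manifest with volumes/mounts swapped in."""
    patched = copy.deepcopy(job)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec["volumes"] = volumes
    pod_spec["containers"][0]["volumeMounts"] = mounts
    return patched


def dispatch(servers: list[dict]) -> None:
    """The three steps: check the world, edit the block, run the deployment."""
    # Prefect imports kept local so the pure helpers above run without Prefect.
    from prefect.deployments import run_deployment
    from prefect.infrastructure import KubernetesJob

    healthy = [s for s in servers if reachable(s["host"], s["port"])]

    infra = KubernetesJob.load("my-k8s-job")  # hypothetical block name
    infra.job = patch_job_manifest(
        infra.job,
        volumes=[s["volume"] for s in healthy],
        mounts=[s["volumeMount"] for s in healthy],
    )
    infra.save("my-k8s-job", overwrite=True)

    # run the deployment that references the block we just edited
    run_deployment(name="main-flow/main-deployment")  # hypothetical name
```

Keeping the manifest surgery in a pure function (`patch_job_manifest`) makes it easy to test without touching the Prefect API.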
k
Thank you @Nate. And you are correct about the 3 steps described. I was looking for some 'ad-hoc' definition of `KubernetesJob`, but what you suggested should also be fine for me. I can create a dedicated block that will be manipulated by the simple network check, and it will only be used by this flow later. If that's the only idea you have, I will probably try that
👍 1
n
hmm - so if you did want to do
> some 'ad-hoc' definition of `KubernetesJob`
instead of editing, you could edit the deployment's infra block id itself, passing in an entirely new `KubernetesJob` block ID as a `DeploymentUpdate` (from `prefect.client.schemas.actions`), all of which could go in `client.update_deployment` (here). But to me that sounds maybe messier, since I am thinking you'd likely want to copy most of the config from the existing k8s block?
k
yes, it seems slightly worse to me too. I only need to modify the volumes, so hopefully it will be fairly simple to implement with the first idea. I will let you know once I create it
@Nate, could you help me with one thing? Generally this idea is great: I added a JSON block to keep all the configuration I need, and based on that the first flow does what I need, together with updating the `KubernetesJob` block. But I don't know how to efficiently create dependencies between flows. Currently I have a setup with 1 parent flow, which triggers 2-3 subflows, and every subflow has its own tasks. But with the same parent flow it doesn't create a new Kubernetes Job, and all subflows are executed within the same one. Which option is doable: using 2 parent flows with some dependencies between them, or having 2 separate deployments? In the second case, how do I call deployment B at the end of deployment A?
n
i bet you could do something like this, i.e. fire off a `run_deployment(..., timeout=0)` in the `on_completion` hook of the upstream to trigger the downstream. Also, I was talking internally about your use case and someone else suggested that you could configure 1 deployment per unique `KubernetesJob` config (a set of volumes, it sounds like, in your case) - that way, after your checks, you just have to trigger the right one, and not worry about editing k8s job blocks. Either way, I think state hooks and/or events will be your friend here in avoiding large parent flows that just babysit things to establish dependencies
k
> also, i was talking internally about your use case and someone else suggested that you could configure 1 deployment per unique `KubernetesJob` config (set of volumes it sounds like in your case) - that way, after your checks, you just have to trigger the right one - and not worry about editing k8s job blocks

That seems cool, and having multiple blocks is not a problem. However, I currently have 4 volumes, but in the future there might be many more, like 100. And creating 100 deployments is not great. Is it possible to have multiple flows within one deployment using different infra? That would be the perfect solution for me. In the meantime I will try `run_deployment`, thanks for that.
n
> Is it possible to have multiple flows within one deployment using different infra? That would be the perfect solution for me.

this is something we'd like to enable very soon: you have one deployment, and without fundamentally altering the deployment, you just create a flow run with a patched version of the infra config - which I agree would solve your use case nicely. I will say that this will likely not be implemented with Infrastructure blocks in mind, but rather with work pools. The switch from a `KubernetesJob` infra block to a Kubernetes work pool is really not bad anecdotally, it's just:
• install the `prefect-worker` helm chart instead of `prefect-agent` (same values.yaml, i think)
• create a k8s work pool (it has all the same fields as the `KubernetesJob` block)
• use `prefect deploy myfile.py:myflow -n k8s-work-pool` (docs)
k
ok, so for now I will use 2 separate deployments. And I managed to make it work, thank you very much for your help. I'm running the 1st deployment `connectivity_check` with a `KubernetesJob` block without volumes mounted. I have a JSON with X servers and all the necessary information about them, like IP, port, volume, volumeMount, and some properties. Once it is done, I have this code to run the second deployment:

```python
from prefect.deployments import run_deployment
from prefect.infrastructure import KubernetesJob

# swap in only the volumes that passed the connectivity check
infra = KubernetesJob.load("volume-block")
infra.job["spec"]["template"]["spec"]["volumes"] = volumes
infra.job["spec"]["template"]["spec"]["containers"][0]["volumeMounts"] = volumeMounts
infra.save("volume-block", overwrite=True)

run_deployment(name="parent-landing-flow/landing_deployment")
```

And the 2nd deployment `main_deployment` uses the block just modified by the 1st deployment. It worked as expected: the flow in the 2nd deployment has 3/4 volumes mounted, everything performed automatically. Now I can proceed with this and Prefect seems a little less unknown to me :)