
Yufei Li

07/30/2024, 8:25 PM

Hello Team, we're following the instructions here to replace the deprecated `Deployment.build_from_flow` with `flow.deploy`. Previously we were using the `path` parameter of the `Deployment` class, which indicates the path to the working directory for the workflow, relative to remote storage or, if stored on a local filesystem, an absolute path. However, there seems to be no equivalent parameter in `flow.deploy`, as the only way seems to be using `flow.from_source(source=..., entrypoint=...).deploy(...)`. We wanna use an absolute path for local development, since the code is stored on a local filesystem, instead of pulling from the repo every time. Is there any way to do that?

Nate

07/30/2024, 8:36 PM

hi @Yufei Li https://docs-3.prefect.io/3.0rc/resources/upgrade-agents-to-workers#deploying-from-a-local-file

Nate

07/30/2024, 8:37 PM

these docs are for prefect 3.x, but we've backported the necessary code to 2.x so it should work on newest 2.x as well
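For reference, the local-file pattern amounts to passing a directory as `source` and a file path relative to that directory (plus `:function_name`) as `entrypoint`. Below is a minimal sketch under that assumption. The helper names `split_flow_path` and `deploy_local_flow` are hypothetical, as are any paths, deployment names, and work pool names passed to them; only `flow.from_source(...).deploy(...)` comes from the thread. The Prefect import is deferred so the path helper can be exercised on its own.

```python
from pathlib import Path


def split_flow_path(flow_file: str, flow_func: str, project_root: str) -> tuple[str, str]:
    """Return (source, entrypoint), with entrypoint expressed relative to source.

    Hypothetical helper: it only does path arithmetic and never touches Prefect.
    """
    root = Path(project_root).resolve()
    file_path = Path(flow_file).resolve()
    # relative_to raises ValueError when the flow file is not under the root.
    relative = file_path.relative_to(root)
    return str(root), f"{relative.as_posix()}:{flow_func}"


def deploy_local_flow(flow_file, flow_func, project_root, name, work_pool_name):
    """Create a deployment whose code is read from a local directory.

    Assumes a Prefect version that accepts a local path as `source`.
    """
    from prefect import flow  # deferred so the helper above works without Prefect

    source, entrypoint = split_flow_path(flow_file, flow_func, project_root)
    return flow.from_source(source=source, entrypoint=entrypoint).deploy(
        name=name,
        work_pool_name=work_pool_name,
    )
```

Keeping the entrypoint relative to `source` rules out one class of path mistakes, though the thread does not establish that this was the cause of the error reported below.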

Yufei Li

07/30/2024, 9:57 PM

Hey @Nate thanks for the doc! I'm following that example and here's my deployment code:


deployment_uuid = flow.from_source(source="my local path", # here's my local path
        entrypoint="the entry point").deploy(
        name                = deployment_name,
        schedules           = schedules,
        is_schedule_active  = is_schedule_active,
        parameters          = {"job_uuid": job_uuid, "configuration_values": ""}, 
        work_pool_name      = work_pool_name,
        tags                = deployment_tags
    )

However, it's saying

RuntimeError: Failed to pull contents from remote storage 'my local path' to PosixPath('/var/folders/82/79sjhms94nv46hg57crphffm0000gp/T/tmpnj0twk3m/my local path')

Nate

07/30/2024, 9:59 PM

i suspect there's some more relevant logs just above where you found this

RuntimeError: Failed to pull contents from remote storage 'my local path' to PosixPath('/var/folders/82/79sjhms94nv46hg57crphffm0000gp/T/tmpnj0twk3m/my local path')

can you share more of the trace?

Yufei Li

07/31/2024, 2:08 PM

Hey @Nate sure here's the trace info. Here's the local path `'/Users/yufei.li/tangocard/prefect/prefect/'` I'm using to store the flows. Here's the code that failed:


deployment_uuid = flow.from_source(source="/Users/yufei.li/tangocard/prefect/prefect/",
        entrypoint=deployment_entrypoint.entrypoint).deploy(
        name                = deployment_name,
        schedules           = schedules,
        is_schedule_active  = is_schedule_active,
        parameters          = {"job_uuid": job_uuid, "configuration_values": ""}, # Override
        work_pool_name      = work_pool_name,
        tags                = deployment_tags
    )


Nate

07/31/2024, 2:10 PM


FileNotFoundError: ['/Users/yufei.li/tangocard/prefect/prefect/Users/yufei.li/tangocard/prefect/prefect']

it looks like there's an absolute path where there should be a relative one? or somehow the path has gotten doubled. can you show what you're giving as `source` and `entrypoint`? i suspect one / both of those values are the problem
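The doubled path in the `FileNotFoundError` has the shape you get when an absolute path is treated as relative and appended to a base directory. The sketch below reproduces that shape with `pathlib` only. It illustrates how such a string can arise and does not claim to show what Prefect does internally. The helper names `looks_doubled` and `entrypoint_is_relative` are hypothetical.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/Users/yufei.li/tangocard/prefect/prefect")
absolute = "/Users/yufei.li/tangocard/prefect/prefect"

# Joining with an absolute right-hand side discards the base entirely.
joined = base / absolute

# Stripping the leading slash first turns it into a relative path,
# which yields the same doubled shape as in the traceback.
doubled = base / absolute.lstrip("/")


def looks_doubled(path: str, base_dir: str) -> bool:
    """True if base_dir appears a second time directly after itself in path."""
    prefix = base_dir.rstrip("/") + "/" + base_dir.strip("/")
    return path.startswith(prefix)


def entrypoint_is_relative(entrypoint: str) -> bool:
    """True if the file part of 'path/to/file.py:func' is a relative path."""
    file_part = entrypoint.split(":", 1)[0]
    return not PurePosixPath(file_part).is_absolute()
```

A check like `entrypoint_is_relative` before calling `from_source` is one cheap way to rule out the "absolute where relative was expected" case.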

Yufei Li

07/31/2024, 3:43 PM

Hey @Nate I was able to figure that out by using


deployment = RunnerDeployment.from_entrypoint(
        name                = deployment_name,
        schedules           = schedules,
        is_schedule_active  = is_schedule_active,
        entrypoint          = deployment_entrypoint.entrypoint,
        parameters          = {"job_uuid": job_uuid, "configuration_values": ""}, # Override
        work_pool_name      = work_pool_name,
        tags                = deployment_tags)
deployment_uuid = deployment.apply()

just wanna make sure that the `RunnerDeployment` class wouldn't be deprecated from Sep, 2024, right?

Nate

07/31/2024, 3:53 PM

interesting. i wouldn't think it's necessary to use that, i would think that `from_source` should work, but to answer the question, no we have no plans to deprecate `RunnerDeployment`. that said, in my opinion it might be worthwhile figuring out what was wrong with your use of `from_source` instead of using a lower level util that we don't necessarily intend for direct (public) use in creating deployments

👍 1

Yufei Li

07/31/2024, 4:21 PM

Thank you for the info! I'll send an email to prefect for that `from_source` issue. Once that's figured out, we'll go back to the `flow.from_source(source=..., entrypoint=...).deploy(...)` solution.

Nate

07/31/2024, 4:22 PM

please feel free to open an issue here

Yufei Li

07/31/2024, 4:31 PM

will do. Thank you!

Nathan Low

07/31/2024, 9:30 PM

Are there benefits to doing local code deployments this way vs `.serve`? I ended up going from `build_from_flow` and `apply` to `.serve`, and it’s working ok, but some code is taking a very long time to start up

Nate

07/31/2024, 9:52 PM

`serve` is definitely not a direct replacement for `build_from_flow`. `.deploy` is the successor to `build_from_flow`.

• `serve` circumvents the need for a worker and makes the deployment ready to run now; it's like Deployment + process worker together

• `.deploy` (like `build_from_flow`) will create the deployment for some worker to run it later
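The distinction above can be sketched as two thin wrappers around a flow object. The function names `run_with_serve` and `register_with_deploy` are hypothetical, and `loaded_flow` is assumed to be a Prefect flow, for example one returned by `flow.from_source(...)`. The deployment and work pool names are placeholders.

```python
def run_with_serve(loaded_flow, name: str) -> None:
    """Create the deployment and run it from this process.

    `serve` blocks and keeps listening for runs until the process is
    stopped, so no separate worker is needed.
    """
    loaded_flow.serve(name=name)


def register_with_deploy(loaded_flow, name: str, work_pool_name: str):
    """Create the deployment only and return whatever `deploy` returns.

    Nothing executes here: a worker polling `work_pool_name` picks up
    the runs later.
    """
    return loaded_flow.deploy(name=name, work_pool_name=work_pool_name)
```

In practice only one of the two would be used for a given deployment: `run_with_serve` when a single long-lived process is acceptable, `register_with_deploy` when a worker is already running against the pool.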

Nate

07/31/2024, 9:53 PM

https://docs-3.prefect.io/3.0rc/resources/upgrade-agents-to-workers

Nathan Low

07/31/2024, 9:55 PM

So if I use serve, how does it download the code to run? I’m seeing times of 8 minutes on a serve where I’m serving one deployment inside the same folder. Any way to track what is happening there?

Nathan Low

07/31/2024, 9:59 PM

This green box here is why I figured that serve was the better way to go, is that not the case?

Nate

07/31/2024, 11:23 PM

can you explain what you mean by 8 minutes? the serve process will go forever until it’s killed because it starts listening right away, unlike .deploy (which just creates the deployment). something is wrong if it’s waiting 8 minutes to run a scheduled run it picked up. those docs could be misleading bc there was never anything like .serve before it. everything else just created deployments that you’d run with some other dedicated process later

Nathan Low

08/01/2024, 1:34 PM

Hi Nate, thanks for taking a look. I mean that the step “Downloading flow code from storage at ‘.’” is taking 5-8 minutes regularly for a flow being served where it is the only flow being served. Here’s an example, I have to use my phone as work doesn’t let me use slack. Is there any way to see what is happening in those 7-8 minutes? Any way to get that amount of time to come down?

Nathan Low

08/01/2024, 4:26 PM

When I set the logging to go into debug mode, all I see is importing flow code from my main flow entry point. Any other way to see what it is doing? Any way to reduce the wait time? With agents this was taking about 1 minute to load the code from local, so 8 minutes is a big jump.

Nate

08/01/2024, 4:46 PM

yeah i have never seen it take anywhere near that long to pull flow code, something odd is going on. is there anything out of the ordinary with your repo or setup? e.g. massive repo?

Nathan Low

08/01/2024, 4:48 PM

There’s about 16 files across two folders, the total size of the whole folder is about 800 kb. I know this copies to a different location, but that shouldn’t take 8 minutes for 800 kb, any way to track this down better?

Nate

08/01/2024, 4:51 PM

hrm yeah that seems very ordinary in terms of size. without more information it's hard to say why. are you able to share the file where your flow is defined? perhaps as a gist or something since you said you have to use your phone?

Nathan Low

08/01/2024, 4:53 PM

I could zip it up and send it to an email address later? Mostly pandas work, extracts and loads to sql server, pretty boring stuff.

👍 1

Nate

08/01/2024, 5:04 PM

i asked mostly because I suspect there's some small detail thats causing weirdness, less because of the high level purpose of the flow

Nathan Low

08/01/2024, 5:07 PM

My worry is that maybe a library I’m including, or the fact that I’m importing several files in the initial flow file, could be an issue, but it’s hard to track down as there’s no way to monitor what it’s trying to load along the way. Is there a good email address I could send the code to? I’m out at the moment but should be able to send it in an hour or so.
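One way to test the slow-import suspicion outside of Prefect is to time each import directly. The sketch below uses only the standard library. The function name `time_imports` and the module list are hypothetical and would be replaced with the libraries and local modules the flow file actually imports.

```python
import importlib
import time


def time_imports(module_names):
    """Import each module and return (name, seconds) pairs, slowest first.

    Modules already imported in this interpreter report close to zero,
    so run this in a fresh process for meaningful numbers.
    """
    timings = []
    for name in module_names:
        start = time.perf_counter()
        importlib.import_module(name)
        timings.append((name, time.perf_counter() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder modules; substitute the flow file's real imports.
    for name, seconds in time_imports(["json", "decimal", "csv"]):
        print(f"{name:<12} {seconds:.4f}s")
```

CPython's built-in `python -X importtime your_flow_file.py` gives a finer per-module breakdown without extra code. Neither approach is confirmed in the thread as the cause of the delay; they only narrow down whether imports are responsible.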

Nate

08/01/2024, 5:08 PM

will DM!

👍 1
