y
Hello Team, we're following the instructions here to replace the deprecated `Deployment.build_from_flow` with `flow.deploy`. Previously we were using the `path` parameter of the `Deployment` class, which indicates the path to the working directory for the workflow: relative to remote storage or, if stored on a local filesystem, an absolute path. However, there seems to be no equivalent parameter in `flow.deploy`, as the only way seems to be using `flow.from_source(source=..., entrypoint=...).deploy(...)`. We wanna use an absolute path for local development, since the code is stored on a local filesystem, instead of pulling from the repo every time. Is there any way to do that?
these docs are for prefect 3.x, but we've backported the necessary code to 2.x so it should work on newest 2.x as well
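For reference, a minimal sketch of the local-path pattern being discussed: `source` is the absolute directory that holds the code, and `entrypoint` is the flow file relative to that directory plus the flow function name. The helper names `split_flow_path` and `deploy_from_local_path` are hypothetical, `flow_obj` is assumed to be a `@flow`-decorated function, and the paths are assumed to be POSIX-style as in this thread.

```python
from pathlib import PurePosixPath


def split_flow_path(flow_file: str, function_name: str) -> tuple[str, str]:
    """Split an absolute flow file path into (source, entrypoint).

    source     -> absolute directory containing the flow code
    entrypoint -> "<file relative to source>:<flow function name>"
    """
    path = PurePosixPath(flow_file)
    return str(path.parent), f"{path.name}:{function_name}"


def deploy_from_local_path(flow_obj, flow_file, function_name,
                           deployment_name, work_pool_name):
    """Hypothetical wrapper: register a deployment whose code lives on local disk.

    Assumes flow_obj is a Prefect flow and that a work pool with the given
    name already exists; neither is checked here.
    """
    source, entrypoint = split_flow_path(flow_file, function_name)
    return flow_obj.from_source(source=source, entrypoint=entrypoint).deploy(
        name=deployment_name,
        work_pool_name=work_pool_name,
    )
```

The point of the split is that only `source` is absolute; the entrypoint stays relative to it.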
y
Hey @Nate thanks for the doc! I'm following that example and here's my deployment code:
deployment_uuid = flow.from_source(
    source="my local path",  # here's my local path
    entrypoint="the entry point",
).deploy(
    name=deployment_name,
    schedules=schedules,
    is_schedule_active=is_schedule_active,
    parameters={"job_uuid": job_uuid, "configuration_values": ""},
    work_pool_name=work_pool_name,
    tags=deployment_tags,
)
However, it's saying
RuntimeError: Failed to pull contents from remote storage 'my local path' to PosixPath('/var/folders/82/79sjhms94nv46hg57crphffm0000gp/T/tmpnj0twk3m/my local path')
n
i suspect there's some more relevant logs just above where you found this `RuntimeError: Failed to pull contents from remote storage 'my local path' to PosixPath('/var/folders/82/79sjhms94nv46hg57crphffm0000gp/T/tmpnj0twk3m/my local path')`. can you share more of the trace?
y
Hey @Nate sure here's the trace info. Here's the local path `'/Users/yufei.li/tangocard/prefect/prefect/'` I'm using to store the flows. Here's the code that failed:
deployment_uuid = flow.from_source(
    source="/Users/yufei.li/tangocard/prefect/prefect/",
    entrypoint=deployment_entrypoint.entrypoint,
).deploy(
    name=deployment_name,
    schedules=schedules,
    is_schedule_active=is_schedule_active,
    parameters={"job_uuid": job_uuid, "configuration_values": ""},  # Override
    work_pool_name=work_pool_name,
    tags=deployment_tags,
)
n
FileNotFoundError: ['/Users/yufei.li/tangocard/prefect/prefect/Users/yufei.li/tangocard/prefect/prefect']
it looks like there's an absolute path where there should be a relative one? or somehow the path has gotten doubled. can you show what you're giving as `source` and `entrypoint`? i suspect one / both of those values are the problem
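To illustrate the suspicion, here is a small stdlib-only sketch of one way a doubled path like the one in the `FileNotFoundError` can arise: an entrypoint that is already absolute gets appended to an absolute `source`. This is not confirmed to be what Prefect does internally. `resolve_flow_file` and `entrypoint_is_relative` are hypothetical helpers, and `flows/etl.py:main` is a made-up entrypoint.

```python
from pathlib import PurePosixPath


def resolve_flow_file(source: str, entrypoint: str) -> PurePosixPath:
    """Hypothetical resolver: append the entrypoint's file part to source,
    always treating the entrypoint as relative to source."""
    file_part = entrypoint.split(":", 1)[0]
    return PurePosixPath(source.rstrip("/") + "/" + file_part.lstrip("/"))


def entrypoint_is_relative(entrypoint: str) -> bool:
    """True when the file part of the entrypoint is not an absolute path."""
    return not PurePosixPath(entrypoint.split(":", 1)[0]).is_absolute()


source = "/Users/yufei.li/tangocard/prefect/prefect/"
relative_ep = "flows/etl.py:main"    # hypothetical entrypoint, relative to source
absolute_ep = source + relative_ep   # same file, but given as an absolute path

print(resolve_flow_file(source, relative_ep))  # a single, valid-looking path
print(resolve_flow_file(source, absolute_ep))  # the source prefix appears twice
print(entrypoint_is_relative(relative_ep), entrypoint_is_relative(absolute_ep))
```

A check like `entrypoint_is_relative` before deploying would catch this class of mistake early, if that is indeed the cause.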
y
Hey @Nate I was able to figure that out by using
deployment = RunnerDeployment.from_entrypoint(
    name=deployment_name,
    schedules=schedules,
    is_schedule_active=is_schedule_active,
    entrypoint=deployment_entrypoint.entrypoint,
    parameters={"job_uuid": job_uuid, "configuration_values": ""},  # Override
    work_pool_name=work_pool_name,
    tags=deployment_tags,
)
deployment_uuid = deployment.apply()
just wanna make sure that the `RunnerDeployment` class won't be deprecated as of Sep 2024, right?
n
interesting. i wouldn't think it's necessary to use that, i would think that `from_source` should work. but to answer the question: no, we have no plans to deprecate `RunnerDeployment`. that said, in my opinion it might be worthwhile figuring out what was wrong with your use of `from_source` instead of using a lower level util that we don't necessarily intend for direct (public) use in creating deployments
y
Thank you for the info! I'll send an email to Prefect about that `from_source` issue. Once that's figured out, we'll go back to the `flow.from_source(source=..., entrypoint=...).deploy(...)` solution.
n
please feel free to open an issue here
y
will do. Thank you!
n
Are there benefits to doing local code deployments this way vs .serve? I ended up going from build from flow and apply to that, and it’s working ok, but some code is taking a very long time to start up
n
`serve` is definitely not a direct replacement for `build_from_flow`; `.deploy` is the successor to `build_from_flow`.
• `serve` circumvents the need for a worker and makes the deployment ready to run now; it's like Deployment + process worker together
• `.deploy` (like `build_from_flow`) will create the deployment for some worker to run it later
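The two bullets above can be sketched side by side. This is a rough illustration, not a recommended layout: `flow_obj` is assumed to be a `@flow`-decorated function, the wrapper names are hypothetical, and the work pool is assumed to exist already.

```python
def run_with_serve(flow_obj, deployment_name):
    """serve: create the deployment and keep this process listening for runs.

    With a real Prefect flow this call blocks until the process is stopped,
    so no separate worker is needed.
    """
    flow_obj.serve(name=deployment_name)


def register_with_deploy(flow_obj, source, entrypoint,
                         deployment_name, work_pool_name):
    """deploy: only register the deployment and return.

    A worker polling work_pool_name is expected to pick up and execute
    the runs later.
    """
    return flow_obj.from_source(source=source, entrypoint=entrypoint).deploy(
        name=deployment_name,
        work_pool_name=work_pool_name,
    )
```

In other words, the first function is a long-running process, while the second is a one-off registration step.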
n
So if I use serve, how does it download the code to run? I’m seeing times of 8 minutes on a serve where I’m serving one deployment inside the same folder. Any way to track what is happening there?
This green box here is why I figured that serve was the better way to go, is that not the case?
n
can you explain what you mean by 8 minutes? the serve process will go forever until it’s killed, because it starts listening right away, unlike .deploy (which just creates the deployment). something is wrong if it’s waiting 8 minutes to run a scheduled run it picked up. those docs could be misleading bc there was never anything like .serve before it: everything else just created deployments that you’d run with some other dedicated process later
n
Hi Nate, thanks for taking a look. I mean that the step “Downloading flow code from storage at ‘.’” is regularly taking 5-8 minutes for a flow being served, where it is the only flow being served. Here’s an example; I have to use my phone as work doesn’t let me use Slack. Is there any way to see what is happening in those 7-8 minutes? Any way to get that amount of time to come down?
When I set the logging to go into debug mode, all I see is importing flow code from my main flow entry point. Any other way to see what it is doing? Any way to reduce the wait time? With agents this was taking about 1 minute to load the code from local, so 8 minutes is a big jump.
n
yeah i have never seen it take anywhere near that long to pull flow code, something odd is going on. is there anything out of the ordinary with your repo or setup? e.g. massive repo?
n
There’s about 16 files across two folders, the total size of the whole folder is about 800 kb. I know this copies to a different location, but that shouldn’t take 8 minutes for 800 kb, any way to track this down better?
n
hrm yeah that seems very ordinary in terms of size. without more information it's hard to say why. are you able to share the file where your flow is defined? perhaps as a gist or something, since you said you have to use your phone?
n
I could zip it up and send it to an email address later? Mostly pandas work, extracts and loads to sql server, pretty boring stuff.
n
i asked mostly because I suspect there's some small detail that's causing weirdness, less because of the high level purpose of the flow
n
My worry is that maybe a library I’m including, or the fact that I’m importing several files in the initial flow file, could be the issue, but it’s hard to track down as there’s no way to monitor what it’s trying to load along the way. Is there a good email address I could send the code to? I’m out at the moment but should be able to send it in an hour or so.
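One way to test the slow-import theory outside of Prefect is to time each import directly. A minimal stdlib sketch follows; `time_import` is a hypothetical helper, and the module names are placeholders to be replaced with whatever the flow file actually imports.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Import a module by name and return the elapsed wall-clock seconds.

    Modules already imported in this process come back almost instantly
    from the import cache, so run this in a fresh interpreter.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Placeholder module names; substitute the libraries and local files
# that the flow's entry file imports.
for name in ["json", "csv", "sqlite3"]:
    print(f"{name}: {time_import(name):.3f}s")
```

CPython also has a built-in `-X importtime` option (for example `python -X importtime -c "import pandas"`), which prints a per-module breakdown of import time to stderr.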
n
will DM!