# prefect-community
h
Hi folks -- I'm exploring Prefect Orion as an option to replace Airflow. (I understand it's an evolving technical preview; this is for a longer-term R&D project.) First off, let me compliment the team on the documentation. While maybe not everything is documented yet, the documentation is excellently written. One question that was not immediately clear from the docs (and I've tried some searching but haven't turned up an answer) is whether Deployments can be invoked with arbitrary parameters. It looked from the docs like I would need to create a different Deployment for each parameter set. Maybe stepping back a bit: what I want to do is use an API to trigger "on-demand" pipeline work for a given input (e.g. building map tiles for a geo region). If Deployments aren't the right way to think about this problem, would it be better to just instantiate a Flow with a parameter via the API? (I understand this to be possible based on some docs I was reading last night.) Ultimately, we would want to kick off these ETL or longer-running jobs and be able to get information about where in the system (which task in a flow) a specific job is. It feels like Prefect is a good fit for this workflow. Thanks in advance!
k
It wouldn’t be through the deployment interface, but it would be possible by interacting with the REST API like you suggest. If you have parameters that you will run consistently, then yes, you can have multiple deployments.
h
Ok, thank you. So a Deployment is not necessary when I'm deploying flows that I want to invoke via API? I'm glad I was correct in understanding that a Deployment also includes any parameters (i.e. frozen in there), but under what conditions would I need a Deployment vs. just invoking a Flow via the REST API?
k
Yes, for event-driven flows you shouldn’t need one. A Deployment would be for scheduled flows (e.g. run every day), and then the scheduler will create those runs.
z
This isn’t quite correct. I’ll try to give some bullets that explain the use case for deployments:
• You may run a flow by calling it in Python, and it will be tracked by the API as a flow run. You can do this however you like, wherever you like, as long as you point it to your API.
• If you want to trigger a flow run via the API, you need to “deploy” your flow by creating a deployment.
• A deployment tells us how to get the code for a flow. Optionally, it may include a schedule to automatically trigger runs in a scheduled manner. Also optionally, it may override default parameters for the flow.
• Once you have a deployment, you can create flow runs for it via the API. At this point, you may override the flow or deployment default values for parameters.
• API-triggered flow runs are submitted for execution by agents, which use a flow runner (the default is configured on the deployment but can be overridden per flow run) to create infrastructure, download your code from the location specified in the deployment, and start execution.
upvote 1
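The lifecycle above can be sketched roughly like this (a hedged sketch only: module paths and class names follow the Orion technical-preview docs and may change, and the flow name, deployment name, and `region` parameter here are made up for illustration):

```python
from prefect import flow
from prefect.deployments import DeploymentSpec

@flow
def build_tiles(region: str = "world"):
    # ... ETL work for the given geo region ...
    print(f"building tiles for {region}")

# Calling the flow directly in Python is tracked by the API as a flow run:
#   build_tiles(region="emea")

# A DeploymentSpec tells the API where the flow lives; parameters here
# become defaults that can be overridden per flow run via the API.
DeploymentSpec(
    flow=build_tiles,
    name="build-tiles-on-demand",
    parameters={"region": "emea"},
)
```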
To create a flow run for a deployment with new parameters, you could use the `OrionClient.create_flow_run_from_deployment` method, which allows you to pass parameters: https://orion-docs.prefect.io/api-ref/prefect/client/#prefect.client.OrionClient.create_flow_run_from_deployment
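For anyone hitting the REST API directly rather than going through the Python client, the same call can be sketched with only the standard library. This is a sketch under assumptions: the `/deployments/{id}/create_flow_run` route, the default `http://localhost:4200/api` URL, and the deployment ID are all placeholders to verify against the API reference.

```python
import json
import urllib.request

API_URL = "http://localhost:4200/api"  # assumption: default local Orion server

def create_flow_run_request(deployment_id: str, parameters: dict) -> urllib.request.Request:
    """Build a POST asking the server to create a flow run for a deployment,
    overriding the deployment's default parameters."""
    body = json.dumps({"parameters": parameters}).encode()
    return urllib.request.Request(
        f"{API_URL}/deployments/{deployment_id}/create_flow_run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires a running server):
#   urllib.request.urlopen(create_flow_run_request("<deployment-id>", {"region": "emea"}))
```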
k
Ah, ok, I was wrong. Michael’s answer makes complete sense.
z
The CLI doesn’t support this yet (`prefect deployment run`) because parsing parameters from the CLI into JSON is a bit of a pain and I haven’t implemented it, but we will definitely have it in the future.
upvote 1
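In the meantime, one caller-side workaround is to accept all parameters as a single JSON string and let `json.loads` do the parsing. A minimal sketch (the `--params` flag name is made up for illustration):

```python
import argparse
import json

parser = argparse.ArgumentParser(description="trigger a flow run with parameters")
parser.add_argument(
    "--params",
    type=json.loads,  # parse the CLI string straight into a Python dict
    default={},
    help='flow parameters as a JSON object, e.g. \'{"region": "emea"}\'',
)

# Example invocation, as if called with: --params '{"region": "emea"}'
args = parser.parse_args(["--params", '{"region": "emea"}'])
```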
h
Ah, ok, I think this just became clear to me, but I'll rephrase to make sure I'm understanding:
• I can call flows from my Python code all I want without creating Deployments. In this case it would also be my API code that is responsible for invoking them. It sounds like if we wanted to keep this appropriately decoupled from our webapp backend, we'd write a thin wrapper API that kicks off the flows in this manner.
• If I want to use the Orion API to trigger the flows, then they need to be turned into Deployments, but there is a mechanism to do this with an API. (Though I imagine doing this for ad hoc queries could dirty up the system over time, so maybe it doesn't make the most sense.)
I saw some good blog posts on Prefect 1.0 code organization recommendations, and I think that would be a valuable thing to have eventually for Orion as well. Getting it up and running locally was super easy, and I love how things I'm running in the console magically show up in the server. It's less clear to me how I'd manage large codebases of tasks and flows, what the entrypoints would look like, and how these get packed into Docker containers. Anyway, I think this just comes down to prototyping one of our current Airflow DAGs. Not intending to derail this thread, though. I really appreciate the help!
k
Creating a Deployment using the API directly is not recommended. There is a lot of logic that converts a Deployment into the appropriate API request, and it would be very painful to replicate on your end. The recommendation would be to make a deployment using the Python client and then just invoke it. Yes, we’ll definitely make material around code organization as Orion comes off technical preview. I suspect there are limitations to just invoking the Python code without deployments. As a quick example, I don’t think you can tag Flows through the Flow object; you need the deployment for that. Also, the execution environment (FlowRunner) is specified on the DeploymentSpec. Michael might correct me, though.
👍 1