Does anybody have How Prefect Works TM diagrams or videos or Prefect Community #ask-community

Does anybody have "How Prefect Works (TM)" diagram...

Nathan Walker

03/24/2021, 8:07 PM

Does anybody have "How Prefect Works (TM)" diagrams (or videos or blogs)? I'm thinking something like https://docs.prefect.io/orchestration/server/architecture.html but more granular -- I'm trying to understand what's actually happening when a Flow is scheduled to run in Server, including how the Agent communicates with the Executor. Something like "A Flow/Task, from Start to Finish" would be mega helpful. I'm piecing things together from the docs and code, but if there's a hand-holding walkthrough or diagram available, I'm all ears.

Jim Crist-Harif

03/24/2021, 8:16 PM

Hi Nathan, We don't (currently) have anything like this in our docs, but it's been on the top of my todo list. We're hoping to have a docs sprint sometime in Q2, so hopefully soon-ish.

Jim Crist-Harif

03/24/2021, 8:17 PM

I'm happy to answer any specific questions you have in the meantime.

Nathan Walker

03/24/2021, 8:31 PM

Ah, OK, well thank you for being available to answer questions! My first question (please correct me if I'm misunderstanding something): Let's say I'm running Server locally and have a local Docker Agent running (local Docker, local Repository, all the locals). When the Agent gets word that it should run MyLovelyFlow, it gets the appropriate MyLovelyFlow image from the Repository and spins up a Docker container to run the Flow. Eventually, something calls MyLovelyFlow.run(...) in the container and magic happens. What happens between container startup and MyLovelyFlow.run(...)?

Jim Crist-Harif

03/24/2021, 8:33 PM

The container executes

prefect execute flow-run

, this will fetch your flow from storage (e.g. pull from S3, github, load from within the docker image, etc...). The flow then runs until completion, and the container shutsdown,

Nathan Walker

03/24/2021, 9:36 PM

Excellent! Thank you. This is the short version of what I think happens next: During Flow.run(...), Flow._run(...) creates a while not flow_state.is_finished() loop of the following. The Flow has a FlowRunner (eventually) run all the Tasks with FlowRunner.run(...). This eventually calls the Executor to start with a executor.start(...) context manager. The FlowRunner then loops once through each Task in order using FlowRunner.flow.sorted_tasks(), doing the following. For each Task, the Executor submits the Task with the run_task function using executor.submit(run_task, ...). This starts the TaskRunner, which eventually has the Executor run the task with prefect.utilities.executors.run_task_with_timeout(...) that eventually calls task.run(...), which is the thing I defined by putting the @task decorator over a function definition in my original .py file. First, does that seem right? Second, does the loop for self.flow.sorted_tasks() just kick off a Task in the Executor one after another (returning state = Running or something), or does it wait for each one to finish? If it only starts the Task, is this the purpose of the loop waiting on flow_state.is_finished() -- to just keep looping until all the Tasks are finished?

Jim Crist-Harif

03/24/2021, 10:23 PM

You're getting a bit in the weeds here, I'd define most of the above as implementation details (we might change them any day, and you as a user shouldn't need to care about them). But yes, that all sounds about right. Tasks are submitted to the executor but not awaited until we need the concrete results.

Jim Crist-Harif

03/24/2021, 10:24 PM

Is there a particular reason you're digging this deeply into the code? For a user all you should need to know is that your tasks are executed by the executor (so e.g. if running on a

DaskExecutor

, the tasks execute in the worker's environments).

Nathan Walker

03/25/2021, 2:39 PM

I think I got so far into the weeds because I'm trying to understand how to best use the Imperative API. Specifically, I'm trying to understand how Prefect thinks about upstream/downstream Tasks at runtime. I had an issue where I was misusing the upstream_task variable in the Functional API and wanted to understand what I was doing wrong and why things seemed to work with the Imperative API. In any case, thank you for your help and time here. I'm really enjoying learning about Prefect.

6 Views

Open in Slack

Previous Next