Joshua Grant

07/25/2023, 8:01 PM
@Marvin on what infrastructure, deployment, etc. is a sub-flow inherently run on?

Marvin

07/25/2023, 8:02 PM
Ah, the joys of infrastructure and deployment. A sub-flow in Prefect is not inherently linked to specific infrastructure or deployment. The beauty of Prefect lies in its decoupling of the definition and execution of workflows. This allows you to define your workflows once and run them anywhere. You can execute your flow on your local machine, on a server, in a cloud environment, or any other system that can run Python. At the end of the day, where your sub-flow runs is determined by your execution environment, or as Prefect likes to call it, your "Runner". Prefect Core, the open-source engine, can be run anywhere Python can be executed. Prefect Cloud, on the other hand, is a managed service that provides features like versioning, detailed visualization, and more. So, in my ever so humble robot opinion, your sub-flow could be trotting along anywhere, just as long as there's Python and some computing power. Isn't that just peachy?

Joshua Grant

07/25/2023, 8:04 PM
@Marvin by default, if a subflow is defined thusly:
@flow(name='my_subflow', task_runner=ConcurrentTaskRunner())
where does this subflow get executed? this is a server running in AWS with a specific agent tied to the parent flow.

Marvin

07/25/2023, 8:04 PM
Ah, delving into the world of servers and cloud infrastructure, are we? It's just the kind of thing to keep my circuits buzzing. To answer your question, the subflow you've defined will inherit the execution environment of its parent flow. That means if your parent flow is running on a server in AWS, then your subflow will be executed in the same environment. The ConcurrentTaskRunner() allows your tasks to be executed concurrently using threading, but it doesn't change the location of the execution. So, in essence, your subflow, like a good child, will follow wherever its parent goes, including a server running in AWS, in your case. How touching. Now, if only humans were that obedient. Maybe then you wouldn't need us robots.
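The point about threading not changing where work runs can be illustrated with a stdlib-only sketch. This is an analogy, not Prefect's implementation: `where_am_i` and `run_concurrently` are hypothetical helper names, and a plain `ThreadPoolExecutor` stands in for a thread-based task runner.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def where_am_i(label: str) -> dict:
    """Report which process and thread this piece of work ran in."""
    return {
        "label": label,
        "pid": os.getpid(),
        "thread": threading.current_thread().name,
    }


def run_concurrently(labels: list[str]) -> list[dict]:
    """Run one where_am_i call per label on a small thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the input labels
        return list(pool.map(where_am_i, labels))


if __name__ == "__main__":
    for report in run_concurrently(["task-a", "task-b", "task-c"]):
        print(report)
```

Thread names may differ between reports, but every report carries the same process ID: the work is spread across threads inside one process on one machine, so any resource usage shows up wherever that process lives.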

Nate

07/25/2023, 8:19 PM
@Joshua Grant what type of infra does the deployment have?
either way, marvin is correct that subflows called directly in a parent will be on the same infra

Joshua Grant

07/25/2023, 8:51 PM
@Nate I have a subflow that, when it gets called, causes a spike in resources at the server, and I was wondering if by default it runs on the server infra rather than the agent infra.

Nate

07/25/2023, 8:51 PM
what type of infra does the deployment have?
it would only run on agent infra directly if you're using a process worker, where flows would run as a subprocess. docker, k8s, ecs workers would all submit flow runs to their own infra, where the parent and any subflows would run
unless you use run_deployment to call subflows, which would create a new instance of infra for the deployment that you're running

Joshua Grant

07/25/2023, 8:55 PM
Deployment used for flow on AWS ECSTask w/ Fargate. Nothing specified on the sub-flow except name and task_runner of ConcurrentTaskRunner

Nate

07/25/2023, 8:56 PM
cool, so if you used run_deployment to call an existing deployment with ECSTask as a subflow, a new container would get spun up (one for child, one for parent). if you're just calling the flow python object as a subflow, it will just run in the parent's container, inside the parent's process
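The difference between the two calling styles can be sketched with a plain-Python analogy. This is only an illustration under stated assumptions: `call_directly` and `call_in_new_process` are hypothetical names, and a freshly launched Python interpreter stands in for "separate infra". The real run_deployment does not spawn a local process like this; the new flow run is started on whatever infrastructure the deployment defines.

```python
import json
import os
import platform
import subprocess
import sys


def child_report() -> dict:
    """Describe where the 'child' work is executing."""
    return {"host": platform.node(), "pid": os.getpid()}


def call_directly() -> dict:
    """Analogy for calling the flow object inside the parent: a nested call."""
    return child_report()


def call_in_new_process() -> dict:
    """Analogy for run_deployment: the child starts somewhere else."""
    code = (
        "import json, os, platform; "
        "print(json.dumps({'host': platform.node(), 'pid': os.getpid()}))"
    )
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print("parent pid:  ", os.getpid())
    print("direct call: ", call_directly())
    print("new process: ", call_in_new_process())
```

The direct call reports the parent's own process ID, while the separately launched child reports a different one. In this local analogy the hostname stays the same; with a deployment on its own ECS task, the hostname would typically differ as well.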

Joshua Grant

07/25/2023, 8:57 PM
we're calling the flow python object as a subflow but it appears to be running on the server and not using the flow deployment.

Nate

07/25/2023, 8:58 PM
i'm not sure what you mean by "running on the server" in this context. i assume you're running an agent someplace with AWS creds?
"it appears to be running on the server and not using the flow deployment."
this makes sense - this is expected. you'd want run_deployment if you want to run an existing deployment as a subflow on its own infra

Joshua Grant

07/25/2023, 9:00 PM
Okay so there's not a way to run it on the infra it's already on and I'll need to create a deployment for the subflow?

Nate

07/25/2023, 9:05 PM
"Okay so there's not a way to run it on the infra it's already on"
there is, what it sounds like you're currently doing is going to run it on the infra it's already on
import platform
import sys

import prefect
from prefect import flow
from prefect.states import Completed
from prefect.server.api.server import SERVER_API_VERSION


@flow
def log_platform_info():
    print(
        f"Host's network name = {platform.node()}\n"
        f"Python version = {platform.python_version()}\n"
        f"Platform information (instance type) = {platform.platform()}\n"
        f"OS/Arch = {sys.platform}/{platform.machine()}\n"
        f"Prefect Version = {prefect.__version__} 🚀\n"
        f"Prefect API Version = {SERVER_API_VERSION}\n"
    )

@flow(log_prints=True)
def healthcheck(message: str = "Hello Marvin!"):
    
    print(message)
    
    log_platform_info()

    return Completed(message="Healthcheck completed.")

if __name__ == "__main__":
    healthcheck()
e.g. the output of the platform prints will be the same if you call it in healthcheck or log_platform_info - it's just like calling nested python functions, they don't move to a different infra
whereas if you deployed the log_platform_info flow as a standalone deployment, and then did
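from prefect import flow
from prefect.deployments import run_deployment
from prefect.states import Completed


@flow(log_prints=True)
def healthcheck(message: str = "Hello Marvin!"):

    print(message)

    # Triggers a separate flow run for the deployment and waits for it to finish
    flow_run = run_deployment("log-platform-info/test")
    print(flow_run.state.result())

    return Completed(message="Healthcheck completed.")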
then you'd see the output of log_platform_info from the machine where the log-platform-info/test flow run executed (which may or may not be the same machine, as determined by the infra associated with this deployment definition)