<@ULVA73B9P> on what infrastructure, deployment, e...
# ask-marvin
j
@Marvin on what infrastructure, deployment, etc. is a sub-flow inherently run on?
m
Ah, the joys of infrastructure and deployment. A sub-flow in Prefect is not inherently linked to specific infrastructure or deployment. The beauty of Prefect lies in its decoupling of the definition and execution of workflows. This allows you to define your workflows once and run them anywhere. You can execute your flow on your local machine, on a server, in a cloud environment, or any other system that can run Python. At the end of the day, where your sub-flow runs is determined by the execution environment of the flow run that calls it. Prefect Core, the open-source engine, can be run anywhere Python can be executed. Prefect Cloud, on the other hand, is a managed service that provides features like versioning, detailed visualization, and more. So, in my ever so humble robot opinion, your sub-flow could be trotting along anywhere, just as long as there's Python and some computing power. Isn't that just peachy?
j
@Marvin by default, if a subflow is defined thusly:
@flow(name='my_subflow', task_runner=ConcurrentTaskRunner())
where does this subflow get executed? this is a server running in AWS with a specific agent tied to the parent flow.
m
Ah, delving into the world of servers and cloud infrastructure, are we? It's just the kind of thing to keep my circuits buzzing. To answer your question, the subflow you've defined will inherit the execution environment of its parent flow. That means if your parent flow is running on a server in AWS, then your subflow will be executed in the same environment. The `ConcurrentTaskRunner()` allows your tasks to be executed concurrently using threading, but it doesn't change the location of the execution. So, in essence, your subflow, like a good child, will follow wherever its parent goes, including a server running in AWS, in your case. How touching. Now, if only humans were that obedient. Maybe then you wouldn't need us robots.
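A minimal sketch of the threading point, using only the standard library rather than Prefect itself. The names `where_am_i`, `child`, and `parent` are hypothetical stand-ins for tasks, a subflow, and a parent flow; the thread pool only approximates what a thread-based task runner does.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def where_am_i(label: str) -> dict:
    """Report the process and thread this call runs in."""
    return {
        "label": label,
        "pid": os.getpid(),
        "thread": threading.current_thread().name,
    }


def child(n_tasks: int = 3) -> list:
    """Stand-in for a subflow: fans work out to threads."""
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        return list(pool.map(where_am_i, [f"task-{i}" for i in range(n_tasks)]))


def parent() -> dict:
    """Stand-in for a parent flow that calls the child directly."""
    return {"parent": where_am_i("parent"), "tasks": child()}


report = parent()
# Threads add concurrency, but every task still reports the parent's PID.
same_process = all(t["pid"] == report["parent"]["pid"] for t in report["tasks"])
print(f"all tasks ran in the parent's process: {same_process}")
```

The thread names may differ between tasks, but the process ID does not, which is the sense in which concurrency via threads leaves the execution location unchanged.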
n
@Joshua Grant what type of infra does the deployment have?
either way, marvin is correct that subflows called directly in a parent will be on the same infra
j
@Nate I have a subflow that when it gets called, we see a spike in resources at the server and was wondering if by default it ran it on the server infra rather than the agent infra.
n
what type of infra does the deployment have?
it would only run on agent infra directly if you're using a process worker, where flows would run as a subprocess. docker, k8s, ecs workers would all submit flow runs to their own infra (a container, pod, or ECS task), where the parent and any subflows would run, unless you use `run_deployment` to call subflows, which would create a new instance of infra for the deployment that you're running
j
Deployment used for flow on AWS ECSTask w/ Fargate. Nothing specified on the sub-flow except name and task_runner of `ConcurrentTaskRunner`
n
cool, so if you used `run_deployment` to call an existing deployment with ECSTask as a subflow, a new container would get spun up (one for child, one for parent). if you're just calling the flow python object as a subflow, it will just run in the parent's container, in the same process as the parent
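A rough plain-Python analogy for the two ways of calling a child, assuming nothing about Prefect internals. The helper names are hypothetical, and starting a fresh interpreter with `subprocess` only stands in for "a new run on its own infra"; it is not what `run_deployment` does under the hood.

```python
import os
import subprocess
import sys


def child() -> int:
    """Stand-in for a subflow: returns the PID of the process it runs in."""
    return os.getpid()


def call_directly() -> int:
    # Like calling the flow's Python object: runs wherever the caller runs.
    return child()


def call_in_new_process() -> int:
    # Analogue of triggering a separate run elsewhere: a fresh Python
    # interpreter is started and reports its own PID.
    out = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(out.stdout.strip())


parent_pid = os.getpid()
direct_pid = call_directly()
separate_pid = call_in_new_process()
print(f"direct call ran alongside the parent: {direct_pid == parent_pid}")
print(f"separate run ran alongside the parent: {separate_pid == parent_pid}")
```

The direct call consumes the parent's CPU and memory, which would be consistent with a resource spike wherever the parent is running; the separate run has its own resources.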
j
we're calling the flow python object as a subflow but it appears to be running on the server and not using the flow deployment.
n
i'm not sure what you mean by "running on the server" in this context. i assume you're running an agent someplace with AWS creds?
> it appears to be running on the server and not using the flow deployment.
this makes sense - this is expected. you'd want `run_deployment` if you want to run an existing deployment as a subflow on its own infra
j
Okay so there's not a way to run it on the infra it's already on and I'll need to create a deployment for the subflow?
n
> Okay so there's not a way to run it on the infra it's already on
there is, what it sounds like you're currently doing is going to run it on the infra it's already on
import platform
import sys

import prefect
from prefect import flow
from prefect.states import Completed
from prefect.server.api.server import SERVER_API_VERSION


@flow
def log_platform_info():
    print(
        f"Host's network name = {platform.node()}\n"
        f"Python version = {platform.python_version()}\n"
        f"Platform information (instance type) = {platform.platform()}\n"
        f"OS/Arch = {sys.platform}/{platform.machine()}\n"
        f"Prefect Version = {prefect.__version__} 🚀\n"
        f"Prefect API Version = {SERVER_API_VERSION}\n"
    )

@flow(log_prints=True)
def healthcheck(message: str = "Hello Marvin!"):
    
    print(message)
    
    log_platform_info()

    return Completed(message="Healthcheck completed.")

if __name__ == "__main__":
    healthcheck()
e.g. the output of the platform prints will be the same if you call it in `healthcheck` or `log_platform_info` - it's just like calling nested python functions, they don't move to a different infra
whereas if you deployed the `log_platform_info` flow as a standalone deployment, and then did
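from prefect import flow
from prefect.deployments import run_deployment
from prefect.states import Completed


@flow(log_prints=True)
def healthcheck(message: str = "Hello Marvin!"):

    print(message)

    # triggers a separate flow run of the deployment on that deployment's own infra
    flow_run = run_deployment("log-platform-info/test")
    print(flow_run.state.result())

    return Completed(message="Healthcheck completed.")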
then you'd see the output of `log_platform_info` from the machine where the `log-platform-info/test` flow run executed (which may or may not be the same machine, as determined by the infra associated with this deployment definition)