Gagan Singh Saluja
We create several flow runs (using ECSRun) and then poll each with
to know when all the flows have completed. When one of the flows fails (and Prefect starts a new flow run to take its place), how can we check when the rerun is complete (and whether it succeeded)?
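A polling loop along these lines might work. This is a hedged sketch, not the canonical approach: it assumes a Prefect 1.x-style client where `client.get_flow_run_info(run_id)` returns an object whose `.state` exposes `is_finished()` (that method does exist on Prefect 1.x states). The client is passed in as an argument so the loop itself stays easy to test; `wait_for_runs` and `poll_interval` are names invented for this sketch.

```python
import time

def wait_for_runs(client, flow_run_ids, poll_interval=30):
    """Poll each flow run until every one reaches a finished state.

    `client` is assumed to behave like Prefect 1.x's prefect.client.Client:
    client.get_flow_run_info(run_id).state is a State object with
    is_finished(). Returns a dict mapping run id -> final state.
    """
    pending = set(flow_run_ids)
    final_states = {}
    while pending:
        for run_id in list(pending):
            state = client.get_flow_run_info(run_id).state
            if state.is_finished():
                # Record the terminal state and stop polling this run.
                final_states[run_id] = state
                pending.discard(run_id)
        if pending:
            time.sleep(poll_interval)
    return final_states
```

Note that a rerun Prefect creates to replace a failed run gets a new flow run id, so the caller would still need to discover that new id (e.g. by querying flow runs for the same flow) and poll it separately.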
We see the log message

No heartbeat detected from the remote task; marking the run as failed.

For 20+ minutes following that log message, fetching the flow run state from Prefect Cloud still shows

<Running: "Running flow.">

Ideally, as soon as the flow run is marked as failed, the state from Prefect Cloud would say Failed. Suggestions?
Does Prefect support graceful shutdown? We're running flows as Kubernetes jobs on EC2 spot instances, which get terminated from time to time. Say the job starts a pod on a node that terminates: Kubernetes will quickly reschedule it and spawn a new pod, but that pod does nothing, because the state of the tasks is Running even though they are not actually running (the original pod is dead). We then need to wait for the heartbeat to time out before Prefect reschedules them, which takes a long time. Instead, setting something like
on tasks that are Running when a SIGTERM is received would be nice.
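One way to approximate this today from inside the flow process is to install a SIGTERM handler that reports the failure before the pod dies, rather than waiting for the heartbeat timeout. The sketch below is deliberately generic: the actual state update is injected as a callback, because the exact Prefect wiring (e.g. calling `Client().set_flow_run_state` with a `Failed` state in Prefect 1.x) is an assumption, not something this snippet verifies. `install_sigterm_handler` and `mark_failed` are names invented for the sketch.

```python
import signal

def install_sigterm_handler(mark_failed):
    """Run `mark_failed()` when the process receives SIGTERM.

    On a spot-instance reclaim, Kubernetes sends SIGTERM to the pod's
    processes before killing them, so this gives the run a chance to
    report its own failure immediately. The previous handler is returned
    so it can be restored or chained if needed.
    """
    def handler(signum, frame):
        # Keep the handler minimal: signal handlers can interrupt
        # arbitrary bytecode, so delegate the real work to the callback.
        mark_failed()
    return signal.signal(signal.SIGTERM, handler)
```

Note that `signal.signal` must be called from the main thread, and Kubernetes only waits for `terminationGracePeriodSeconds` after SIGTERM, so the callback has to be quick.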
My flow looks like this:

ids = get_ids()
step_1 = do_step_1.map(ids)
step_2 = do_step_2.map(step_1)
step_3 = do_step_3.map(step_2)

The number of ids retrieved can vary by a few orders of magnitude and I cannot predict it. The issue I see is that while the flow runs, the memory footprint keeps increasing, which sometimes results in an OOM kill of the pod running the flow. Is there any way to keep the memory footprint near constant with regard to the number of executed mapped tasks in the flow? I understand that the initial mapping requires a bunch of memory and that there is no way around it. I am running on K8s, using a LocalDaskExecutor (threaded), and had hoped that depth-first execution would mean there would be some amount of garbage collecting of fully executed branches. Would setting up a
in the mapped tasks help in any way? I tried manually releasing memory inside the task code (with
mostly) but saw no improvement. Another solution I see would be to have steps 1-3 executed in their own separate flow, but that means we would spin up a bunch of ephemeral pods and lengthen the flow overall, I suppose? Thanks!
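One pattern that can bound the memory the tasks themselves keep alive, independent of Prefect internals, is to have each mapped step write its (large) output to disk and pass only the file path downstream. The function names mirror the snippet above, but the bodies are purely illustrative; in the flow they would be wrapped with @task and chained with .map exactly as before.

```python
import json
from pathlib import Path

def do_step_1(i, workdir):
    # Write the potentially large intermediate result to disk and return
    # only its path, so the mapped results held in memory stay tiny.
    path = Path(workdir) / f"step1_{i}.json"
    path.write_text(json.dumps({"id": i, "value": i * 2}))
    return str(path)

def do_step_2(path):
    # Read the upstream file, transform it, write a new file, and again
    # hand only the path to the next mapped step.
    data = json.loads(Path(path).read_text())
    data["value"] += 1
    out = Path(path).with_name(Path(path).name.replace("step1", "step2"))
    out.write_text(json.dumps(data))
    return str(out)
```

This only bounds the data the task code itself retains; whether Prefect's own result handling still holds references to each mapped return value is a separate question, but with paths those references are at least small strings.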
which I think led to double runs. From what I understand,
shouldn't be a terminal state?
from analytics_toolbox.analytics_toolbox import *
from analytics_toolbox.analytics_toolbox import PendoUtils as pendo_utils
from analytics_toolbox.analytics_toolbox import SnowflakeUtils as snowflake_utils
from prefect import task, Flow
from prefect.utilities.notifications import slack_notifier

@task()
def main():
    ......

with Flow("pendo_featureEvents") as flow:
    main()

flow.register(project_name="pendo_api")
file. But as I access it at
, the page displays but then redirects to the getting-started page. So basically I want to access my Prefect instance from any other machine.
[server]
  [server.ui]
  apollo_url = "http://ip.add.re.ss:4200/graphql"