DL
03/10/2022, 1:16 PM
Brian Phillips
03/10/2022, 2:35 PM
An error occurred (ThrottlingException) when calling the RegisterTaskDefinition operation (reached max retries: 2): Rate exceeded
Constantino Schillebeeckx
03/10/2022, 3:29 PM
import time

from prefect import task

@task
def sleep_5():
    print(f'sleeping 5')
    time.sleep(5)
    print(f'done sleeping 5')

@task
def sleep_x(x):
    print(f'sleeping {x}')
    time.sleep(x)
    print(f'done sleeping {x}')

# DHFlow: presumably a custom Flow wrapper
with DHFlow("foo") as flow:
    sleep_5()
    sleep_x.map([7, 8])
When I execute the above, sleep_5 is run first; only after it finishes will sleep_x run.
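If the two tasks are meant to run concurrently, a minimal sketch of one way to get that, assuming Prefect 1.x and that DHFlow behaves like a standard Flow, is to attach a parallel executor, since the default LocalExecutor runs tasks one at a time:

from prefect.executors import LocalDaskExecutor

# Independent tasks only run in parallel if the flow uses a parallel executor;
# the default LocalExecutor executes them sequentially.
flow.executor = LocalDaskExecutor(scheduler="threads")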
Pedro Machado
03/10/2022, 4:26 PM
No heartbeat detected from the remote task; retrying the run. This will be retry 1 of 2.
However, the last message was written 12 hours ago. It does not look like the flow got retried at all.
Could someone help me figure out what may have happened? I am using ECS to run the flows.
The flow run is fed29ba9-66a6-4f0b-aa69-cf9db908bd58
I went ahead and canceled it but it does not offer me the option to restart it. How can I restart it and preserve the same flow run ID so that the tasks that succeeded are not executed again? Ideally, I'd rely on the task state but if they need to run again, I am hoping that caching will help avoid doing the work again. The cache key is based on the flow_run_id.
Thanks!
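One way to get the cache-per-flow-run behavior described here in Prefect 1.x is target-based caching rather than cache_key; a minimal sketch, with a hypothetical expensive_query task and LocalResult standing in for whatever result storage the ECS flows actually use:

from prefect import task
from prefect.engine.results import LocalResult  # S3Result etc. would be typical on ECS

# Hypothetical task: if a result already exists at the templated target for this
# flow_run_id, the task enters a Cached state instead of running again.
@task(
    result=LocalResult(dir="/tmp/prefect-results"),
    checkpoint=True,
    target="{flow_run_id}/{task_name}.prefect",
)
def expensive_query():
    ...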
Olivér Atanaszov
03/10/2022, 4:45 PM
n_workers agents as kubernetes jobs, similarly to this:
# Agent flow
with Flow("run-agent") as flow_agent:
    ShellTask(agent_command)  # launch agent computing tasks

@task
def run_agent(worker_id):
    run_agent_flow_id = create_flow_run.run("run-agent",
                                            name=f"run-agent_#{worker_id}")
    return run_agent_flow_id

# Main flow
with Flow("main") as flow:
    n_workers = Parameter("n_workers", default=2)
    worker_ids = get_worker_ids(n_workers)  # [0, 1]
    run_agent.map(worker_id=worker_ids)
When the agents finish their tasks, for some reason the Kubernetes jobs are not terminating. Inside the job's container the agent's process apparently terminates, but I see prefect execute flow-run and prefect heartbeat flow-run -i ... being stuck.
Ling Chen
03/10/2022, 5:18 PM
Sarah Floris
03/10/2022, 6:49 PM
David Yang
03/10/2022, 7:02 PM
Gabriel Milan
03/10/2022, 8:13 PM
from datetime import datetime, timedelta

from prefect.schedules import Schedule
from prefect.schedules.clocks import IntervalClock

bot_schedule = Schedule(
    clocks=[
        IntervalClock(
            interval=timedelta(hours=1),
            start_date=datetime(2021, 1, 1),
            labels=[
                ...,
            ],
            parameter_defaults={
                ...,
            },
        ),
    ],
)
At first, we thought it was a UI issue. But then we confirmed that the flow wasn't being scheduled at all for this time window. Any ideas?
Stephen Herron
03/10/2022, 9:38 PM
Rio McMahon
03/10/2022, 10:33 PM
No heartbeat detected from the flow run; marking the run as failed.
The SQL query takes ~45 seconds to run on my local machine, so I am curious what interval the zombie killer process polls at and if you have any suggestions for debugging this interruption. The flow/associated scripts run fine on my local machine. Thanks.
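A sketch of one commonly suggested mitigation for heartbeat-related failures, assuming Prefect 1.x and that the flow's run config can carry environment variables (UniversalRun here is a stand-in for whichever run config is actually in use):

from prefect.run_configs import UniversalRun

# Run heartbeats in a thread instead of a subprocess; "off" disables them
# entirely, at the cost of losing zombie detection for this flow.
flow.run_config = UniversalRun(
    env={"PREFECT__CLOUD__HEARTBEAT_MODE": "thread"}
)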
Brett Naul
03/11/2022, 1:02 AM
kevin
03/11/2022, 4:08 AM
select * from table
to snowflake using a SnowflakeQuery task. It appears that this task uses fetchall() to return this data into memory: https://github.com/PrefectHQ/prefect/blob/5d2732a30563591410cb11fe0f7e7dfe65cc5669/src/prefect/tasks/snowflake/snowflake.py#L186
I expect that this causes performance issues with extremely large queries, so I am wondering what Prefect Cloud's tolerance for this is. Ideally, I think it would be preferable to lazy-load query results and/or allow for query pagination? Perhaps there's an architecture limitation I'm overlooking? I'd appreciate any insight 🙂
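One way to avoid materializing the whole result set is to page through it, sketched here against snowflake-connector-python directly rather than Prefect's SnowflakeQuery task (process_in_batches and process_batch are hypothetical names):

import snowflake.connector

# Page through a large result set with fetchmany() instead of pulling everything
# into memory with fetchall(); each batch is handed to a callback (e.g. written
# to a file or object store) before the next one is fetched.
def process_in_batches(query, process_batch, batch_size=10_000, **connect_kwargs):
    conn = snowflake.connector.connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(query)
            while True:
                rows = cursor.fetchmany(batch_size)
                if not rows:
                    break
                process_batch(rows)
        finally:
            cursor.close()
    finally:
        conn.close()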
satyapal reddy
03/11/2022, 6:00 AM
satyapal reddy
03/11/2022, 6:01 AM
satyapal reddy
03/11/2022, 6:01 AM
Liezl Puzon
03/11/2022, 6:20 AM
J. Martins
03/11/2022, 9:39 AM
ale
03/11/2022, 10:36 AMwith raw_data as
(
select "start_time"::date as task_run_date,
state,
count(id) as task_run_per_date_and_state
from public.task_run
group by 1,2
)
select state,
avg(task_run_per_date_and_state)::int as avg_task_runs_per_day
from raw_data
where state = 'Success'
group by 1
order by 1
Vipul
03/11/2022, 11:27 AM
Lalit Pagaria
03/11/2022, 12:09 PM
Jean-Baptiste Six
03/11/2022, 12:42 PM
Adi Gandra
03/11/2022, 3:32 PM
"cpu_request": "4",
When I describe the pod that is spun up:
Requests:
  cpu:     6
  memory:  16Gi
Why is this happening on the restarted flow run that I'm trying?
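For context, a sketch of where a "cpu_request": "4" setting typically lives, assuming Prefect 1.x with a KubernetesRun run config; the memory value is copied from the describe output above and is only illustrative:

from prefect.run_configs import KubernetesRun

# Hypothetical run config matching the request above; a newly created flow run
# with this config should produce a job spec requesting 4 CPUs.
flow.run_config = KubernetesRun(cpu_request="4", memory_request="16Gi")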
Constantino Schillebeeckx
03/11/2022, 3:45 PM
StartFlowRun(...).run() - when this executes in the cloud I'm seeing: 🧵
Chris Reuter
03/11/2022, 4:11 PM
Andreas Nord
03/11/2022, 4:34 PM
flow.run_configs = DockerRun(image="myrepo/image")
flow.register(project_name)
But it shows up incorrectly as UniversalRun in the Cloud UI.
If I add the run config when I define the flow, it works perfectly:
with Flow("myflow", run_config=DockerRun(image="myrepo/image")) as flow:
Any suggestion as to what I am doing wrong in the first approach would be appreciated.
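A sketch of the first approach using the attribute Prefect 1.x actually reads, run_config (singular); run_configs is not a Flow attribute, so assigning to it is silently ignored and the registered flow falls back to the default UniversalRun:

from prefect import Flow
from prefect.run_configs import DockerRun

flow = Flow("myflow")

# run_config (singular) is what Prefect picks up at registration time;
# flow.run_configs just creates an unused attribute.
flow.run_config = DockerRun(image="myrepo/image")
flow.register(project_name="my-project")  # "my-project" is a placeholder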
Tim Enders
03/11/2022, 8:54 PM
Bradley Hurley
03/11/2022, 9:25 PM
David Beck
03/11/2022, 9:27 PM
kensuke matsuura
03/12/2022, 10:41 AM
kensuke matsuura
03/12/2022, 10:41 AM
Kevin Kho
03/12/2022, 4:07 PM