Hwi Moon
01/10/2022, 9:00 AM
Suresh R
01/10/2022, 9:02 AM
Bruno Murino
01/10/2022, 9:51 AM
WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='api.prefect.io', port=443): Read timed out. (read timeout=15)")': /
It got that warning many times; then the job was considered dead, etc.
Does anyone know why this could have happened?
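The 15-second read timeout in that warning matches Prefect 1.x's default client request timeout, so if the API was merely slow, raising the timeout may help. A minimal sketch, assuming the PREFECT__CLOUD__REQUEST_TIMEOUT environment variable maps onto the cloud.request_timeout config key; treat that mapping as an assumption and check your config.toml:

import os

# Assumed mapping: PREFECT__CLOUD__REQUEST_TIMEOUT -> cloud.request_timeout
# (default 15 seconds in Prefect 1.x). Set it before prefect is imported.
os.environ["PREFECT__CLOUD__REQUEST_TIMEOUT"] = "60"

import prefect
print(prefect.config.cloud.request_timeout)  # should now read 60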
Henrietta Salonen
01/10/2022, 10:39 AM
Shivam Bhatia
01/10/2022, 11:15 AM
Prudhvi Kalakota
01/10/2022, 11:32 AM
Andy Waugh
01/10/2022, 11:42 AM
Henrietta Salonen
01/10/2022, 2:08 PM
Vamsi Reddy
01/10/2022, 3:03 PM
prefect.exceptions.AuthorizationError: [{'path': ['project'], 'message': 'AuthenticationError: Forbidden', 'extensions': {'code': 'UNAUTHENTICATED'}}]
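An UNAUTHENTICATED error on a project query usually means the client's API key is missing, expired, or tied to a different tenant. A minimal sketch for checking the credentials programmatically, assuming a Prefect 1.x Client with an api_key argument (the key below is a placeholder):

from prefect import Client

client = Client(api_key="<YOUR_API_KEY>")  # placeholder key
# If this raises the same AuthorizationError, the key itself is the problem.
print(client.graphql("query { project { id name } }"))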
Joshua S
01/10/2022, 3:05 PM
Alvaro Durán Tovar
01/10/2022, 3:18 PM
RunNamespacedJob
no success so far
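For context, a minimal RunNamespacedJob sketch in Prefect 1.x; the job body below is a hypothetical example, not the actual spec from this thread:

from prefect import Flow
from prefect.tasks.kubernetes import RunNamespacedJob

# Hypothetical Kubernetes Job manifest, for illustration only.
job_body = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "example-job"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "example", "image": "busybox", "command": ["echo", "hello"]}
                ],
                "restartPolicy": "Never",
            }
        },
    },
}

run_job = RunNamespacedJob(body=job_body, namespace="default")

with Flow("k8s-job-flow") as flow:
    run_job()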
01/10/2022, 3:25 PMimport prefect
from prefect import Flow, task
@task
def test_task():
return 'ok'
with Flow('test_flow') as flow:
test_task()
flow.register(project_name="other")
my CLI:
> prefect backend server
Backend switched to server
> prefect server create-tenant --name default
Tenant created with ID: 6bf4ef79-ddcb-4ea3-8ea2-4dcab05a375a
> prefect agent local start
[2022-01-10 15:12:15,508] INFO - agent | Starting LocalAgent with labels ['72f026a71863']
...
> python /home/testflow.py
Flow URL: http://localhost:8080/default/flow/bf9d5401-9034-4ec0-8e43-9149b0718d23
└── ID: 5e6af0a7-d5af-47bf-831b-b0a4fb4d3401
└── Project: other
└── Labels: ['72f026a71863']
when I open the URL directly, I can see the flow in the UI. I can run it via “quick run” and it runs.
however, the flow is not listed in my project, nor is the run visible under the flow (when accessed via the URL, I see no activity and no run history).
on the main dashboard, the run activity is shown, but it seems to have gone astray, since it is not assigned to the flow.
the “flows” tab shows “you have no flows in this project”
SETUP:
I created docker-compose.yaml via prefect server config and changed the hasura image to hasura/graphql-engine:latest, because the default hasura image does not work properly on my Mac M1 Pro.
I added another service from which I try to register and run flows
client:
  image: python:3.8.12-slim
  command: bash -c "apt-get update -y && apt-get install gcc -y && pip install prefect[dev] && prefect backend server && tail -f /dev/null"
  volumes:
    - ./client:/home # here I have my testflow.py
  networks:
    prefect-server: null
  environment:
    - PREFECT__SERVER__HOST=http://apollo
EDIT: added info on my setup
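One way to check where the registered flow actually landed is to query the server's GraphQL API directly. A sketch, assuming the standard flow/project schema of Prefect Server 1.x and a reachable apollo endpoint:

from prefect import Client

# Assumes `prefect backend server` is set in this environment.
client = Client()
result = client.graphql(
    """
    query {
      flow {
        id
        name
        project { id name }
      }
    }
    """
)
print(result)  # lists every flow with the project it was registered under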
Amichai Ben Ami
01/10/2022, 3:45 PM
storage = flows_storage(registry_url, image_name, tag)
storage = Docker(
    dockerfile="Dockerfile",
    image_tag=tag,
    registry_url=registry_url,
    image_name=image_name,
)
storage.build(push=False)
traceback attached in the thread.
any idea?
maybe timeout?
Any way to increase the timeout?
Thanks
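If the failure is a client-side timeout during the image build, one way to test that hypothesis is to run the same build through docker-py directly with a generous timeout. A sketch; registry_url, image_name, and tag mirror the variables above, and the values are placeholders:

import docker

registry_url, image_name, tag = "registry.example.com", "flows", "dev"  # placeholders

# docker-py's APIClient accepts an explicit HTTP timeout in seconds.
client = docker.APIClient(timeout=600)
for line in client.build(
    path=".",
    dockerfile="Dockerfile",
    tag=f"{registry_url}/{image_name}:{tag}",
    decode=True,
):
    print(line)  # streams build output, so slow steps are visible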
E Li
01/10/2022, 4:49 PM
Danny Vilela
01/10/2022, 5:36 PM
IntervalSchedule
for daily runs, if that helps.
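For reference, a minimal daily IntervalSchedule in Prefect 1.x looks like this (the start date is a placeholder that anchors the time of day each run fires):

from datetime import timedelta

import pendulum
from prefect import Flow
from prefect.schedules import IntervalSchedule

# Placeholder start date; runs fire every 24h from this anchor.
schedule = IntervalSchedule(
    start_date=pendulum.datetime(2022, 1, 10, 9, 0, tz="UTC"),
    interval=timedelta(days=1),
)

with Flow("daily-flow", schedule=schedule) as flow:
    ...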
Koby Kilimnik
01/10/2022, 5:39 PM
Koby Kilimnik
01/10/2022, 5:39 PM
Koby Kilimnik
01/10/2022, 5:40 PM
chelseatroy
01/10/2022, 6:24 PM
Daniel Komisar
01/10/2022, 7:18 PM
Ian Andres Etnyre Mercader
01/10/2022, 7:39 PM
Miroslav Rác
01/10/2022, 7:47 PM
Henrietta Salonen
01/10/2022, 9:08 PM
• nohup prefect agent docker start --label dev --env VAR1=<variable1> --env VAR2=<variable2> &
• Get the PIDs for the old agents with old environment variables: ps -ef | grep prefect
• Kill the old agents: kill <PID>
As this is not a very smooth approach, I’d like to hear how others have approached automating the Agent part when setting up CI processes. Any tips?
Aric Huang
01/10/2022, 11:30 PM
The zone 'projects/<project>/zones/<zone>' does not have enough resources available to fulfill the request
, and we've seen them take >10m for the node pool to finish scaling an instance up even when they're available. My understanding is that the Lazarus process will treat a flow that hasn't started running for more than 10m as a failure and trigger another flow run - however this is the behavior i've been seeing:
Flow A run -> Zone is out of resources -> Flow B run by Lazarus after 10m -> Flow C run by Lazarus after 10m -> After 30m, Lazarus marks the flow as failed ("A Lazarus process attempted to reschedule this run 3 times without success. Marking as failed.") -> Pods for flows A-C are still stuck in Pending state on Kubernetes and have to be manually deleted
Given this use case would you recommend disabling the Lazarus process altogether for these flows? The ideal behavior for us would be for the flow to wait until an instance can be scaled up, even if it takes a few hours. It would be nice if we could specify a time limit also.
Also, is it expected for there to be "zombie" Kubernetes pods/jobs left over in a case like this, and are there any recommended ways to deal with that? I'm not sure what would happen if resources suddenly became available after the Lazarus process stopped all the flows, but before we find them and manually clean them up - would they still run even though the flow has been failed in Prefect? Ideally once a flow is failed we'd like any pending pods/jobs for that flow to be deleted automatically, not sure if that's possible.
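On the zombie-pod question, a hedged cleanup sketch using the official kubernetes Python client; the prefect.io/flow_run_id label selector is an assumption about how the agent labels its pods, so verify with kubectl get pods --show-labels first:

from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

# Assumed label: the Kubernetes agent tags its pods with prefect.io/flow_run_id.
pods = v1.list_namespaced_pod(
    namespace="default",
    label_selector="prefect.io/flow_run_id",
    field_selector="status.phase=Pending",
)
for pod in pods.items:
    print(f"deleting {pod.metadata.name}")
    v1.delete_namespaced_pod(name=pod.metadata.name, namespace="default")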
Leon Kozlowski
01/10/2022, 11:46 PM
prefect register --project "<PROJECT_NAME>" --json flow.json --label dev --force
Collecting flows...
Processing 'flow.json':
Registering '<FLOW_NAME>'... Done
└── ID: <ID>
└── Version: 3
But when I run my flow, some of the logic that changed, specifically an output file naming convention, is not being used. I’m not sure if this has something to do with an image pull policy in my agent helm chart, or something that I am missing at registration time.
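If the agent is running a cached image, the pull policy is one thing to check. A sketch with Prefect 1.x's KubernetesRun; the image name is a placeholder, and image_pull_policy being accepted by KubernetesRun is an assumption to verify against your Prefect version:

from prefect import Flow
from prefect.run_configs import KubernetesRun

with Flow("my-flow") as flow:
    ...

flow.run_config = KubernetesRun(
    image="registry.example.com/flows:latest",  # placeholder
    image_pull_policy="Always",  # assumed argument; forces a fresh pull per run
)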
Kyle McChesney
01/10/2022, 11:47 PM
RegisterTaskDefinition
API call is being made with no parameters
Amogh Kulkarni
01/11/2022, 1:21 AM
Anh Nguyen
01/11/2022, 10:43 AM
Shivam Bhatia
01/11/2022, 10:44 AM
Failed to retrieve task state with error: ClientError([{'path': ['get_or_create_task_run_info'], 'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'locations': [{'line': 2, 'column': 101}], 'path': None}}}])
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/cloud/task_runner.py", line 154, in initialize_run
    task_run_info = self.client.get_task_run_info(
  File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 1798, in get_task_run_info
    result = self.graphql(mutation)  # type: Any
  File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 569, in graphql
    raise ClientError(result["errors"])
prefect.exceptions.ClientError: [{'path': ['get_or_create_task_run_info'], 'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'locations': [{'line': 2, 'column': 101}], 'path': None}}}]
Aqib Fayyaz
01/11/2022, 11:02 AM
The command '/bin/sh -c pip install "prefect[all_extras]"' returned a non-zero code: 1
Anna Geller
01/11/2022, 11:30 AM
RUN pip install "prefect[gcp]"
Aqib Fayyaz
01/11/2022, 11:32 AM
RUN pip install "prefect[dev,templates,viz,kubernetes,google,github]"
I included the github extra as my flow is stored on GitHub.
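For reference, a minimal GitHub storage sketch in Prefect 1.x (the repo and path are placeholders, and the access token is read from a Prefect Secret):

from prefect import Flow
from prefect.storage import GitHub

with Flow("my-flow") as flow:
    ...

flow.storage = GitHub(
    repo="my-org/my-repo",    # placeholder
    path="flows/my_flow.py",  # placeholder
    access_token_secret="GITHUB_ACCESS_TOKEN",
)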
Kevin Kho
01/11/2022, 2:37 PM