Hui Zheng
10/13/2020, 9:10 PM
I'm getting an error at the Cloudpickle serialization check step of the health-check during flow.register(). Could you help?
The error message is below:
Beginning health checks...
System Version check: OK
Traceback (most recent call last):
File "/opt/prefect/healthcheck.py", line 147, in <module>
flows = cloudpickle_deserialization_check(flow_file_paths)
File "/opt/prefect/healthcheck.py", line 40, in cloudpickle_deserialization_check
flows.append(cloudpickle.load(f))
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
[Previous line repeated 994 more times]
RecursionError: maximum recursion depth exceeded
Removing intermediate container 98613182a8f2
The command '/bin/sh -c python /opt/prefect/healthcheck.py '["/opt/prefect/flows/20-06-0-standard-ag-stage.prefect"]' '(3, 8)'' returned a non-zero code: 1
Traceback (most recent call last):
File "build_and_register.py", line 174, in <module>
flow.register(
File "/usr/local/lib/python3.8/site-packages/prefect/core/flow.py", line 1588, in register
registered_flow = client.register(
File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 734, in register
serialized_flow = flow.serialize(build=build) # type: Any
File "/usr/local/lib/python3.8/site-packages/prefect/core/flow.py", line 1439, in serialize
storage = self.storage.build() # type: Optional[Storage]
File "/usr/local/lib/python3.8/site-packages/prefect/environments/storage/docker.py", line 346, in build
self._build_image(push=push)
File "/usr/local/lib/python3.8/site-packages/prefect/environments/storage/docker.py", line 412, in _build_image
raise ValueError(
ValueError: Your docker image failed to build! Your flow might have failed one of its deployment health checks - please ensure that all necessary files and dependencies have been included.
Please see more context and the details in the thread.
flow.storage.build(push=True) also ran the health check, and it passed. However, the health check failed at the second line of code, flow.register():
flow.storage.build(push=True)
flow.register(
    project_name='release_{}_{}'.format(target_release_u, environment),
    # we will manually turn it on, for now.
    set_schedule_active=False
)
Jim Crist-Harif
10/13/2020, 9:17 PM
What happens if you pass build=False to register?
Hui Zheng
10/13/2020, 9:19 PM
I passed build=False, and it worked!
Step 21/21 : RUN python /opt/prefect/healthcheck.py '[]' '(3, 8)'
---> Running in e0bb97c170de
Beginning health checks...
System Version check: OK
Cloudpickle serialization check: OK
Result check: OK
Environment dependency check: OK
All health checks passed.
Removing intermediate container e0bb97c170de
---> 5684cb902ac8
Successfully built 5684cb902ac8
Successfully tagged us.gcr.io/semios-data-platform/stage/prefect-scheduler/standard_ag:20.06.0
[2020-10-13 21:18:35] INFO - prefect.Docker | Pushing image to the registry...
Pushing [==================================================>] 8.192kB/5.222kB
Flow URL: https://cloud.prefect.io/semios/flow/5562476d-68be-4d28-abfd-9ef05d24b086
└── ID: 5562476d-68be-4d28-abfd-9ef05d24b086
└── Project: release_20_06_0_stage
└── Labels: ['20.06.0', 'stage']
Jim Crist-Harif
10/13/2020, 9:23 PM
If you call storage.build yourself, the flow is already built; calling it again in flow.register isn't necessary. If you don't have a good reason for manually calling flow.storage.build() above, then I'd recommend removing the flow.storage.build call entirely and letting flow.register handle it.
Hui Zheng
10/13/2020, 9:26 PM
# flow.storage.build(push=True)
flow.register(
    project_name='release_{}_{}'.format(target_release_u, environment),
    # we will manually turn it on, for now.
    set_schedule_active=False
)
error:
Beginning health checks...
System Version check: OK
Traceback (most recent call last):
File "/opt/prefect/healthcheck.py", line 147, in <module>
flows = cloudpickle_deserialization_check(flow_file_paths)
File "/opt/prefect/healthcheck.py", line 40, in cloudpickle_deserialization_check
flows.append(cloudpickle.load(f))
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
File "/usr/local/lib/python3.8/site-packages/environs/__init__.py", line 334, in __getattr__
return functools.partial(self.__custom_parsers__[name], self)
[Previous line repeated 994 more times]
RecursionError: maximum recursion depth exceeded
Removing intermediate container f006268f1451
Jim Crist-Harif
10/13/2020, 9:31 PM
Hui Zheng
10/13/2020, 10:54 PM
With register(build=False):
flow.storage.build(push=True)
flow.register(
    project_name='release_{}_{}'.format(target_release_u, environment),
    # we will manually turn it on, for now.
    set_schedule_active=False,
    build=False
)
Though the flow is deployed to Prefect Cloud, it fails when it runs, with the error below:
[13 Oct 2020 3:38pm]: Failed to load and execute Flow's environment: KeyError('20_06_0_standard_ag_stage')
NOTE: I think 20_06_0_standard_ag_stage is the flow name. I also noticed that the flow is registered with environment=local_environment, which it has been since the very beginning and which had been working, but I wonder if it's correct because our Prefect Cloud agent is hosted on a k8s cluster.
The problem seems related to how I use environs in my code.
Previously, I instantiated env inside the task:
from environs import Env
from prefect import Flow, task

@task(
    max_retries=2
)
def dbt_build_run_commands(models: list) -> list:
    env = Env()
    forecast = env.bool('DBT_RUN_FORECAST', False)

with Flow('Data Platform Flow') as flow:
    # define flow ...
It worked fine for all three: flow.storage.build(push=True), the local flow run, and registering to Prefect Cloud.
After the change, I instantiate env globally outside the task, because I have other tasks that also use env:
from environs import Env
from prefect import Flow, task

env = Env()

@task(
    max_retries=2
)
def dbt_build_run_commands(models: list) -> list:
    forecast = env.bool('DBT_RUN_FORECAST', False)

with Flow('Data Platform Flow') as flow:
    # define flow ...
It works for flow.storage.build(push=True) and the local flow run. However, when it comes to registering to Prefect Cloud, it fails. If I switch env back to the previous way, it works for all three. Why is that? Is it related to cloudpickle?
P.S. The code above is in a file called flow.py. The build-and-register code below is in a separate .py file that imports flow from flow.py:
from flow import flow
flow.environment = LocalEnvironment(labels=labels)
flow.storage = Docker( ... )
flow.register(.. )
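Aside, not part of the original thread: the traceback above shows the recursion looping inside a `__getattr__` while `cloudpickle.load` runs. One general Python pitfall that produces this shape of failure is a `__getattr__` that reads an instance attribute which does not exist yet on a freshly unpickled object. The sketch below reproduces that pitfall with the standard-library `pickle` and two hypothetical stand-in classes (`FragileParsers`, `GuardedParsers`). It is an assumption that environs hit this exact mechanism here; the classes are illustrations, not environs code.

```python
import pickle


class FragileParsers:
    """Hypothetical stand-in: __getattr__ reads an instance attribute directly."""

    def __init__(self):
        self._parsers = {"as_int": int}

    def __getattr__(self, name):
        # Called only when normal lookup fails. While unpickling, the instance
        # exists before its state is restored, so self._parsers is missing and
        # this line re-enters __getattr__ until the recursion limit is hit.
        try:
            return self._parsers[name]
        except KeyError:
            raise AttributeError(name) from None


class GuardedParsers(FragileParsers):
    """Same idea, but reads through __dict__ so lookup cannot re-enter."""

    def __getattr__(self, name):
        parsers = self.__dict__.get("_parsers", {})
        try:
            return parsers[name]
        except KeyError:
            raise AttributeError(name) from None


def survives_round_trip(obj) -> bool:
    """Return True if obj can be pickled and loaded back without recursing."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except RecursionError:
        return False


if __name__ == "__main__":
    print("fragile:", survives_round_trip(FragileParsers()))
    print("guarded:", survives_round_trip(GuardedParsers()))
```

If this is what is happening, it would also be consistent with the observation that creating the object inside the task avoids the error, since then no such object needs to be serialized along with the flow.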
Jim Crist-Harif
10/15/2020, 9:26 PM
Env is pickleable, either method should work fine, so I'm surprised at the results you're getting here.
Hui Zheng
10/15/2020, 9:28 PM
flow.storage.build(push=True) works, but the storage build inside register() doesn't.
Jim Crist-Harif
10/15/2020, 9:29 PM