Pedro Machado
12/22/2021, 7:46 PM
prefect register --project radar -p radar/flows/li_company_flow.py --label prod
Collecting flows...
Processing 'radar/flows/li_company_flow.py':
Building `Docker` storage...
Error building storage:
Traceback (most recent call last):
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/cli/build_register.py", line 463, in build_and_register
storage.build()
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 308, in build
self._build_image(push=push)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 339, in _build_image
dockerfile_path = self.create_dockerfile_object(directory=tempdir)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 474, in create_dockerfile_object
f.write(flow_to_bytes_pickle(self._flows[flow_name]))
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/utilities/storage.py", line 177, in flow_to_bytes_pickle
cloudpickle.dumps(flow, protocol=4), newline=False
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 73, in dumps
cp.dump(obj)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 602, in dump
return Pickler.dump(self, obj)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 316, in _file_reduce
raise pickle.PicklingError(
_pickle.PicklingError: Cannot pickle files that are not opened for reading: a
Registering 'li_get_company_data'... Error
================== 0 registered, 1 errored ==================
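The trailing "a" in the PicklingError is the mode of the offending file object: cloudpickle refuses file handles that are not opened for reading, and "a" is append mode. A minimal sketch for hunting down such an object follows. It uses the standard-library pickle as a stand-in for cloudpickle (which rejects the same handle, though with a different exception), and find_unpicklable and the sample namespace are hypothetical names, not part of Prefect.

```python
import os
import pickle
import tempfile
import types


def find_unpicklable(namespace):
    """Return (name, error type) pairs for values that fail to pickle."""
    failures = []
    for name, value in namespace.items():
        if isinstance(value, types.ModuleType):
            # Skip modules: the standard pickle rejects them,
            # but cloudpickle can handle them.
            continue
        try:
            pickle.dumps(value, protocol=4)
        except Exception as exc:  # pickle raises several error types
            failures.append((name, type(exc).__name__))
    return failures


# Hypothetical namespace: one plain value and one file handle in append mode.
tmp_dir = tempfile.mkdtemp()
log_handle = open(os.path.join(tmp_dir, "sketch.log"), "a")
try:
    failures = find_unpicklable({"retries": 3, "log_handle": log_handle})
finally:
    log_handle.close()

print(failures)
```

Running the same check over vars() of the module that defines the flow would be the analogous step, with the caveat that the standard pickle can flag objects cloudpickle would accept.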
Anna Geller
Pedro Machado
12/22/2021, 8:35 PM
storage = Docker(
registry_url=os.getenv("ECR_REPO_URL"),
image_name="prefect-ecs-prod",
python_dependencies=[
"requests-cache~=0.8.1",
"tenacity~=8.0.1",
"ratelimiter~=1.2.0",
"loguru~=0.5.3",
"snowflake-connector-python'>=1.8.2,<2.5'",
],
files={
REPO_BASE_DIR / "radar/radar/linkedinapi.py": "/modules/radar/linkedinapi.py",
REPO_BASE_DIR
/ "radar/radar/retry_strategy.py": "/modules/radar/retry_strategy.py",
},
env_vars={"PYTHONPATH": "$PYTHONPATH:/modules/"},
)
I assign it to Flow(storage=storage) in the context manager block.
When I build it with:
built_storage = flow.storage.build(push=False)
print(f"{built_storage.flows=}")
I see this:
[2021-12-22 14:31:45] INFO - prefect.Docker | Building the flow's Docker storage...
Step 1/10 : FROM prefecthq/prefect:0.15.10-python3.8
---> 330a9f90a2be
Step 2/10 : ENV PYTHONPATH='$PYTHONPATH:/modules/' PREFECT__USER_CONFIG_PATH='/opt/prefect/config.toml'
---> Using cache
---> ec1dc84f2d16
Step 3/10 : RUN pip install pip --upgrade
---> Using cache
---> 80ecdf097086
Step 4/10 : RUN pip show prefect || pip install git+https://github.com/PrefectHQ/prefect.git@0.15.10#egg=prefect[all_orchestration_extras]
---> Using cache
---> 7d8c675fab87
Step 5/10 : RUN pip install requests-cache~=0.8.1 tenacity~=8.0.1 ratelimiter~=1.2.0 loguru~=0.5.3 snowflake-connector-python'>=1.8.2,<2.5' wheel
---> Using cache
---> b4e1830a52ed
Step 6/10 : RUN mkdir -p /opt/prefect/
---> Using cache
---> ae7547e9965d
Step 7/10 : COPY healthcheck.py /opt/prefect/healthcheck.py
---> Using cache
---> 303931044515
Step 8/10 : COPY linkedinapi.py /modules/radar/linkedinapi.py
---> Using cache
---> 40f2fa9107aa
Step 9/10 : COPY retry_strategy.py /modules/radar/retry_strategy.py
---> Using cache
---> ae26b25c34d8
Step 10/10 : RUN python /opt/prefect/healthcheck.py '[]' '(3, 8)'
---> Using cache
---> 32e27568f61d
Successfully built 32e27568f61d
Successfully tagged <account>.dkr.ecr.us-east-1.amazonaws.com/prefect-ecs-prod:2021-12-22t20-31-42-592614-00-00
built_storage.flows={}
Pedro Machado
12/22/2021, 8:35 PM
Anna Geller
files={
REPO_BASE_DIR / "radar/radar/linkedinapi.py": "/modules/radar/linkedinapi.py",
REPO_BASE_DIR
/ "radar/radar/retry_strategy.py": "/modules/radar/retry_strategy.py",
}
it should rather be:
files={f"{REPO_BASE_DIR}/radar/radar/linkedinapi.py": "/modules/radar/linkedinapi.py",
f"{REPO_BASE_DIR}/radar/radar/retry_strategy.py": "/modules/radar/retry_strategy.py",
}
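A minimal sketch of the conversion suggested above, assuming REPO_BASE_DIR is a pathlib.Path. The base directory value and the module_targets name are placeholders for illustration, and whether Docker storage actually requires string keys is not established in this thread.

```python
from pathlib import Path

REPO_BASE_DIR = Path("/tmp/example-repo")  # placeholder, not the real repo path

# Source paths relative to the repo root, mapped to their location in the image.
module_targets = {
    "radar/radar/linkedinapi.py": "/modules/radar/linkedinapi.py",
    "radar/radar/retry_strategy.py": "/modules/radar/retry_strategy.py",
}

# Build the mapping with plain string keys instead of Path objects.
files = {str(REPO_BASE_DIR / rel): target for rel, target in module_targets.items()}

print(files)
```

The resulting dictionary can be passed as files=files; the only difference from the original call is the key type.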
Anna Geller
Successfully built 32e27568f61d
Successfully tagged <account>.dkr.ecr.us-east-1.amazonaws.com/prefect-ecs-prod:2021-12-22t20-31-42-592614-00-00
Anna Geller
storage.add_flow(flow)
and then:
built_storage = flow.storage.build(push=False)
print(f"{built_storage.flows=}")
Pedro Machado
12/22/2021, 8:46 PM
REPO_BASE_DIR is a Path object.
When I add the flow, I get the error I was getting before. I have to go now but will check this later. Thanks for your help!
Traceback (most recent call last):
File "/home/pedro/clients/thrive/python_jobs/radar/radar/flows/li_company_flow.py", line 226, in <module>
built_storage = flow.storage.build(push=False)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 308, in build
self._build_image(push=push)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 339, in _build_image
dockerfile_path = self.create_dockerfile_object(directory=tempdir)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/storage/docker.py", line 474, in create_dockerfile_object
f.write(flow_to_bytes_pickle(self._flows[flow_name]))
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/prefect/utilities/storage.py", line 177, in flow_to_bytes_pickle
cloudpickle.dumps(flow, protocol=4), newline=False
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 73, in dumps
cp.dump(obj)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 602, in dump
return Pickler.dump(self, obj)
File "/home/pedro/.pyenv/versions/3.8.10/envs/radar/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 316, in _file_reduce
raise pickle.PicklingError(
_pickle.PicklingError: Cannot pickle files that are not opened for reading: a
Anna Geller
Pedro Machado
12/22/2021, 8:55 PM
Pedro Machado
12/23/2021, 2:50 AM
logger object from loguru at the top of the Python file where the flow is defined. I got it to work by moving the import inside of the task. Is there a better approach?
Anna Geller
stored_as_script=True).
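The workaround Pedro describes, moving the import inside the task, can be sketched as below. The standard-library logging module stands in for loguru so the snippet runs without extra packages, and get_company_data is a hypothetical function rather than the flow's actual task. The presumed reason it helps (not confirmed in the thread) is that an object imported inside the function body is not a module-level global for the pickler to capture.

```python
def get_company_data(company_id):
    # Deferred import: the logger is looked up when the function runs,
    # so the module that defines the flow holds no logger object.
    import logging

    log = logging.getLogger("radar.sketch")
    log.info("fetching company %s", company_id)
    return {"company_id": company_id}


result = get_company_data(42)
print(result)
```

The trade-off is that the import statement is repeated in every task that logs; script-based storage, which Anna's reply points to with stored_as_script=True, avoids pickling the flow in the first place.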