Daniel Sääf
05/25/2022, 12:29 PM
_pickle.PicklingError: Pickling client objects is explicitly not supported.
Clients have non-trivial state that is local and unpickleable.
After reading up on this, it sounds like this is due to some other exception that results in an unpicklable object. Is that right? Is there any way I can reach the underlying exception?
Traceback and code in thread:
12:21:52.421 | INFO | prefect.engine - Created flow run 'rugged-capuchin' for flow 'test-flow'
12:21:52.421 | INFO | Flow run 'rugged-capuchin' - Using task runner 'ConcurrentTaskRunner'
12:21:52.756 | ERROR | Flow run 'rugged-capuchin' - Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 445, in orchestrate_flow_run
    with partial_flow_run_context.finalize(
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/collections.py", line 325, in finalize
    return self.model_cls(**self.fields, **kwargs)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1038, in pydantic.main.validate_model
  File "pydantic/fields.py", line 857, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1074, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 1121, in pydantic.fields.ModelField._apply_validators
  File "pydantic/class_validators.py", line 313, in pydantic.class_validators._generic_validator_basic.lambda12
  File "pydantic/main.py", line 679, in pydantic.main.BaseModel.validate
  File "pydantic/main.py", line 605, in pydantic.main.BaseModel._copy_and_set_values
  File "/usr/local/lib/python3.10/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.10/copy.py", line 231, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.10/copy.py", line 161, in deepcopy
    rv = reductor(4)
  File "/usr/local/lib/python3.10/site-packages/google/cloud/client/__init__.py", line 194, in __getstate__
    raise PicklingError(
_pickle.PicklingError: Pickling client objects is explicitly not supported.
Clients have non-trivial state that is local and unpickleable.
12:21:52.926 | ERROR | Flow run 'rugged-capuchin' - Finished in state Failed('Flow run encountered an exception.')
This happens when I'm trying to run a simple test flow:
from prefect import task, flow, get_run_logger

@flow
def test_flow():
    print("flow starting")
    res1 = running_task1()
    return running_task2(res1)

@task
def running_task1():
    logger = get_run_logger()
    logger.info("Running task1")
    return "task 1"

@task
def running_task2(input):
    logger = get_run_logger()
    logger.info(f"Running task2 with {input}")
    return "task 2"

if __name__ == "__main__":
    test_flow()
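For what it's worth, the traceback shows the mechanism directly: deepcopying the flow run context falls back to the pickle protocol (the `rv = reductor(4)` frame in copy.py), and the Google Cloud client's `__getstate__` raises on purpose. A minimal stand-in reproduces this without any cloud dependency; the `Client` class below is a hypothetical stand-in, not the real google.cloud class:

```python
import copy
import pickle

class Client:
    """Stand-in mimicking google.cloud.client.Client, which refuses pickling."""
    def __getstate__(self):
        raise pickle.PicklingError(
            "Pickling client objects is explicitly not supported."
        )

state = {"storage": Client()}  # e.g. a context object holding a client
try:
    copy.deepcopy(state)       # deepcopy falls back to the pickle protocol
except pickle.PicklingError as exc:
    print(f"PicklingError: {exc}")
```

So the PicklingError is not masking some other exception; it is raised deliberately by the client the moment anything tries to copy or serialize it. The real question is why a client object ends up inside the state being deep-copied.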
Anna Geller
05/25/2022, 12:50 PM
14:49:47.823 | INFO | prefect.engine - Created flow run 'vivacious-parrot' for flow 'ex-flow'
14:49:47.823 | INFO | Flow run 'vivacious-parrot' - Using task runner 'ConcurrentTaskRunner'
flow starting
14:49:48.344 | INFO | Flow run 'vivacious-parrot' - Created task run 'running_task1-c5e0b2aa-0' for task 'running_task1'
14:49:48.564 | INFO | Flow run 'vivacious-parrot' - Created task run 'running_task2-8707b00f-0' for task 'running_task2'
14:49:48.645 | INFO | Task run 'running_task1-c5e0b2aa-0' - Running task1
14:49:48.810 | INFO | Task run 'running_task1-c5e0b2aa-0' - Finished in state Completed()
14:49:48.976 | INFO | Task run 'running_task2-8707b00f-0' - Running task2 with task 1
14:49:49.141 | INFO | Task run 'running_task2-8707b00f-0' - Finished in state Completed()
14:49:49.311 | INFO | Flow run 'vivacious-parrot' - Finished in state Completed('All states completed.')
Process finished with exit code 0
The output of prefect version might be helpful.
If nothing else helps, I'd suggest:
1. Creating a new virtual environment on your agent machine
2. Running the flow locally on that machine to validate it's working
3. Then recreating the deployment and, while doing so, explicitly pointing it at the new virtual environment, e.g.:
DeploymentSpec(
    name="ex-flow-dev",
    flow=your_flow_name,
    flow_runner=SubprocessFlowRunner(condaenv="yourVenv"),
    schedule=IntervalSchedule(interval=timedelta(minutes=15)),
)
Daniel Sääf
05/25/2022, 1:06 PM
root@agent:/app# prefect version
Version: 2.0b4
API version: 0.3.1
Python version: 3.10.4
Git commit: 344acb00
Built: Thu, May 12, 2022 5:28 PM
OS/Arch: linux/x86_64
Profile: default
Server type: hosted
Kevin Kho
05/25/2022, 2:30 PM
Daniel Sääf
05/26/2022, 6:08 AM
❯ prefect version
Version: 2.0b5
API version: 0.3.1
Python version: 3.10.4
Git commit: 7b27c7cf
Built: Tue, May 17, 2022 4:54 PM
OS/Arch: darwin/x86_64
Profile: prod
Server type: hosted
First of all, the issue @Kevin Kho is mentioning seems to be happening for the ConcurrentTaskRunner as well. In my example I've only used the default task runner, and I get the same error as mentioned in the issue.
So basically, I have the same flow as defined above. If I point the Prefect storage at local storage, everything runs fine.
If I change my default storage to Google Cloud Storage, it fails with the error message printed above.
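As an aside, the usual way to avoid PicklingError with cloud clients in user code is to never hold the client in any state that gets copied or serialized: pass only picklable data (like a bucket name) and build the client inside the function that needs it. Whether that helps here is unclear, since in this thread the failure appears to be inside Prefect's own storage machinery rather than user code, but the pattern is worth knowing. A sketch using a hypothetical stand-in `Client` class (not the real google.cloud one):

```python
import copy
import pickle

class Client:
    """Stand-in for an unpicklable cloud client (like google.cloud.storage.Client)."""
    def __getstate__(self):
        raise pickle.PicklingError(
            "Pickling client objects is explicitly not supported."
        )

# Anti-pattern: parameters holding the client itself.
# copy.deepcopy({"client": Client()}) would raise PicklingError.

# Pattern: pass only picklable data and build the client locally,
# so the surrounding machinery never has to copy or pickle it.
def download(bucket_name: str) -> str:
    client = Client()  # created inside the function, never crosses a pickle boundary
    return f"downloaded from {bucket_name}"

params = {"bucket_name": "sc_prefect-3"}
copy.deepcopy(params)      # fine: only strings
print(download(**params))  # downloaded from sc_prefect-3
```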
To verify that I have the right permissions on the default bucket, I added the following code to test_flow.py.
"sc_prefect-3" is the bucket I've set as default storage.
if __name__ == "__main__":
    c = storage.Client()
    b = c.get_bucket("sc_prefect-3")
    print(b)
    blob_list = list(c.list_blobs("sc_prefect-3"))
    blob = blob_list[0]
    with open('file-to-download-to', 'wb') as file_obj:
        c.download_blob_to_file(blob, file_obj)
    print("download completed")
    blob = b.blob('myfile')
    blob.upload_from_filename('file-to-download-to')
    print("upload completed")
    test_flow()
The list, read, and write operations all work against the bucket, so it shouldn't be a permission issue, but rather something going wrong when running the flow against Google Cloud Storage.
Is there anything I'm missing when it comes to permissions/setup?
It would be great to be able to use Google Cloud Storage, so let me know if I can do anything else to troubleshoot this further.
Kevin Kho
05/30/2022, 4:35 PM
Daniel Sääf
05/30/2022, 4:50 PM
Kevin Kho
05/30/2022, 6:05 PM