Babaian Daniel
03/20/2025, 1:42 PM
We have been using a DaskTaskRunner that spun up a local Dask cluster, but we are transitioning to a separate Dask cluster.
The Setup:
• Most of the task and flow results are protobuf messages.
• I'm using caching to reuse outputs between runs.
The Issue:
When caching is enabled and I retrieve cached results, the outputs are unexpectedly being converted to DataFrames. This is causing issues because I expect them to remain in their original protobuf message format.
What I’ve Tried:
• Switching from a local DaskTaskRunner to a remote Dask cluster.
• Ensuring that serialization is handled properly for protobuf messages.
My Questions:
1. Is there a known issue or behavior in Prefect 2.10.18 where cached results might be implicitly converted to DataFrames?
2. Are there specific steps to ensure protobuf messages are serialized/deserialized correctly when using caching and a remote Dask cluster?
3. Should I be using a custom serializer or adjusting how results are stored to prevent this conversion?
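For question 3, this is roughly the kind of serializer I have in mind. It is only a sketch under assumptions: `FakeMessage` is a hypothetical stand-in that mimics the `SerializeToString` / `FromString` methods of a generated protobuf class, and `REGISTRY`, `dumps`, and `loads` are names I made up for illustration. The idea is to store the message type next to the raw bytes so the cached result can only be rebuilt as that same type. My understanding is that in Prefect 2 this logic would go in the `dumps` / `loads` methods of a `prefect.serializers.Serializer` subclass passed as `result_serializer`, but I have not verified that against 2.10.18.

```python
import base64
import json


class FakeMessage:
    """Hypothetical stand-in for a generated protobuf message class."""

    def __init__(self, payload: bytes = b""):
        self.payload = payload

    def SerializeToString(self) -> bytes:
        # Real protobuf messages return their wire-format bytes here.
        return self.payload

    @classmethod
    def FromString(cls, data: bytes) -> "FakeMessage":
        # Real protobuf classes parse wire-format bytes into a new message.
        return cls(data)

    def __eq__(self, other) -> bool:
        return isinstance(other, FakeMessage) and self.payload == other.payload


# Assumption: an explicit map from a type name to the class that can parse it.
REGISTRY = {"FakeMessage": FakeMessage}


def dumps(msg) -> bytes:
    """Wrap the message bytes in a JSON envelope that records the type name."""
    envelope = {
        "type": type(msg).__name__,
        "data": base64.b64encode(msg.SerializeToString()).decode("ascii"),
    }
    return json.dumps(envelope).encode("utf-8")


def loads(blob: bytes):
    """Rebuild the original message type from the envelope."""
    envelope = json.loads(blob.decode("utf-8"))
    cls = REGISTRY[envelope["type"]]  # KeyError if the type was never registered
    return cls.FromString(base64.b64decode(envelope["data"]))
```

With something like this, `loads(dumps(msg))` should give back an instance of the registered message class rather than whatever a generic serializer decides to produce.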
Any insights or guidance would be greatly appreciated. Let me know if I can provide additional context!
Thanks in advance!