
Nate Atkins

04/15/2020, 2:56 PM
When I run it with the LocalResultHandler the xform_data stage is more or less skipped and I get an incorrect result.
INFO - prefect.Task: write_data | {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
If I look in the log I see that the xform_data stage isn't run. Not sure why that happens, but also not sure how the data passed to write_data is the data that was passed to xform_data. It seems like the two sets of cached data are the same and xform_data sees the cached data from load_data and gets skipped.
April 15th 2020 at 8:45:25am | prefect.CloudTaskRunner | DEBUG | Task 'xform_data': 1 candidate cached states were found
April 15th 2020 at 8:45:25am | prefect.LocalResultHandler | DEBUG | Starting to read result from /tmp/inter/prefect-result-2020-04-15t14-45-25-356078-00-00...
April 15th 2020 at 8:45:25am | prefect.LocalResultHandler | DEBUG | Finished reading result from /tmp/inter/prefect-result-2020-04-15t14-45-25-356078-00-00...
April 15th 2020 at 8:45:25am | prefect.CloudTaskRunner | DEBUG | Task 'xform_data': Handling state change from Pending to Cached
April 15th 2020 at 8:45:25am | prefect.CloudTaskRunner | DEBUG | Task 'xform_data': can't set state to Running because it isn't Pending; ending run.
April 15th 2020 at 8:45:25am | prefect.CloudTaskRunner | INFO | Task 'xform_data': finished task run for task with final state: 'Cached'
April 15th 2020 at 8:45:25am | prefect.CloudTaskRunner | INFO | Task 'write_data': Starting task run...
This seems to be the line that I don't expect.
Task 'xform_data': can't set state to Running because it isn't Pending; ending run.

josh

04/15/2020, 3:12 PM
I could be misinterpreting, but I believe the reason your xform_data task isn’t running is that you are choosing to cache it for 60 seconds. So if you run while that cache is still valid, it won’t run the task again.

Nate Atkins

04/15/2020, 4:16 PM
I have the same problem if I start a whole new local server. Also, the results returned from the cache don't show that xform has ever run on them.
INFO - prefect.Task: write_data | {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
When the xformed result should be
INFO - prefect.Task: write_data | {'col_1': [6, 4, 2, 0], 'col_2': ['a', 'b', 'c', 'd']}

josh

04/15/2020, 4:21 PM
Hm interesting. I’m running your snippet (without the json dump part) and it looks to be working for me
output.txt

Nate Atkins

04/15/2020, 4:26 PM
Yes, that looks like what I was expecting. I'm running 0.10.2, local server, on Ubuntu 18.04

josh

04/15/2020, 4:26 PM
Let me try a server run real quick
Yeah weird I am seeing the same behavior as you. Would you mind opening an issue for this?

Nate Atkins

04/15/2020, 4:36 PM
Can do. Thanks for confirming I'm not going nuts.

josh

04/15/2020, 5:07 PM
@Nate Atkins I was debugging this a bit and found that if you add a cache_key to the xform_data task, it will work as a temporary workaround. This is definitely a bug in server though, possibly something w/ the database keys. I also attempted this on cloud and everything worked as expected there.

Nate Atkins

04/15/2020, 5:08 PM
Thanks for the quick workaround Josh.
I opened the issue as https://github.com/PrefectHQ/prefect/issues/2343. I also confirmed that setting the cache_key does resolve the problem.
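
For anyone reading later, here is a toy model of the behavior described in this thread. It is not Prefect code and does not claim to show the actual cause of the bug: it only assumes, as a guess in line with "possibly something w/ the database keys", that tasks without an explicit cache_key end up looking in the same cache slot. The run_cached helper, the "<shared-default>" key, and run_flow are all made up for illustration; only the task names and the data come from the messages above.

```python
from typing import Callable, Dict, Optional

# Toy stand-in for a result cache. This is NOT Prefect's implementation.
_cache: Dict[str, dict] = {}


def run_cached(name: str, fn: Callable[[], dict], cache_key: Optional[str] = None) -> dict:
    """Run fn unless a value is already stored under the lookup key."""
    # Assumption being modelled: tasks with no explicit key share one slot.
    key = cache_key if cache_key is not None else "<shared-default>"
    if key in _cache:
        print(f"Task '{name}': found cached state, skipping run")
        return _cache[key]
    result = fn()
    _cache[key] = result
    return result


def load_data() -> dict:
    return {"col_1": [3, 2, 1, 0], "col_2": ["a", "b", "c", "d"]}


def xform_data(data: dict) -> dict:
    # Doubles col_1, matching the expected output quoted in the thread.
    return {**data, "col_1": [2 * v for v in data["col_1"]]}


def run_flow(xform_key: Optional[str] = None) -> dict:
    _cache.clear()  # start from an empty cache, like a fresh local server
    loaded = run_cached("load_data", load_data)
    return run_cached("xform_data", lambda: xform_data(loaded), cache_key=xform_key)


buggy = run_flow()                        # xform_data picks up load_data's cached result
fixed = run_flow(xform_key="xform_data")  # a distinct key forces xform_data to run
print("without cache_key:", buggy)
print("with cache_key:   ", fixed)
```

In this sketch the run without a key hands write_data the untransformed load_data output, and the run with its own key produces the doubled col_1, which mirrors the symptom and the workaround reported above. The real root cause is whatever gets tracked in the linked issue.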