Cab Maddux
03/23/2020, 2:59 PMA
containing task X
which has input arg1
, takes a long time to run and returns a string. It uses GCSResultHandler
, with cache_for=datatime.timedelta(hours=1)
and cache_validator=prefect.engine.cache_validators.all_inputs
:
1. A run of flow A
is triggered and the input for task X
is arg1='abc'
and the output is applebananacarrot
2. 5 minutes another run of flow A
is triggered and the input for task X
is arg1='def'
and the output is danceearflight
3. 5 minutes later another run of flow A
is triggered and the input for task X
is arg1=abc
After #1 we'll have a cached value for task X
and input abc
pointing to GCS URI storing pickled applebananacarrot
. After #2 we'll have and additional cached value for task X
and input def
pointing to GCS URI storing pickled danceearflight
. Then #3 will pull GCS URI for pickled applebananacarrot
from cache. Is that correct?josh
03/23/2020, 3:07 PMCab Maddux
03/23/2020, 3:10 PMjosh
03/23/2020, 3:18 PMabc
then it should pull the GCS URI for the pickled applebananacarrot
Cab Maddux
03/23/2020, 3:33 PMjosh
03/23/2020, 3:38 PMCab Maddux
03/23/2020, 7:50 PMjosh
03/24/2020, 8:54 PMCab Maddux
03/24/2020, 8:58 PMjosh
03/24/2020, 9:05 PMCab Maddux
03/24/2020, 9:08 PMjosh
03/24/2020, 9:10 PMPREFECT__FLOWS__CHECKPOINTING="true"
Cab Maddux
03/24/2020, 9:12 PMjosh
03/24/2020, 9:13 PMCab Maddux
03/25/2020, 1:25 AMjosh
03/25/2020, 3:30 AM- name: PREFECT__LOGGING__LEVEL
value: "DEBUG"
Cab Maddux
03/25/2020, 3:32 AMdef passthrough(val): return val
), but wondering what pattern you would suggest here (that way rather than downloading a file, I'm only deserializing JSON to check inputs for all cached states)josh
03/25/2020, 7:28 PMCab Maddux
03/25/2020, 8:11 PM