# ask-community
w
Unrelated topic, I’m having a bit of trouble using S3 storage for a flow. Nothing seems to get created at the key I specify, but I’m not getting any errors.
```
% python demo_flow.py
[2021-08-21 19:26:39-0400] INFO - prefect.FlowRunner | Beginning Flow run for 'increment a random sample'
[2021-08-21 19:26:39-0400] INFO - prefect.DaskExecutor | Creating a new Dask cluster with `dask_kubernetes.core.KubeCluster`...
Creating scheduler pod on cluster. This may take some time.
Forwarding from 127.0.0.1:61151 -> 8786
Forwarding from [::1]:61151 -> 8786
Handling connection for 61151
Handling connection for 61151
/Users/wilson.bilkovich/.pyenv/versions/3.9.6/envs/addemart/lib/python3.9/site-packages/distributed/client.py:1105: VersionMismatchWarning: Mismatched versions found

+-------------+-----------+-----------+---------+
| Package     | client    | scheduler | workers |
+-------------+-----------+-----------+---------+
| blosc       | None      | 1.10.2    | None    |
| dask        | 2021.08.0 | 2021.08.1 | None    |
| distributed | 2021.08.0 | 2021.08.1 | None    |
| lz4         | None      | 3.1.3     | None    |
+-------------+-----------+-----------+---------+
  warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"]))
Handling connection for 61151
[2021-08-21 19:27:05-0400] INFO - prefect.DaskExecutor | The Dask dashboard is available at http://localhost:8787/status
INFO:prefect.DaskExecutor:The Dask dashboard is available at http://localhost:8787/status
Handling connection for 61151
Handling connection for 61151
Handling connection for 61151
Handling connection for 61151
[2021-08-21 19:27:25-0400] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded
INFO:prefect.FlowRunner:Flow run SUCCESS: all reference tasks succeeded
```
k
Am a bit confused by the logs here. Is this just `flow.run()`? What happens when you register?
You can also try doing `flow.storage.build()`.
But actually I think the Flow is not uploaded for you if you use the key; it's assumed that it's already uploaded. Read the docstring for `local_script_path` here. I think that will clarify the behavior?
w
Aha, I see. What does `build()` do?
Ok, so if I set a key, set `stored_as_script=True`, and set the script path to point at the current file, it should upload it?
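For reference, a minimal sketch of that configuration might look like the following (bucket name, key, and file path are placeholder values, and this assumes the `prefect.storage.S3` API from Prefect 0.x):

```python
from prefect.storage import S3

# Hypothetical bucket/key/path values, for illustration only.
storage = S3(
    bucket="my-bucket",                # S3 bucket to upload into
    key="flows/demo_flow.py",          # exact key the flow is stored at
    stored_as_script=True,             # store the flow as a .py script, not a pickle
    local_script_path="demo_flow.py",  # local file to upload when building
)
```

Building this storage requires AWS credentials to be configured, so treat it as a sketch rather than a runnable script.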
k
`build` does the serialization/uploading to the specified storage. Registration does this also, but doing the `.build()` is a good way to test.
Yes, that's my understanding from the docs.
w
Hmm. `build()` seems to silently succeed, but I still don't end up with anything stored in S3 at the key I specified. Here's my code: https://gist.github.com/wilson/19064256d3c5930fa4b989b6d89c4de6
k
Will try this myself a bit later tonight. That does seem right though
w
Cool, thanks for looking
k
Ok, I went through the code and tried it myself. The correct usage is:
```
storage.add_flow(f)
storage.build()
```
But your code is right: if you do `flow.register("project_name")`, the storage adds the flow and gets built, so these calls are made for you and you don't need those lines (I think you know, but just making sure).