Tony

02/19/2021, 6:59 PM
I tried to do some searching for
flow not found
when using Docker storage, but I’m getting mixed results. I’m working on a CI/CD script for any arbitrary repo that loops through a directory of flow files, builds container storage (many containers or a single one, I’m fine with either), and then registers the flows. It starts by copying and building my repo into a container, then hopefully builds that container with all the Prefect goodies, and sets the
run_config
:
image_name = f"{git_repo}/prefect-{Path(flow_file).stem.lower()}"
image_tag = f"githash-{git_commit}"
# built by cicd agent
base_image = f"{REGISTRY_URL}/{git_repo}/flows:{image_tag}"
# second image built
base_image_plus_prefect = f"{REGISTRY_URL}/{image_name}:{image_tag}"

flow.storage = Docker(
    registry_url=REGISTRY_URL,
    base_image=base_image,
    image_name=image_name,
    image_tag=image_tag,
    env_vars=fetch_env_vars(os.environ["GIT_BRANCH"]),
    # here be the trouble
    # same path as my Dockerfile???
    path=f"/repo/{flow_file}",
    stored_as_script=True,
)
flow.storage.build()

flow.run_config = DockerRun(
    image=base_image_plus_prefect,
    labels=fetch_agent_labels(os.environ["GIT_BRANCH"]),
)
... cicd stuff ...
flow.register()
I’m getting:
ValueError('Flow is not contained in this Storage')
when running this currently. First option I see is specifying
Docker(…, files={…})
but some repos might have dozens or hundreds of extra files that they need to include. Any chance
files
accepts wildcard paths? The second option I see is [multi-flow storage], but then I run into CI/CD problems:
File "/opt/prefect/healthcheck.py", line 151, in <module>
    flows = cloudpickle_deserialization_check(flow_file_paths)
  File "/opt/prefect/healthcheck.py", line 44, in cloudpickle_deserialization_check
    flows.append(cloudpickle.loads(flow_bytes))
ModuleNotFoundError: No module named 'flows.<my flow name>'
Anyone have some better ideas?

Zanie

02/19/2021, 7:05 PM
Hi! I'm a bit confused by what you're trying for here. If you want to ingress an arbitrary repo you can just:
• Find all flow files
• Import the flow object from each file
• For each flow object, set the flow storage to Docker
• For each flow object, register the flow (which builds the storage by default)
It's also worth noting that
files
can take a folder so you can copy a bunch of files that way. For wildcarding you'd need to traverse the tree yourself.
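Traversing the tree yourself, as suggested above, could look roughly like the following. This is a minimal sketch, not something posted in the thread: `collect_files`, its default pattern, and the `/repo` destination are hypothetical, and it assumes the `files` argument of Docker storage expects a dict of absolute local paths mapped to paths inside the image.

```python
from pathlib import Path


def collect_files(local_root, pattern="**/*.py", container_root="/repo"):
    """Map absolute local paths to container paths for files matching a glob.

    Hypothetical helper: the resulting dict is shaped like the mapping that
    Docker storage's `files` argument is assumed to accept.
    """
    root = Path(local_root).resolve()
    prefix = container_root.rstrip("/")
    mapping = {}
    for path in sorted(root.glob(pattern)):
        # Skip directories that happen to match the pattern.
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            mapping[str(path)] = f"{prefix}/{relative}"
    return mapping
```

The result could then be passed as `Docker(..., files=collect_files("."))`, with the pattern adjusted per repo.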
👍 1

Tony

02/19/2021, 7:09 PM
Thanks, I’ll have to look at that folder option. Before I do: I think I’m already doing what you’re describing with this code snippet. Is there a problem with the parameter setup for
Docker
I have that’s resulting in the
ValueError()
?

Zanie

02/19/2021, 7:10 PM
Another option may be to build a docker image that contains the entire repo then just use
LocalStorage
that points to the path of each flow and a
DockerRun
using that single base image.
That error is confusing, I'll see if I can sort out what's happening.
Ah you're calling
flow.storage.build()
before
storage.add_flow(flow)
has been called
I would allow
flow.register
to build the storage for you, it takes care of adding the flow to the storage.

Tony

03/02/2021, 2:03 PM
Thanks, sorry I walked into the woods for a week after posting this. Due to company security policy I can't have internet access and secrets on the same machine, so while I would like to build with
flow.register()
it's not possible 😞

Zanie

03/02/2021, 2:19 PM
Then make sure you call
flow.storage.add_flow(flow)
before
build()
🙂
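The ordering described above can be sketched as a small helper. It is a minimal sketch under assumptions: `build_then_register`, `project_name`, and `push` are hypothetical names for illustration, and it relies on the Prefect release current at the time of this thread, where storage objects expose `add_flow()` and `build()` and `Flow.register()` accepts `build=False`.

```python
def build_then_register(flow, project_name, push=True):
    """Add the flow to its storage, build the storage, then register.

    Hypothetical helper illustrating call order only; `flow` is expected
    to be a Prefect Flow whose `storage` attribute is already set.
    """
    # The storage must know about the flow before it is built; skipping
    # this step is what raises "Flow is not contained in this Storage".
    flow.storage.add_flow(flow)

    # Build the image explicitly (pushing to the registry is optional).
    flow.storage.build(push=push)

    # Register without asking Prefect to build the storage a second time.
    return flow.register(project_name=project_name, build=False)
```

If the build and the registration have to happen on different machines, the two halves of this helper would need to be split accordingly.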

Tony

03/02/2021, 4:20 PM
Just to check my understanding: when the Docker agent realizes it needs to run a flow, it pulls down the container and, from my best guess at reading the code, executes:
prefect execute flow-run
. Can you help me understand what
flow-run
is here? I pulled down the container that prefect "created" and can get
python my_flow.py
to run, but can't seem to get it to run this execute thing:
root@c8ece762e0cf:/repo# prefect execute flow-run
Not currently executing a flow within a Cloud context.
Traceback (most recent call last):
  File "/usr/local/bin/prefect", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/prefect/cli/execute.py", line 37, in flow_run
    raise Exception("Not currently executing a flow within a Cloud context.")
Exception: Not currently executing a flow within a Cloud context.

Zanie

03/02/2021, 4:24 PM
The flow run that should be executed is populated by the
Agent
when it creates the container (using environment variables
PREFECT__CONTEXT__FLOW_RUN_ID...
)
You should use a
DockerAgent
and the UI or the
create_flow_run
mutation to run the stored flow. I would not recommend downloading the generated container and trying to run the flow manually.
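To illustrate why the manual invocation fails, here is a simplified sketch of the kind of check involved. It is not Prefect's actual code: `resolve_flow_run_id` is a hypothetical function, and reading the environment variable directly is an assumption made for illustration (the real command appears to look the ID up through Prefect's context, which the agent fills in via variables such as `PREFECT__CONTEXT__FLOW_RUN_ID`).

```python
import os


def resolve_flow_run_id(env=None):
    """Return the flow run ID the agent injected, or fail as the CLI does.

    Hypothetical simplification: looks only at the environment mapping.
    """
    env = os.environ if env is None else env
    flow_run_id = env.get("PREFECT__CONTEXT__FLOW_RUN_ID")
    if not flow_run_id:
        # Same message as in the traceback above: without an agent there
        # is no flow run ID, so there is nothing to execute.
        raise RuntimeError(
            "Not currently executing a flow within a Cloud context."
        )
    return flow_run_id
```

A container started by hand has no such variable set, which matches the error shown in the traceback.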