# prefect-cloud
j
Hello! We're currently building a POC using prefect and are able to run our flow locally, triggered via prefect-cloud. We'd like to test out the managed compute pools but getting an error:
prefect.exceptions.ScriptError: Script at 'pipeline.py' encountered an exception: FileNotFoundError(2, 'No such file or directory')
After some research it seems like managed compute only supports using referenced git repos at this time. Our team uses a private bitbucket repo. Is this setup possible? Thanks!
a
@Jake Kaplan is that right?
j
Hey you should be able to use any pull step on a deployment, this includes reading from bitbucket: https://docs.prefect.io/latest/guides/prefect-deploy/?h=#the-pull-action
j
And that's supported with the managed worker pool? I can give that a shot. Is there an e2e example somewhere to reference?
j
If you're following the quick start tutorial the part you'd want to replace is here: https://docs.prefect.io/latest/guides/prefect-deploy/?h=#store-your-code-in-git-based-cloud-storage:
from prefect import flow
from prefect.runner.storage import GitRepository

if __name__ == "__main__":
    flow.from_source(
        source=GitRepository(
            url="your bitbucket url",
            credentials={
                "access_token": "..."
            },
        ),
        entrypoint="my_gh_workflow.py:repo_info",
    ).deploy(
        name="my-first-deployment",
        work_pool_name="my-managed-pool",
        cron="0 1 * * *",
    )
j
Got everything loading. I had to switch from GitRepository to BitBucketRepository because of what looks like an issue with converting a block to a secret. The latest issue is:
File "/Users/jasonwilson/Library/Caches/pypoetry/virtualenvs/ml-analytics-dev-MZJlIWH2-py3.12/lib/python3.12/site-packages/prefect/deployments/runner.py", line 876, in deploy
await deployment.apply(image=image_ref, work_pool_name=work_pool_name)
File "/Users/jasonwilson/Library/Caches/pypoetry/virtualenvs/ml-analytics-dev-MZJlIWH2-py3.12/lib/python3.12/site-packages/prefect/deployments/runner.py", line 272, in apply
[self.storage.to_pull_step()] if self.storage else []
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jasonwilson/Library/Caches/pypoetry/virtualenvs/ml-analytics-dev-MZJlIWH2-py3.12/lib/python3.12/site-packages/prefect/runner/storage.py", line 550, in to_pull_step
raise BlockNotSavedError(
prefect.blocks.core.BlockNotSavedError: Block must be saved with `.save()` before it can be converted to a pull step.
Code:
if __name__ == "__main__":
    flow.from_source(
        source=BitBucketRepository(
            repository="https://bitbucket.org/REDACTED.git",
            reference="jwil/add-prefect-setup",
            bitbucket_credentials=BitBucketCredentials.load("prefect-bitbucket-creds"),
        ),
        entrypoint="./workflows/src/workflows/pipeline.py:jwil_test",
    ).deploy(
        name="jason-deploy-from-git",
        work_pool_name="ManagedPrefectTestWorkPool",
        cron="0 1 * * *",
    )
j
Do you have a BitBucketCredentials block named `prefect-bitbucket-creds`? If not, running the following will create one in your workspace:
BitBucketCredentials(
    token="...",
    username="...",
    password="...",
    url="...",
).save(name="prefect-bitbucket-creds")
or create one in the UI on the blocks page
j
Yes, I created that earlier today
Out of curiosity I changed the block name in code and it fails with a lookup error so it's able to see the original block.
j
Apologies, I am not the most familiar with using bitbucket as a source. It took me a second to get a working test setup.
I believe you need to save the `BitBucketRepository` block as well, with the `BitBucketCredentials` attached. So after saving your creds and then saving your repo with your creds attached, it would look like this:
my_flow.from_source(
    BitBucketRepository.load("my-bitbucket-repo"),
    entrypoint="my_flow.py:my_flow",
).deploy(
    name="my-first-deployment",
    work_pool_name="my-modal-pool",
)
j
How can I provide a custom branch with the setup you provided? You'll notice I'm using the `reference` param in the provided sample. When I try adding it back I get:
TypeError: Flow.from_source() got an unexpected keyword argument 'reference'
j
`reference` should be set on the bitbucket repo block. You'll notice in my example the branch is `mian` because I typoed it 😅
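The two-step setup described above (save the credentials, then save a repository block that carries both the credentials and the branch) could look roughly like the sketch below. It assumes the `prefect-bitbucket` package is installed and that the credentials block is already saved as `prefect-bitbucket-creds`. The function name `save_bitbucket_repo_block` is made up for illustration, and the field names are the ones used earlier in this thread.

```python
def save_bitbucket_repo_block(repo_url: str, branch: str) -> None:
    """Save a BitBucketRepository block that carries its own branch and credentials."""
    # Imported lazily so this module still loads where prefect-bitbucket is absent.
    from prefect_bitbucket import BitBucketCredentials
    from prefect_bitbucket.repository import BitBucketRepository

    # Load the credentials block that was saved earlier in the thread.
    creds = BitBucketCredentials.load("prefect-bitbucket-creds")

    repo = BitBucketRepository(
        repository=repo_url,
        reference=branch,  # branch or tag: set here, not on flow.from_source()
        bitbucket_credentials=creds,
    )
    # "my-bitbucket-repo" is the block name loaded in the example above.
    repo.save("my-bitbucket-repo", overwrite=True)
```

Once the block is saved, `BitBucketRepository.load("my-bitbucket-repo")` can be passed to `from_source()` as in the snippet above.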
j
ahh i totally missed it in the popup. Thanks!
Progress, now I'm seeing this when actually trying to run in the console: KeyError: "No class found for dispatch key 'bitbucket-repository' in registry for type 'Block'."
Not seeing that referenced in code anywhere 🤔
j
🤔 do you have prefect-bitbucket installed? I assumed that's where you were importing the blocks from
pip install prefect-bitbucket
j
Yeah, we're using poetry...
The following packages are already present in the pyproject.toml and will be skipped:
- prefect-bitbucket
j
If you could share your deployment snippet and the traceback I can take a look?
j
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 154, in run_steps
step_output = await run_step(step, upstream_outputs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 125, in run_step
result = await from_async.call_soon_in_new_thread(
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 327, in aresult
return await asyncio.wrap_future(self.future)
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
result = await coro
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/pull.py", line 182, in pull_with_block
block = await Block.load(full_slug)
File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 78, in with_injected_client
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 841, in load
return cls._from_block_document(block_document)
File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 634, in _from_block_document
else cls.get_block_class_from_schema(block_document.block_schema)
File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 688, in get_block_class_from_schema
return cls.get_block_class_from_key(block_schema_to_key(schema))
File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 699, in get_block_class_from_key
return lookup_type(cls, key)
File "/usr/local/lib/python3.10/site-packages/prefect/utilities/dispatch.py", line 185, in lookup_type
raise KeyError(
KeyError: "No class found for dispatch key 'bitbucket-repository' in registry for type 'Block'."
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 426, in retrieve_flow_then_begin_flow_run
else await load_flow_from_flow_run(flow_run, client=client)
File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 78, in with_injected_client
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/deployments.py", line 282, in load_flow_from_flow_run
output = await run_steps(deployment.pull_steps)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 182, in run_steps
raise StepExecutionError(f"Encountered error while running {fqn}") from exc
prefect.deployments.steps.core.StepExecutionError: Encountered error while running prefect.deployments.steps.pull_with_block
08:16:37 PM  prefect.flow_runs  INFO  Process for flow run 'adept-slug' exited cleanly.
Here's the stacktrace, thanks!
To be clear this is the error I'm getting in the console when running the deployment.
j
ah i'm so sorry, this is when reading the deployment! I thought you were still seeing the errors locally and was confused. my fault!
so `prefect-bitbucket` is not installed on the container, but there are ways to get it installed
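A quick way to confirm this kind of mismatch is to check, from inside the environment that runs the flow, whether the block's package can be imported at all. The sketch below uses only the standard library. The helper name `is_importable` is made up for illustration, and `prefect_bitbucket` is the import name of the `prefect-bitbucket` package.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this where the flow executes, not only on the machine that deploys it.
    for name in ("prefect", "prefect_bitbucket"):
        status = "installed" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

If the package shows up locally but is missing in the runtime environment, that would be consistent with the dispatch-key `KeyError` appearing only when the deployment runs.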
j
Can we run poetry install on the container or during pull?
j
yes! however your toml file i'm assuming is in your repo, so you'll need to add this to the work pool:
j
I don't have a toml file since I'm using the code first approach
Ok, that fixed the issue and my basic "hello world" app is working! If I want to add poetry install, where would that go?
Can I add it to the python setup from above or does it need to be done using a deployment.yaml?
j
nice! okay 1 more hurdle out of the way
if you did want to use poetry install you would need to use `prefect deploy` and add it to your pull steps in `prefect.yaml`
j
Ok, I was looking into that but the docs were a little confusing. Some reference deployment.yaml, and when I tried that the cli said it was deprecated.
Can you point me to a minimal working prefect.yaml?
j
# Welcome to your prefect.yaml file! You can use this file for storing and managing
# configuration for deploying your flows. We recommend committing this file to source
# control along with your flow code.

# Generic metadata about this project
name: myproject
prefect-version: v2.14.6

# build section allows you to manage and build docker images
build:

# push section allows you to manage if and how this project is uploaded to remote locations
push:

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.deployments.steps.git_clone:
    repository: https://bitbucket.org/jakesworkspace/myproject.git
    branch: mian
    access_token: # <your-access-token>
- prefect.deployments.steps.run_shell_script:
    id: install-poetry
    script: pip install poetry
- prefect.deployments.steps.run_shell_script:
    id: install-dependencies
    script: # your poetry command here

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: my-bitbucket-deploy
  version:
  tags: []
  description:
  entrypoint: my_flow.py:my_bitbucket_flow
  parameters: {}
  work_pool:
    name: my-managed-pool
    work_queue_name:
    job_variables: {}
  schedule:
if you run `prefect deploy` you should see those pull steps on your deployment
we support pip install as its own out-of-the-box step; for poetry you'll need to run the commands directly. I am not super familiar with poetry, but you may need to install the dependencies directly into the system python, since that's the interpreter you are in at runtime
j
Thanks. Do you have a link to the docs for this? I've been having trouble finding consistent information
It can't seem to pull down the git repo now... similar to what we saw yesterday before switching to a bitbucket repo. Does this look correct?
- prefect.deployments.steps.git_clone:
    repository: https://bitbucket.org/REDACTED.git
    branch: jwil-add-prefect-setup
    credentials: "{{ prefect.blocks.bitbucket-credentials.prefect-bitbucket-creds }}"
This is the stacktrace:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 154, in run_steps
step_output = await run_step(step, upstream_outputs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 125, in run_step
result = await from_async.call_soon_in_new_thread(
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 327, in aresult
return await asyncio.wrap_future(self.future)
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 352, in _run_sync
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 259, in coroutine_wrapper
return call()
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 432, in call
return self.result()
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 318, in result
return self.future.result(timeout=timeout)
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 179, in result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
result = await coro
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/pull.py", line 123, in git_clone
await storage.pull_code()
File "/usr/local/lib/python3.10/site-packages/prefect/runner/storage.py", line 233, in pull_code
await self._clone_repo()
File "/usr/local/lib/python3.10/site-packages/prefect/runner/storage.py", line 261, in _clone_repo
raise RuntimeError(
RuntimeError: Failed to clone repository 'https://bitbucket.org/redacted.git' with exit code 128.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 426, in retrieve_flow_then_begin_flow_run
else await load_flow_from_flow_run(flow_run, client=client)
File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 78, in with_injected_client
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/deployments.py", line 282, in load_flow_from_flow_run
output = await run_steps(deployment.pull_steps)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 182, in run_steps
raise StepExecutionError(f"Encountered error while running {fqn}") from exc
prefect.deployments.steps.core.StepExecutionError: Encountered error while running prefect.deployments.steps.git_clone
12:28:55 PM  prefect.flow_runs  INFO  Process for flow run 'illustrious-whippet' exited cleanly.
There's no load events for the block so it doesn't seem like it's getting loaded...
j
that looks right to me, I am able to get this to clone using the credentials block:
- prefect.deployments.steps.git_clone:
    repository: https://bitbucket.org/jakesworkspace/myproject.git
    branch: mian
    credentials: "{{ prefect.blocks.bitbucket-credentials.bitbucket-creds }}"
and the access token directly:
- prefect.deployments.steps.git_clone:
    repository: https://bitbucket.org/jakesworkspace/myproject.git
    branch: mian
    access_token: "{{ prefect.blocks.bitbucket-credentials.bitbucket-creds.token }}"
can you try using an access token directly instead of a block? that may help debug
j
Hmmm, I tried both of those techniques with the same result. Can you share a screenshot of your credentials setup please?
j
j
Is your branch really named `mian`?
j
it's just the name and the token
it unfortunately is, I typo'd on creation
j
Curiously, this credentials block is referenced by the bitbucket repo block we set up yesterday, and that works.
j
can you go to your deployment and check the configuration tab?
j
Yes, it looks the same
Let me try running with a hardcoded value
Aight, I had a typo in my branch name too! Maybe exposing the underlying error could help w debugging.
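Since the clone step only surfaced `exit code 128`, one way to rule out a branch-name typo before deploying is to ask the remote directly. The sketch below wraps `git ls-remote --exit-code --heads`, which exits non-zero when no matching branch is found. The helper names are made up for illustration, and it assumes `git` is on the PATH and can already authenticate against the private repo.

```python
import subprocess


def ls_remote_command(repo_url: str, branch: str) -> list[str]:
    """Build a git command that exits non-zero when the branch is not on the remote."""
    return ["git", "ls-remote", "--exit-code", "--heads", repo_url, branch]


def branch_exists(repo_url: str, branch: str) -> bool:
    """Return True if git finds a matching branch on the remote.

    False can also mean the remote was unreachable or authentication failed;
    git's stderr is left visible so the underlying reason is printed.
    """
    result = subprocess.run(
        ls_remote_command(repo_url, branch),
        stdout=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

For example, `branch_exists("https://bitbucket.org/REDACTED.git", "jwil/add-prefect-setup")` could be run locally to compare the branch name in the deployment against what the remote actually has.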
Now I'm getting this error:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 154, in run_steps
step_output = await run_step(step, upstream_outputs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 125, in run_step
result = await from_async.call_soon_in_new_thread(
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 327, in aresult
return await asyncio.wrap_future(self.future)
File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
result = await coro
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/utility.py", line 164, in run_shell_script
commands = script.splitlines()
AttributeError: 'NoneType' object has no attribute 'splitlines'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 426, in retrieve_flow_then_begin_flow_run
else await load_flow_from_flow_run(flow_run, client=client)
File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 78, in with_injected_client
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/deployments.py", line 282, in load_flow_from_flow_run
output = await run_steps(deployment.pull_steps)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 182, in run_steps
raise StepExecutionError(f"Encountered error while running {fqn}") from exc
prefect.deployments.steps.core.StepExecutionError: Encountered error while running prefect.deployments.steps.run_shell_script
Pretty sure it's related to a commented out line... one sec
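The `'NoneType' object has no attribute 'splitlines'` error is what you would expect when a `script:` key has only a comment after it, because YAML loads that value as null. A small pre-deploy check along these lines could catch it early. This is a sketch that works on already-parsed pull steps (plain dicts), and the function name `find_empty_scripts` is made up for illustration.

```python
def find_empty_scripts(pull_steps: list[dict]) -> list[str]:
    """Return ids (or positions) of run_shell_script steps whose script is empty."""
    step_name = "prefect.deployments.steps.run_shell_script"
    problems = []
    for index, step in enumerate(pull_steps):
        for name, options in step.items():
            if name != step_name:
                continue
            options = options or {}
            script = options.get("script")
            if not script or not str(script).strip():
                problems.append(options.get("id", f"step #{index}"))
    return problems


# Mirrors the shell steps from the prefect.yaml above after YAML parsing:
# a value that is only a comment is loaded as None.
steps = [
    {"prefect.deployments.steps.run_shell_script": {"id": "install-poetry", "script": "pip install poetry"}},
    {"prefect.deployments.steps.run_shell_script": {"id": "install-dependencies", "script": None}},
]
print(find_empty_scripts(steps))  # ['install-dependencies']
```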
Ok, cool now I'm onto poetry errors so I think we're good to go! Thanks for working through this
I think the working directory is wrong... looks like things are cloned to /opt/prefect/...?
Poetry could not find a pyproject.toml file in /opt/prefect or its parents
j
`/opt/prefect` I think is the current dir, so your poetry command should reference the cloned repo inside of that. for example, here's how it's done w/ the pip install step:
pull:
    - prefect.deployments.steps.git_clone:
        id: clone-step  # needed in order to be referenced in subsequent steps
        repository: https://github.com/org/repo.git
    - prefect.deployments.steps.pip_install_requirements:
        directory: "{{ clone-step.directory }}"  # `clone-step` is a user-provided `id` field
        requirements_file: requirements.txt
alternatively, I think you can run another shell command to change the directory to the newly cloned repo?
sorry, I am not super familiar with the poetry syntax. does it have to be run at the same level as your repo/toml?
j
yeah, ours is unfortunately in a nested folder.
- prefect.deployments.steps.run_shell_script:
    directory: "{{ clone-step.directory }}/workflows/src"
    id: install-dependencies
    script: poetry install
This is evaluating to just "workflows/src", with no directory prefix. Does that look correct?
I also tried hardcoding `/opt/prefect/...` but that also failed
j
you need to add the id to your previous step maybe?
- prefect.deployments.steps.git_clone:
    id: clone-step
so `{{ clone-step.directory }}` can resolve correctly
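To see why the missing `id` leaves the placeholder without a prefix, here is a toy model of step-output templating. It is not Prefect's implementation, only a sketch of the idea that each step's outputs are stored under its `id` and later steps look them up by that name. The directory value `myproject` is a hypothetical clone output.

```python
import re

# Matches placeholders such as {{ clone-step.directory }}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)\.(\w+)\s*\}\}")


def resolve(value: str, step_outputs: dict[str, dict[str, str]]) -> str:
    """Replace {{ step-id.key }} with the output recorded under that step id."""
    def substitute(match: re.Match) -> str:
        step_id, key = match.group(1), match.group(2)
        # Unknown ids resolve to an empty string in this sketch.
        return step_outputs.get(step_id, {}).get(key, "")
    return PLACEHOLDER.sub(substitute, value)


template = "{{ clone-step.directory }}/workflows/src"

# No step declared `id: clone-step`, so nothing is recorded under that name.
print(resolve(template, {}))  # /workflows/src

# With the id in place, the clone step's output is available downstream.
print(resolve(template, {"clone-step": {"directory": "myproject"}}))  # myproject/workflows/src
```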
j
oh duh, ok that wasn't obvious
j
totally my fault, I honestly forgot that was needed
j
Failed due to a `RuntimeError` - The configured memory limit was reached during the flow run, causing a non-zero status code.
This is just a hello world example so assuming it crashed during poetry install
j
I think you're correct about poetry install 😅 so for prefect-managed there is a 2gb memory footprint limit: https://docs.prefect.io/latest/guides/managed-execution/?h=mana#resources
the good news though is everything you've already done is applicable and ready to go for any other work pool (infra) you choose. it's totally swappable
j
Oh man... I've come so far... yet so far away haha
Is there any way to get a memory quota increase at this time? I can go in the other direction just going to be more scrutiny from a security perspective.
j
Unfortunately not at this time. You can think of the work pool options available to you like this:
• `managed_pool` - prefect takes care of both the deployment and execution of your code (inside of prefect cloud)
• `push_pool` - prefect cloud handles the deployment (by creating runs in your infra). your infra handles code execution.
• `hybrid_pools` - you run a worker in your infra env to poll for work to deploy. your infra handles code execution.
depending on what you choose, it might translate to less scrutiny.
• here's a guide for getting started with push pools
• and one for using a hybrid pool with a worker