<@ULVA73B9P> how do I set extra pip packages in a ...
# ask-community
m
@Marvin how do I set extra pip packages in a managed deployment in the Flow.deploy() function in a python sdk?
m
thought for 71.4 seconds
In Prefect 3.x, for a Managed work pool you pass extra pip packages via the `job_variables` argument to `Flow.deploy()`. The key is `pip_packages` and it expects a list of pip requirement strings. Example:
from prefect import flow

@flow
def my_flow():
    return "hello"

if __name__ == "__main__":
    my_flow.deploy(
        name="my-managed-deployment",
        work_pool_name="managed",
        job_variables={
            "pip_packages": [
                "pandas==2.2.2",
                "pyarrow>=15",
                "prefect-aws>=0.4.0",
            ],
        },
    )
Notes:
- `pip_packages` entries can be any valid pip requirement specifier (pinned versions, ranges, extras).
- These packages are installed into the managed runtime before your flow runs.
Docs for more details:
- Managed work pool
- Customize job variables
m
@Marvin I'm using a deployment with the following but it crashes, no relevant logs besides
Flow run infrastructure exited with non-zero status code:
Essential container in task exited (Error Code: 1)
{
  "pip_packages": [
    "dataflows@git+<https://x-access-token:XXXXXX@github.com/XXXXX/XXXXX.git@main#subdirectory=XXXX>"
  ]
}
m
thought for 176.8 seconds
Thanks for sharing that snippet. A couple of things in there can cause the managed run to fail before you see useful logs.
What to fix first:
- Remove the angle brackets. Slack wraps URLs like `<https://...>`, but those literal `<` and `>` will break pip parsing if copied into your `job_variables`.
- Use the PEP 508 direct-reference syntax for VCS requirements: `name @ git+https://...` (conventionally written with a space on both sides of `@`, although PEP 508 treats that whitespace as optional), or the legacy `#egg=` form.
- Make sure your token is URL-encoded (tokens sometimes contain characters that must be percent-encoded) and has access to the repo.
- Prefer pinning to a commit SHA instead of a branch for reliability.
Working examples:
- PEP 508 style (recommended):
{
  "pip_packages": [
    "dataflows @ git+https://x-access-token:<YOUR_TOKEN>@github.com/OWNER/REPO.git@main#subdirectory=path/to/pkg"
  ]
}
- Legacy `#egg=` style (also valid):
{
  "pip_packages": [
    "git+https://x-access-token:<YOUR_TOKEN>@github.com/OWNER/REPO.git@main#egg=dataflows&subdirectory=path/to/pkg"
  ]
}
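One way to avoid hand-editing the URL is to build the requirement string in Python and percent-encode the token on the way. The following is a minimal sketch using only the standard library; the helper name `build_git_requirement`, its parameters, and the sample token are hypothetical and not part of Prefect or pip.

```python
from urllib.parse import quote


def build_git_requirement(name, token, owner, repo, ref,
                          subdirectory=None, user="x-access-token"):
    """Return a PEP 508 style direct reference for a private GitHub repo.

    Hypothetical helper for illustration only.
    """
    # Percent-encode every reserved character in the token so that
    # characters such as '/' or '@' cannot break the URL structure.
    safe_token = quote(token, safe="")
    url = f"git+https://{user}:{safe_token}@github.com/{owner}/{repo}.git@{ref}"
    if subdirectory:
        url += f"#subdirectory={subdirectory}"
    return f"{name} @ {url}"


# Sample values; the token here is made up to show the encoding.
requirement = build_git_requirement(
    name="dataflows",
    token="abc/123@x",
    owner="OWNER",
    repo="REPO",
    ref="main",
    subdirectory="path/to/pkg",
)

# The resulting string can be placed in the deployment's job variables.
job_variables = {"pip_packages": [requirement]}
print(requirement)
```

The resulting `job_variables` dict has the same shape as the JSON examples above, so it could be passed to `Flow.deploy(job_variables=...)` as in the first example of this thread.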
Notes:
- Do not include literal angle brackets in the actual value. Only replace `<YOUR_TOKEN>` with your real token (URL-encoded).
- If you’re using a GitHub App installation token, the `x-access-token:<TOKEN>` pattern is fine. If you’re using a PAT, `https://<USERNAME>:<TOKEN>@github.com/...` also works.
- If you continue to see failures, the most robust alternative is to build a custom image with this dependency pre-installed and pass it to `Flow.deploy(image=...)`. This avoids VCS installs at runtime entirely.
If it still fails:
- Could you paste the exact `pip_packages` entry you’re using (with the token redacted), making sure it’s without Slack’s `< >` markup?
- Is the repo private? If so, confirm the token scopes (read access to that repo).
- If you can, try swapping `@main` for a commit SHA to rule out branch-related changes.
Related docs:
- Managed work pool
- Customize job variables
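To rule out the Slack markup issue before redeploying, a small check like the one below could be run against each `pip_packages` entry. It is only a sketch: the function names are hypothetical, and it assumes Slack's usual link markup of `<https://...>` or `<https://...|label>`.

```python
import re

# Slack link markup: <https://example.com> or <https://example.com|label>.
# Nested '<' is excluded so placeholders like <YOUR_TOKEN> are left alone.
_SLACK_LINK = re.compile(r"<(https?://[^<>|\s]+)(?:\|[^>]*)?>")


def strip_slack_link_markup(entry: str) -> str:
    """Remove the angle brackets Slack adds around URLs."""
    return _SLACK_LINK.sub(r"\1", entry)


def has_angle_brackets(entry: str) -> bool:
    """Report whether any literal '<' or '>' remains in the entry."""
    return "<" in entry or ">" in entry


# The entry as it was pasted earlier in this thread.
pasted = (
    "dataflows@git+<https://x-access-token:XXXXXX@github.com/"
    "XXXXX/XXXXX.git@main#subdirectory=XXXX>"
)
cleaned = strip_slack_link_markup(pasted)
print(cleaned)
print("still has angle brackets:", has_angle_brackets(cleaned))
```

If `has_angle_brackets` still returns `True` after cleaning, the entry most likely contains an unreplaced placeholder such as `<YOUR_TOKEN>`.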
m
it still fails, if only I could see some logs from the container it's running on, it'd be so much more helpful