# ask-community

  • aadi i
    09/03/2025, 7:17 PM
    How can I dynamically pass parameters to a Prefect flow running as a Kubernetes Job (Pod) or Docker container? I'm currently using the `flow.from_source()` method, which downloads the flow code from an S3 bucket, builds the image dynamically, and then runs the flow. However, I'd like to avoid building the Docker image at runtime. Is there a way to use a prebuilt Docker image (pulled from a registry) and still pass parameters dynamically — preferably through a Pythonic method or REST API — without relying on deployment templates or environment variables? I understand that `job_variables` can be set dynamically and overridden during a flow run, but I'm looking for an alternative that allows passing parameters (like `flow_data`) more flexibly — ideally at runtime — in a way that automatically maps values from `flow_data` into `job_variables` and passes them as arguments to the flow entrypoint, while using a prebuilt Docker image.
    ```python
    flow_from_source = await flow.from_source(
        source=s3_bucket_block,
        entrypoint="flows/bill_flow.py:bill_assessment_flow"
    )
    
    flow_dependencies = get_flow_dependencies()
    
    deployment = await flow_from_source.deploy(
        name=PREFECT_DEPLOYMENT_NAME,
        tags=["billing"],
        work_pool_name="kubernetes-pool",
        schedule=None,
        push=False,  # Skip pushing image
        job_variables={
            "finished_job_ttl": 100,
            # "image": "mat/prefect-k8s-worker:15",  # Uncomment to use a custom prebuilt image
            "namespace": "prefect",
            "env": {
                "PREFECT_API_URL": "<http://prefect-server:4200/api>",
                "EXTRA_PIP_PACKAGES": flow_dependencies,
                "PYTHONPATH": "/opt/prefect/"
            }
        }
    )
    
    app.state.deployment_id = deployment
    
    flow_run = await client.create_flow_run_from_deployment(
        deployment_id=request.app.state.deployment_id,
        tags=run_tags,
        parameters={
            "flow_data": {
                "source_provider": source_provider,
                "target_provider": target_provider,
                "company_id": company_id,
                "company_name": company_name,
                "assessment_task_id": assessment_task_id
            }
        }
    )
    
    logger.info(f"Created flow run with ID: {flow_run.id}")
    ```
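
    One way to avoid the runtime build, sketched under the assumption that the commented-out image above already contains the flow's dependencies: deploy once with `image=` and `build=False`, then map `flow_data` into `parameters` (and, where useful, into per-run `job_variables`) when creating the run. The `COMPANY_ID` variable below is purely illustrative.
    ```python
    # Sketch only: reuses names from the snippet above (flow_from_source, client,
    # PREFECT_DEPLOYMENT_NAME, flow_data); the image tag and COMPANY_ID are illustrative.
    deployment_id = await flow_from_source.deploy(
        name=PREFECT_DEPLOYMENT_NAME,
        work_pool_name="kubernetes-pool",
        image="mat/prefect-k8s-worker:15",  # prebuilt image pulled from the registry
        build=False,  # don't build an image at deploy time
        push=False,   # don't push one either
        job_variables={"namespace": "prefect", "finished_job_ttl": 100},
    )

    flow_run = await client.create_flow_run_from_deployment(
        deployment_id=deployment_id,
        parameters={"flow_data": flow_data},  # passed as arguments to the flow entrypoint
        job_variables={                       # per-run infrastructure overrides
            "env": {"COMPANY_ID": str(flow_data["company_id"])}
        },
    )
    ```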

  • Trey Gilliland
    09/04/2025, 12:11 AM
    I'm running into an issue where around 30% of my Prefect runs for one of my deployments fail at the git_clone and set_working_directory steps because the directory the code would be cloned into does not exist. This is especially weird since it works most of the time, but roughly 30% of the time the clone fails. I tried adding a shell script to sleep between the clone and setting the working directory in case it was a race condition from the cloning, to no avail. When I add a debug step to run `ls /`, on successful runs the code is cloned to the right place under `/code-main`, and on failed runs that directory is not there. There are no logs from the git clone step to suggest that the clone fails. The GitHub token is valid and it does work most of the time. Any ideas? This is on Prefect Cloud using a Modal work pool.

  • Kiran
    09/04/2025, 7:26 AM
    hey @Marvin, if I'm using retries but not the retry_delay_seconds parameter with the flow decorator, do retries still work?
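
    For context, a quick sketch: `retries` works on its own, and `retry_delay_seconds` simply defaults to 0, so each retry starts immediately after a failure.
    ```python
    from prefect import flow

    # retries works without retry_delay_seconds; the delay just defaults to 0,
    # so a failed run is retried immediately.
    @flow(retries=2)
    def my_flow():
        ...
    ```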

  • Shareef Jalloq
    09/04/2025, 8:17 PM
    Hi all, new Prefect user here. I've managed to set up a simple deployment and can get it running as a dev instance using SQLite. Now I'm trying to run it under supervisord and connect to a PostgreSQL server, but no matter what I do I can't get the server to start successfully under supervisord. I have a simple bash script that sets up the env and runs the server, and an `apps` user that is used to run all apps on this server.
    ```bash
    web-server-01:/var/log/apps# cat /opt/apps/fpga-automation/start_prefect.sh
    #!/bin/bash
    source /opt/apps/fpga-automation/venv/bin/activate
    export PREFECT_API_DATABASE_CONNECTION_URL="postgresql+asyncpg://prefect_user:prefectdb@10.10.8.114:5432/prefect_automation"
    export PREFECT_HOME="/opt/apps/prefect"
    export HOME="/home/apps"
    export PREFECT_LOGGING_LEVEL=DEBUG
    exec prefect server start --host 127.0.0.1 --port 4200
    ```
    And the error looks like a DNS failure?
    ```
    ___ ___ ___ ___ ___ ___ _____
    | _ \ _ \ __| __| __/ __|_   _|
    |  _/   / _|| _|| _| (__  | |
    |_| |_|_\___|_| |___\___| |_|
    
    Configure Prefect to communicate with the server with:
    
    prefect config set PREFECT_API_URL=http://127.0.0.1:4200/api

    View the API reference documentation at http://127.0.0.1:4200/docs

    Check out the dashboard at http://127.0.0.1:4200
    
    
    
    20:13:29.728 | ERROR   | prefect.server.utilities.postgres_listener - Failed to establish raw asyncpg connection for LISTEN/NOTIFY: [Errno -2] Name does not resolve
    Traceback (most recent call last):
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/prefect/server/utilities/postgres_listener.py", line 71, in get_pg_notify_connection
        conn = await asyncpg.connect(**connect_args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connection.py", line 2421, in connect
        return await connect_utils._connect(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 1075, in _connect
        raise last_error or exceptions.TargetServerAttributeNotMatched(
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 1049, in _connect
        conn = await _connect_addr(
               ^^^^^^^^^^^^^^^^^^^^
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 886, in _connect_addr
        return await __connect_addr(params, True, *args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 931, in __connect_addr
        tr, pr = await connector
                 ^^^^^^^^^^^^^^^
      File "/opt/apps/fpga-automation/venv/lib/python3.12/site-packages/asyncpg/connect_utils.py", line 802, in _create_ssl_connection
        tr, pr = await loop.create_connection(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "uvloop/loop.pyx", line 1982, in create_connection
    socket.gaierror: [Errno -2] Name does not resolve
    ```
    If I run the script as root or the apps user it works fine. What am I missing?
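
    For comparison, a minimal supervisord program entry, as a sketch (the program name and log path are assumed), that runs the existing script as the `apps` user with HOME set explicitly; supervisord does not populate HOME or USER for child processes by default, which is a common source of failures that only appear under supervisord:
    ```ini
    [program:prefect-server]
    ; Runs the existing start script as the apps user with its HOME set,
    ; since supervisord does not set HOME/USER for child processes itself.
    command=/opt/apps/fpga-automation/start_prefect.sh
    directory=/opt/apps/fpga-automation
    user=apps
    environment=HOME="/home/apps",USER="apps"
    autostart=true
    autorestart=true
    redirect_stderr=true
    stdout_logfile=/var/log/apps/prefect-server.log
    ```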

  • Nate
    09/04/2025, 9:37 PM
    Hi folks! We've identified and rolled out a fix for the failure to release deployment concurrency slots in Prefect Cloud. The fix should now be in place, but if you're still encountering anything unexpected, please let us know!

  • Owen Boyd
    09/04/2025, 10:25 PM
    I'm seeing this task fail and then not retry - any idea what might be going on or how I can troubleshoot? The failure is likely caused by a downstream resource being overloaded. I'm confused why we are not seeing any retry attempts half an hour later. I suspected that the Prefect run might have crashed, but it's still going according to the Cloud UI. Nothing else that I'm aware of would prevent the retry from happening: I checked task concurrency limits, and we don't have any internal concurrency limits in our code that would prevent a retry here.
    ```
    Task run failed with exception: TaskRunTimeoutError('Scope timed out after 60.0 second(s).') - Retry 1/3 will start 10 second(s) from now 02:29:37 PM
    Finished in state Completed() 02:29:22 PM
    ```

  • Mark Callison
    09/05/2025, 3:00 PM
    I am running into an issue where Deployment Runs are stuck in AwaitingConcurrencySlot even though nothing else is running. I have Concurrency set to 1, and it has always worked as intended before. If I set the Concurrency to 2, they start working again, but that doesn't explain why it thinks something else is already running. Any thoughts?

  • Nick Torba
    09/07/2025, 12:34 AM
    Hello Prefect team, I am having a lot of trouble with one of our ECS push work pools. I commented on a related issue here: https://github.com/PrefectHQ/prefect/issues/18429#issuecomment-3263294213. Each night, we have a wave of ~300 runs get submitted, and they all end up running many hours late because they sit in LATE status. The concurrency limit on the work pool is 20, and the deployments don't have concurrency limits. I don't think it is an issue with the ECS infra: I can't see any logs, so it looks like the runs are never being submitted at all, although eventually they are. It seems like maybe something on the Prefect backend is telling Cloud not to submit them? But I am not sure where to look for this information, in case it is something on our ECS side causing it.

  • Kiran
    09/08/2025, 12:19 PM
    Hey @Marvin, my flow run log shows prefect.flow_runs Finished in state AwaitingRetry('FLOW_FAILURE:OOM:Process exited with code 137 — SIGKILL detected (likely due to Out Of Memory or external kill).', type=SCHEDULED) and then prefect.exceptions.UnfinishedRun: Run is in SCHEDULED state, its result is not available. I am failing my flow with return Failed('FLOW_FAILURE:OOM:Process exited with code 137 — SIGKILL detected (likely due to Out Of Memory or external kill).') when a condition is hit, and I have added retries. Why am I seeing AwaitingRetry and the UnfinishedRun error?

  • Michael Savarese
    09/08/2025, 5:03 PM
    Hey Prefect team, it looks like the implementation for setting an environment_name job variable for a Modal pool was broken by recent Modal updates. You can no longer pass environment_name as an argument during Sandbox creation (deprecated in Modal 1.1.0); this causes any Prefect job with this variable set to crash immediately upon submission. Could you please fix this or remove that job variable from the template? FWIW, this unexpectedly broke all of my Modal flows; it would be great if you all could keep a closer eye on Modal deprecations now that Modal is past 1.0. Since your Modal submission code is closed source, it is hard to fix or debug on my end.

  • Joe D
    09/08/2025, 7:56 PM
    Our `Manage subscription` button under settings/billing is giving a general error message - is it just me?

  • Kiran
    09/09/2025, 10:03 AM
    hey @Marvin, I am running my Prefect flows by creating deployments on ECS using Fargate Spot. I see my deployment ID, deployment name, flow name, and flow run ID on my ECS tasks as tags in the AWS console. How are they getting generated?

  • Miguel Moncada
    09/09/2025, 11:43 AM
    @Marvin how do I set extra pip packages for a managed deployment in the Flow.deploy() function in the Python SDK?
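
    A minimal sketch, assuming a Prefect Managed work pool where extra dependencies are supplied through the pool's `pip_packages` job variable (check the pool's base job template for the exact key); the repo URL, entrypoint, and names below are placeholders:
    ```python
    from prefect import flow

    flow.from_source(
        source="https://github.com/your-org/your-repo.git",  # placeholder repo
        entrypoint="flows/etl.py:etl_flow",                   # placeholder entrypoint
    ).deploy(
        name="managed-etl",
        work_pool_name="my-managed-pool",
        # Extra pip packages installed into the managed runtime before the flow starts.
        job_variables={"pip_packages": ["pandas", "s3fs"]},
    )
    ```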

  • Miguel Moncada
    09/09/2025, 2:34 PM
    @Marvin all my flows are late, enter the Pending state, and then crash with "Flow run infrastructure exited with non-zero status code: Essential container in task exited (Error Code: 1)". I'm using the cloud's managed work pool and execution infrastructure.

  • Shareef Jalloq
    09/09/2025, 3:23 PM
    Hi all, I'm trying to use the `GitRepository` source for my flow deployments, but now realise my mistake, as I need to both `git clone` and `pip install` my package. What's the correct way to clone and install my deployments? The docs don't provide examples for this sort of flow.
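
    One common pattern is to express both steps as `pull` steps in a `prefect.yaml` (a sketch; the repo URL, secret block name, and requirements file are placeholders): the `git_clone` step runs first and the `pip_install_requirements` step installs from the cloned directory.
    ```yaml
    pull:
      - prefect.deployments.steps.git_clone:
          id: clone-step
          repository: https://github.com/your-org/your-repo.git
          branch: main
          access_token: "{{ prefect.blocks.secret.github-token }}"
      - prefect.deployments.steps.pip_install_requirements:
          # Reuses the directory produced by the clone step above.
          directory: "{{ clone-step.directory }}"
          requirements_file: requirements.txt
    ```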

  • David Michael Carter
    09/09/2025, 3:56 PM
    How can I completely block these Prefect logs? I did not opt in to this. Seems to be a behavior change from 2.x to 3.x
    ```
    Finished in state Completed(message=None, type=COMPLETED, result=ResultRecord(metadata=ResultRecordMetadata(...
    ```
    Every time one of my tasks finishes, its outputs are involuntarily printed to the cloud logs. This is unacceptable as some task outputs contain secrets such as auth tokens, and now we have to deal with leaked secrets.
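
    One possible workaround, offered as a sketch rather than a confirmed fix: Prefect reads logging overrides from the file at `PREFECT_LOGGING_SETTINGS_PATH` (default `$PREFECT_HOME/logging.yml`), so copying the default logging config there and raising the task-run logger above INFO should suppress the "Finished in state ..." lines (and their result reprs), at the cost of losing other INFO-level task logs.
    ```yaml
    # Sketch: start from a copy of Prefect's default logging.yml and change only
    # the task-run logger level so state-change messages are no longer emitted.
    loggers:
      prefect.task_runs:
        level: WARNING
    ```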

  • Court
    09/09/2025, 4:48 PM
    Am I the only one having a hard time editing my work pools? They crash when I load the edit work pool page in the browser. I'm using Safari and Brave and I'm getting the problem on both: in Safari the page just loads slowly and doesn't let me finish editing, and in Brave it just crashes.

  • Joe
    09/10/2025, 4:23 AM
    I'm constantly getting this message despite having a version higher than the one mentioned.

  • Kiran
    09/10/2025, 6:19 AM
    hi @Marvin, I am calling some subflows (another deployment) from a main flow/deployment. The main deployment takes tasks from a config file and calls the sub-deployment by iterating over each task. I am triggering 748 tasks, so 748 subflows, but I see more than 748 subflows getting triggered, and for some task names more than one subflow is triggered. Can you explain this behavior? Has it been observed before?
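
    If the subflows are triggered with `run_deployment`, one way to guard against double submission from the parent is an idempotency key per task name, sketched below (the deployment name and key prefix are placeholders); repeated submissions with the same key collapse into a single flow run.
    ```python
    from prefect.deployments import run_deployment

    def trigger_subflows(task_names: list[str]) -> None:
        for task_name in task_names:
            run_deployment(
                name="task-processor/process-task",  # placeholder deployment name
                parameters={"task_name": task_name},
                # One run per task per batch; resubmitting the same key reuses the run.
                idempotency_key=f"nightly-batch-{task_name}",
                timeout=0,  # return immediately instead of waiting for the subflow
            )
    ```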

  • Kiran
    09/10/2025, 7:29 AM
    hi @Marvin, this is from one of my Prefect flow run logs (Sep 3rd, 2025):
    06:27:43 PM  prefect.flow_runs.worker  Worker 'ECSWorker 7b242ba6-2b7d-4a2c-a8b7-0acb33b775b7' submitting flow run '18e89d5d-0b60-410d-ae45-6eb4f60a055e'
    06:48:11 PM  prefect.flow_runs.worker  Retrieving ECS task definition 'arn:aws:ecs:<region>:<account>:task-definition/somegroup:0131'...
    06:48:11 PM  prefect.flow_runs.worker  Ignoring task definition in configuration since task definition ARN is provided on the task run request.
    06:48:11 PM  prefect.flow_runs.worker  Using ECS task definition 'arn:aws:ecs:<region>:<account>:task-definition/somegroup:0131'...
    06:48:12 PM  prefect.flow_runs.worker  Creating ECS task run...
    06:48:13 PM  prefect.flow_runs.worker  Waiting for ECS task run to start...
    06:48:13 PM  prefect.flow_runs.worker  ECS task status is PROVISIONING.
    06:48:34 PM  prefect.flow_runs.worker  ECS task status is PENDING.
    06:49:04 PM  prefect.flow_runs.worker  ECS task status is RUNNING.
    The worker submits the run at 6:27, but the task definition is only retrieved and the ECS task created around 6:48. Why? Is it because I am submitting too many runs?

  • xiaotian lu
    09/10/2025, 7:39 AM
    Hi ~ Is there any way to make the cached task_run show the artifacts of the original task_run on the web UI?

  • Syméon del Marmol
    09/10/2025, 11:42 AM
    Hi @Marvin I'm using background tasks to trigger flows (with sub-flows and tasks/sub-tasks) that might take some time to finish. I have multiple replicas of the backend calling the background tasks, as well as multiple replicas of the workers. I must make the system resilient to restarts (no-downtime deployment, potential crash of workers, etc.). The flow is idempotent, so it can be restarted by another worker after the original one has been stopped. So I need a way to detect flows that are no longer attached to a worker and reschedule them accordingly. What would be the appropriate way to achieve that? I saw the page about zombie flow detection, but my flows stay stuck in Running after I restart my worker container. I asked Marvin and it told me the flow would move to the Crashed state, while it stays Running forever for me. (Using Prefect 3.4.17.) Thanks! 🙂
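
    As a stopgap (a sketch of a workaround, not Prefect's built-in zombie detection), a small janitor could periodically look for runs stuck in Running past a cutoff and force them back to Scheduled so another worker can pick them up; the 30-minute cutoff is arbitrary.
    ```python
    from datetime import datetime, timedelta, timezone

    from prefect import get_client
    from prefect.client.schemas.filters import (
        FlowRunFilter,
        FlowRunFilterState,
        FlowRunFilterStateType,
    )
    from prefect.client.schemas.objects import StateType
    from prefect.states import Scheduled


    async def requeue_stuck_runs(max_age_minutes: int = 30) -> None:
        cutoff = datetime.now(timezone.utc) - timedelta(minutes=max_age_minutes)
        async with get_client() as client:
            running = await client.read_flow_runs(
                flow_run_filter=FlowRunFilter(
                    state=FlowRunFilterState(
                        type=FlowRunFilterStateType(any_=[StateType.RUNNING])
                    )
                )
            )
            for run in running:
                # Heuristic: still Running since before the cutoff -> assume the
                # worker that owned it is gone, and hand the run back to the queue.
                if run.state and run.state.timestamp < cutoff:
                    await client.set_flow_run_state(run.id, state=Scheduled(), force=True)
    ```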

  • Dennis Hinnenkamp
    09/11/2025, 11:04 AM
    @Marvin I deployed a self-hosted Prefect on Google Cloud Kubernetes. The server can't be set up and I get several Postgres errors: prefect-server-postgresql-connection

  • itielOlenick
    09/11/2025, 3:27 PM
    @Marvin we are running a Prefect deployment in k8s and are hitting issues when going over 200 parallel runs. Running the latest version, 3.4.17; also tried 3.4.1. We are using a Postgres database. When we added PgBouncer we configured the pool size to 200 and then 300 and hit SQLAlchemy timeouts. After that we tried increasing the pool size to 2000 and the PgBouncer pool size to 3000; same issues. We are seeing around 700 connections. We also tried connecting directly to the DB: 300 parallel jobs ran, but going beyond 1000 we maxed out our DB, and it's not a small one (64 GB of memory and 8 vCPUs). What is going on?

  • Brandon Robertson
    09/11/2025, 6:45 PM
    @Marvin I tried setting up a block using the Custom Webhook type, when I click the 'Create' button, none of the parameters show up in the UI. Any suggestions?

  • Oleksii Stupak
    09/12/2025, 7:11 AM
    How do I change my profile email in Prefect?

  • Ricardo Gaspar
    09/12/2025, 11:17 AM
    hello there! Does anyone know if there's a final state to signal that a given workflow has warnings? Currently there's essentially `Failed` or `Success` (https://docs.prefect.io/v3/concepts/states#states). I need my data pipelines to be able to signal warnings, but I don't have a proper way to express that via the DAG's final state, which would be ideal for observability. Is there anything to support three final workflow states, like a semaphore:
    • `Failed` 🔴
    • `Warnings` / `Success_with_warnings` 🟡
    • `Success` 🟢
    Or any other monitoring solution within the UI or observability side of Prefect that allows that? CC: @Nathan Nowack @Brendan O'Leary @Anna M Geller

  • PyHannes
    09/12/2025, 11:50 AM
    Is there any information about self-hosting an enterprise version of Prefect Cloud? I'm evaluating Prefect for our company (Infineon) and on-prem is a hard requirement. Any information about how the service and pricing models are designed in this case?

  • Owen Boyd
    09/13/2025, 8:20 PM
    Prefect appears to be hanging on task shutdown (see attached) waiting for the tstate (thread state?) lock to do some telemetry work. Task has been going for 30 min, has a timeout of 60s, and has not been retried per the prefect UI. Is there something I might have done to cause this? Any tips to troubleshoot further?

  • Ishan Anilbhai Koradiya
    09/14/2025, 2:09 PM
    @Marvin my flow runs keep crashing. I do see some errors in the runs, but they should be going to the Failed state, not the Crashed state. I am on Prefect 3.1.15.