prefect-server
  • m

    Michael Ulin

    12/17/2021, 12:56 AM
    I'm trying to run a workflow using the VertexAgent. I enabled the Vertex API on our GCP account, but I got the following error. Do you know how to set these values?
    a
    s
    • 3
    • 21
  • m

    Michael Ulin

    12/17/2021, 12:56 AM
    400 List of found errors:	1.Field: job_spec.worker_pool_specs[0].container_spec.env[6].value; Message: Required field is not set.	2.Field: job_spec.worker_pool_specs[0].container_spec.env[5].value; Message: Required field is not set.	 [field_violations {
      field: "job_spec.worker_pool_specs[0].container_spec.env[6].value"
      description: "Required field is not set."
    }
    field_violations {
      field: "job_spec.worker_pool_specs[0].container_spec.env[5].value"
      description: "Required field is not set."
    }
    ]
  • n

    Nikita Samoylov

    12/17/2021, 11:19 AM
    Hi guys, we used the Prefect Cloud version connected to 4 agents. When we fired a flow, one agent was chosen to execute it, and that worked fine. Yesterday we switched to a self-hosted backend server. Now when we fire a flow: ✅ in the GUI we can see which agent is chosen to execute it, so at this point everything looks OK; ❌ in reality the flow is executed on all 4 workers simultaneously, so we waste a lot of resources. Can someone point out how to debug this, or am I missing something?
    a
    k
    • 3
    • 5
  • d

    Daniel Komisar

    12/17/2021, 1:27 PM
    Hello everyone, is there any information on the data retention policy for prefect server? Thank you.
    a
    k
    • 3
    • 11
  • m

    Michael Ulin

    12/18/2021, 10:43 PM
    Is it possible to use Prefect Orion with Prefect Cloud with Orion's current release? It'd be great to be able to utilize async code in our workflows.
    a
    n
    • 3
    • 4
  • t

    Thomas Opsomer

    12/20/2021, 2:45 PM
    Hi, we are experiencing 2 strange things:
    • flows with 'manual validation' are stuck after being approved; the state stays orange in "Resume"
    • restarting failed flows doesn't work: after clicking the restart button, the wheel keeps spinning and the flow never restarts
    Any ideas on how to overcome these issues?
    a
    • 2
    • 10
  • l

    Lyla

    12/20/2021, 3:15 PM
    Hello everyone, sorry if this is a dumb question. We are trying to make the Prefect agent, running on an EC2 instance in one AWS account, launch an instance, attach an instance profile, and then terminate it in another AWS account. Is that possible with Prefect? We have one role in the first AWS account, but I think that to be able to do this I need another role in the second AWS account, and I'm not sure how to let the Prefect agent know about the role in the second AWS account.
    a
    • 2
    • 3
  • m

    Madison Schott

    12/20/2021, 4:11 PM
    Is anyone else seeing their pipelines error out with this message?
    Flow run is no longer in a running state; the current state is: <Failed: "HTTPSConnectionPool(host='api.prefect.io', port=443): Max retries exceeded with url: / (Caused by ReadTimeoutError("HTTPSConnectionPool(host='api.prefect.io', port=443): Read timed out. (read timeout=15)"))">
    a
    • 2
    • 1
  • s

    Sam Werbalowsky

    12/20/2021, 4:56 PM
    I’m running into an issue after upgrading from
    0.15.4
    to
    0.15.10
    regarding Git storage - running with a kubernetes agent and deployed via helm. The storage is set using environment variables as part of our CI.
    Failed to load and execute Flow's environment: ValueError('Either `repo` or `git_clone_url_secret_name` must be provided')
    The thing is, in the UI I can see the
    repo
    value, as it is created during registration. I am assuming it isn’t getting passed to the PrefectJob pod that gets spun up, but I’m not sure why that is. Any ideas?
    a
    • 2
    • 6
  • p

    Prasanth Kothuri

    12/20/2021, 7:35 PM
    I enabled Slack notifications on Prefect flows. I receive a notification when the state changes, which is perfectly fine; however, the link there points to localhost:8080. How do I set the link properly so that when someone clicks it they go to that task run? Thanks
    http://localhost:8080/default/task-run/bcb01c03-6507-40c3-8c31-c2a0a37a9868
    a
    a
    • 3
    • 5
  • m

    Martin

    12/21/2021, 12:07 AM
    Are there any good references for running Prefect Server in a high-availability configuration on k8s? Our team currently uses the helm chart, and we were wondering about increasing the replicas of some of the services. We initially tried this with prefect-agent, which resulted in flows being run twice. Would the other pods be safe to run with multiple replicas? Thanks!
    a
    d
    • 3
    • 5
  • a

    Ahmed Ezzat

    12/21/2021, 10:06 AM
    Hi, I'm seeing unusual errors while using Prefect with a Dask cluster. Here is more information: https://github.com/PrefectHQ/prefect/issues/5252
    flow.executor = prefect.executors.DaskExecutor(
                cluster_class=lambda: KubeCluster(
                    pod_template=make_pod_spec(
                        memory_request="64M",
                        memory_limit="4G",
                        cpu_request="0.5",
                        cpu_limit="8",
                        threads_per_worker=24,
                        image=prefect.context.image,
                    ),
                    deploy_mode="remote",
                    idle_timeout="0",
                    scheduler_service_wait_timeout="0",
                    env=dict(os.environ)
                    | {
                        "DASK_DISTRIBUTED__WORKER__MULTIPROCESSING_METHOD": "fork",
                        "DASK_DISTRIBUTED__SCHEDULER__ALLOWED_FAILURES": "100",
                    },
                ),
                adapt_kwargs={"minimum": min_workers, "maximum": max_workers},
            )
    a
    • 2
    • 13
  • r

    Raúl Mansilla

    12/21/2021, 10:23 AM
    Hello team, I'm having this error
    ImportError('Unable to import dulwich, please ensure you have installed the git extra')
    but I have installed prefect[gitlab] in prefect server and also in the dask cluster nodes.
    a
    • 2
    • 4
  • s

    Sylvain Hazard

    12/21/2021, 10:59 AM
    Hello ! I'm not really that fluent in Kubernetes so I figured I should ask here : would a RunNamespacedJob task work for launching CronJobs on my cluster ? I understand that does not necessarily make sense but I'd like to know if it would work 🙂
    d
    a
    • 3
    • 5
  • c

    Côme Arvis

    12/21/2021, 6:44 PM
    Hello ! We are currently blocked by the fact that our flows are stuck in the
    scheduled
    state, even though these runs have no labels associated with a concurrency limit. In addition, we do have an agent with the matching label, but nothing is happening. Note that some of the first runs (fewer than 10) were able to execute. Any ideas? 😕
    a
    j
    • 3
    • 5
  • n

    Nikita Samoylov

    12/22/2021, 8:50 AM
    Hi, tried to search answer in history but failed 😞 We have
    Cancel
    and
    Set state
    options in UI for each running flow. • If I press
    Cancel
    - the flow is stuck in Cancelling status forever and, more dangerous for us, the child process on the agent machine that actually executed the flow is stuck too and is never killed, so it does not release resources. I can see 2 processes stuck (as in the picture): 1 for flow execution and 1 for its heartbeat. • Setting
    Failed
    state seems working well, but not if I set state after flow is cancelled. Could you tell me something about this behaviour ? PS: I'm talking about Cloud backend + Local Agent
    a
    • 2
    • 5
  • s

    Sylvain Hazard

    12/22/2021, 11:05 AM
    Hello ! Been trying to wrap my head around running an already existing k8s job from Prefect but I can't seem to figure it out. Here's where I am :
    from typing import Dict, List
    from prefect import Flow, Parameter, task
    import prefect
    from prefect.tasks.kubernetes.job import ListNamespacedJob, RunNamespacedJob
    from kubernetes.client import V1JobList, V1Job
    
    
    @task
    def get_job(jobs_list: V1JobList) -> Dict:
        candidates = jobs_list.items
        job = candidates[0]
        if len(candidates) > 1:
            prefect.context.get("logger").warning(
                f"Multiple candidates retrieved. Chose {job.metadata.name}."
            )
        return job.to_dict()
    
    
    with Flow("Test") as flow:
        jobs = ListNamespacedJob(
            kube_kwargs={"field_selector": "metadata.name=JOB_NAME"},
            kubernetes_api_key_secret=None,
        )()
    
        job = get_job(jobs)
    
        job_result = RunNamespacedJob(
            kubernetes_api_key_secret=None,
            delete_job_after_completion=False,
        )(body=job)
    Right now, this gets a 422 error starting with "Job.batch JOB_NAME is invalid..." from the k8s API when trying to run the job. Am I just doing it wrong ?
    a
    • 2
    • 4
  • p

    Prasanth Kothuri

    12/22/2021, 11:50 AM
    Can I enforce the order of the tasks in a flow? Today I noticed that some tasks were scheduled earlier than expected, even though they are right at the end of my code.
    a
    • 2
    • 2
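A flow in this generation of Prefect is a dependency graph, so run order follows declared dependencies rather than the position of a call in the source file. Prefect 1.x lets a task call inside a `Flow` block take an `upstream_tasks=[...]` argument to declare an ordering without passing data. The sketch below does not use Prefect at all; it only illustrates the underlying idea with the standard-library `graphlib` module (Python 3.9+), and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it must wait for.
dependencies = {
    "notify": {"load"},       # listed first, but it still runs last
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields every task only after all of its upstream tasks.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['extract', 'transform', 'load', 'notify']
```

A task with no declared upstream dependency has no guaranteed position, which matches the behavior described in the question.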
  • j

    Jawaad Mahmood

    12/22/2021, 7:08 PM
    Hello! I'm stuck on something that should be simple, but can't find answer in channel history or docs! I am using prefect.storage.Docker for flow storage inside Docker image. I would like to push this image up to Dockerhub. When I try that using the code below, I get "InterruptedError: unauthorized: authentication required" error. I am able to successfully login to Dockerhub using Docker SDK, but not sure how to pass this authenticated client to prefect.storage.Docker. And I am not sure how I authenticate within prefect.storage.Docker. As suggested by the docs, I am running the Docker daemon locally, and logged into docker in the terminal both of which are configured to push up to the repository. How can I fix? Thanks!
    from prefect import task, Flow
    from prefect.executors import LocalExecutor
    from prefect.run_configs import DockerRun
    from prefect.storage import Docker
    import docker
    
    with Flow("some_flow") as flow:
        do_something
    
        docker_client = docker.DockerClient()
        docker_client.login(username=<env_user>,password=<env_pass>)
    
        flow.storage = Docker(
            registry_url='registry.hub.docker.com/repository/docker/<user>/<repo>'
            ,image_name='<some_flow>'
            ,files={
                <origin path>:<dest path>
            }
            ,python_dependencies = ['pandas','numpy','prefect']
            ,env_vars={
                "PYTHONPATH": "$PYTHONPATH:assets/:root/:data/:image"
            }
            ,base_image='python:3.7.3'
        )
    
        flow.run_config = DockerRun(labels=['my-label']
                                    )
        flow.executor = LocalExecutor()
        flow.register(project_name="some_project")
    a
    • 2
    • 5
  • k

    Kyrylo Zaitsev

    12/23/2021, 10:24 AM
    Hi. I'd like to embed a markdown artifact in my pipeline, but I'm facing the issue with the artifact size limit:
    requests.exceptions.HTTPError: 413 Client Error: Payload Too Large for url: http://0.0.0.0:4200/
    I obtained this markdown by converting an HTML page to .md format. I deployed Prefect using docker-compose; is there any way to increase the GraphQL payload size limit?
    a
    • 2
    • 2
  • s

    Suresh R

    12/23/2021, 10:44 AM
    Hi, I'd like to know how we can manage authentication and role based access in prefect server deployment.
    a
    • 2
    • 1
  • p

    Prasanth Kothuri

    12/23/2021, 2:12 PM
    We run flows every 30 minutes, which means we accumulate a lot of flow runs. Can I specify retention on flow runs? If not, where is this data stored so that I can set up a script to do periodic cleanups?
    a
    • 2
    • 4
  • r

    Raúl Mansilla

    12/23/2021, 2:27 PM
    Hello all, I'm facing an issue that I'm not able to fix easily: RuntimeError: Unable to find any timezone configuration
    a
    k
    • 3
    • 17
  • a

    Alexis Lucido

    12/23/2021, 5:19 PM
    Hello everyone. I am still exploring Prefect's functionalities and would like to use State Handlers to send alerts in case of failure. I was wondering whether we could pass some more arguments, such as email receivers, title and body, to a State Handler signature? I would like to use a yml file that is already configured rather than configuring Prefect Secrets. Thanks a lot, and happy holidays!
    k
    • 2
    • 2
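One common way to give a handler extra arguments is to build it with a factory function, so the returned closure still has the three-argument shape `(obj, old_state, new_state)` that Prefect 1.x state handlers use. The sketch below is a minimal illustration under that assumption: `make_failure_alert_handler` and the `send` callable are hypothetical names, and loading the receivers and title from a yml file is left out.

```python
from typing import Any, Callable, List


def make_failure_alert_handler(
    receivers: List[str],
    title: str,
    send: Callable[[List[str], str, str], None],
) -> Callable[[Any, Any, Any], Any]:
    """Build a state handler that already knows whom to alert and how."""

    def handler(obj: Any, old_state: Any, new_state: Any) -> Any:
        # Prefect 1.x states expose is_failed(); only alert on failures.
        if new_state.is_failed():
            body = f"{obj.name} failed: {new_state.message}"
            send(receivers, title, body)
        # A state handler returns the state that should take effect.
        return new_state

    return handler
```

The returned handler could then be passed in the `state_handlers=[...]` list of a task or flow, with `receivers` and `title` read from whatever configuration file is already in place.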
  • d

    dev

    12/24/2021, 10:48 AM
    Hello everyone. I am using prefect server and local agent to setup. The prefect agent keeps throwing 
    ERROR - agent | Failed to query for ready flow runs
    , even though I am able to run the flows.
    a
    • 2
    • 5
  • a

    Alexis Lucido

    01/03/2022, 4:41 PM
    Hi all, and all the best for the year to come. My Prefect processes running locally generated too many logs, filling up the memory of my on-premise virtual machine. Our Ops team increased the size of the VM memory and I have disabled checkpointing, but several thousand late runs have been registered. I want to delete these late runs, but I guess I need to launch our agent (a local one), which then tries to catch up with all the late runs I am trying to delete through the UI. As a result, deleting the late runs is very slow: only a few are deleted per minute. I should add that we persist the backend database to keep an audit trail, or at least the best one we can have. Is it possible to delete the late runs without launching the agent or trying to catch up with the old ones? Thank you very much in advance. Alexis
    k
    • 2
    • 14
  • y

    Yash

    01/04/2022, 3:44 AM
    Hi everyone, is there any way to get flow run name inside flow while executing or any way to pass flow run name as a parameter for scheduled flow?
    k
    a
    • 3
    • 8
  • e

    Elliot Oram

    01/04/2022, 4:30 PM
    Hi all, when I try to run
    prefect agent local start --api http://prefect-server-url-goes-here:4200
    (replacing the
    prefect-server-url-goes-here
    with the actual server address) I get a connection refused. There is a prefect server running on that machine and I have whitelisted my local IP for port 4200 on that machine. Not too sure how to go about debugging this one so any pointers would be greatly appreciated.
    [2022-01-04 16:27:48,100] INFO - agent | Registering agent...
    Traceback (most recent call last):
      File "/Users/elliotoram/dev/pipeline/venv/lib/python3.9/site-packages/urllib3/connection.py", line 169, in _new_conn
        conn = connection.create_connection(
      File "/Users/elliotoram/dev/pipeline/venv/lib/python3.9/site-packages/urllib3/util/connection.py", line 96, in create_connection
        raise err
      File "/Users/elliotoram/dev/pipeline/venv/lib/python3.9/site-packages/urllib3/util/connection.py", line 86, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 61] Connection refused
    k
    • 2
    • 3
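A "Connection refused" at this point means the TCP connection itself was rejected before any Prefect request was made, so a first debugging step can be to test the port independently of the agent. The sketch below is a generic standard-library check, not a Prefect feature; `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and DNS resolution failures.
        return False
```

Running it once from the server machine against its own address and once from the client machine can help separate "nothing is listening on that interface" from "something between the two machines blocks the port".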
  • s

    Sam Werbalowsky

    01/04/2022, 9:02 PM
    My understanding of the KubernetesRun is that the env passed only applies to the Prefect job pod that spins up. Is there a straightforward way to pass env vars to the Dask worker pods (using Dask Gateway)? I am thinking of using parameters based on the environment during registration, but I'm not sure if that is the best approach.
    k
    • 2
    • 13
  • p

    Prasanth Kothuri

    01/05/2022, 8:58 AM
    I am getting
    [3 January 2022 8:26pm]: [Errno 24] Too many open files
    error , what is the recommended value for open files ?
    a
    • 2
    • 2
a

Anna Geller

01/05/2022, 9:15 AM
It looks like you don't close files after opening them. Do you have some unclosed files in your flows? This post explains the problem
p

Prasanth Kothuri

01/05/2022, 12:13 PM
We read and write from S3 in a Prefect task, nothing apart from that. The defaults were too low (1024), and I increased them for the time being to resolve the issue.
👍 1
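For reference, the per-process open-file limit discussed in this thread can be inspected and raised from Python with the standard-library `resource` module (Unix only). This is a minimal sketch, not a recommended value or a Prefect setting: `raise_open_file_limit` is a hypothetical helper, and raising the limit only treats the symptom if file handles or connections are being leaked.

```python
import resource


def raise_open_file_limit(target: int) -> int:
    """Try to raise the soft open-file limit to `target`; return the soft limit in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, nothing to do
    # An unprivileged process cannot raise the soft limit above the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        pass  # some platforms reject values the kernel considers too large
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling `raise_open_file_limit(4096)` at process start returns whatever soft limit ended up in effect, which may be lower than requested if the hard limit is lower.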