prefect-server
  • c

    Christopher Chong Tau Teng

    12/07/2021, 9:56 AM
    Hi @Anna Geller @Kevin Kho, I am now running the Docker Agent in a container, and I can see new containers being spun up on the host machine for each flow (instead of inside the agent container). However, this Docker Agent inside a container is having trouble pulling images from gcr.io:
    500 Server Error for <http+docker://localhost/v1.41/images/create?tag=v3&fromImage=gcr.io%2Fchristopherchong-mysdev00-id%2Fprefect-flows>: Internal Server Error ("unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: <https://cloud.google.com/container-registry/docs/advanced-authentication>")
    Is there any way we can pass Docker credentials to the Docker Agent or Docker Run? Or is there some other way to authenticate this Docker Agent inside a container so it can pull images from GCR?
    a
    k
    5 replies · 3 participants
  • r

    Romain P

    12/07/2021, 10:42 AM
    Hello everyone, nice to be here with you. I'm currently installing Prefect locally on my server (it is an Arch VM). I am having issues with the deployment, as I can't run:
    prefect server create-tenant
    I've tried with Python 3.8 and 3.9.9. It worked a few days ago, but I had to drop the containers and recreate them. I think it is related to my Docker deployment, as I already have a previous Postgres install on 5432. I'm willing to restart from the docker-compose file. Any help?
    a
    2 replies · 2 participants
  • g

    Guilherme Petris

    12/07/2021, 11:18 AM
    Hi! I was troubleshooting some of the things that I’m trying to run on Prefect, and even with a simple flow I’m getting some errors. I sat down with a consultant who wrote a test script, and we figured out that it fails to run only on my end. I already tried to reinstall everything, but I’m still getting the same issue. Here is the test script; I’m just running this in my IDE:
    from prefect import task, Flow
    from prefect.executors import LocalDaskExecutor
    import time
    
    
    @task
    def extract_reference_data():
        time.sleep(10)
        return 'hej'
    
    
    @task
    def extract_live_data(input):
        time.sleep(10)
        return f'{input}hejdå'
    
    
    @task
    def separate_task():
        time.sleep(10)
        return 'hoppsan'
    
        
    with Flow("Aircraft-ETL",
              executor=LocalDaskExecutor()) as flow:
        reference_data = extract_reference_data()
        live_data = extract_live_data(reference_data)
        separate_task()
    
    flow.run()
    # flow.visualize()
    a
    10 replies · 2 participants
  • s

    Saurabh Indoria

    12/07/2021, 11:38 AM
    One of the flow runs encountered this error. Usually the runs succeed and this was an anomaly. Can someone share some details about this error? How can we prevent this from happening and why didn't it retry?
    Pod prefect-job-94453cb1-2sw9q failed.
    	Container 'flow' state: terminated
    		Exit Code:: 139
    		Reason: Error
    a
    2 replies · 2 participants
  • g

    Gagan Singh Saluja

    12/07/2021, 2:13 PM
    Hi, I am passing a Dockerfile as an argument in Docker Run, in which I copy all my helper files into the current directory, but when I run the flow it says it is unable to import the file.
    a
    5 replies · 2 participants
  • g

    Guilherme Petris

    12/08/2021, 9:52 AM
    Is it possible to create the tasks in a separate file and import everything to run in one flow? The idea is that if something changes, I would be able to see directly in Git which file has changed. I tried this on my end, but the task didn’t run.
    a
    d
    8 replies · 3 participants
  • c

    Christopher Chong Tau Teng

    12/08/2021, 10:18 AM
    Hi @Anna Geller @Kevin Kho, how does one decide how many Docker Agents to run? How many concurrent flow tasks can a Docker Agent handle?
    a
    4 replies · 2 participants
  • s

    Sylvain Hazard

    12/09/2021, 8:41 AM
    Hi! I have searched the docs for an answer but could not find much, so I thought I would ask here. How does the Prefect engine deal with submitted KubernetesRun-based flows that remain pending for some reason? For example, what happens if I try to submit a flow but there aren't enough resources available on my cluster at that moment? From my experience, I can see that those flows get re-submitted and another pod is created after some time, but what happens then? Will both pods run if given the resources? Is there a limit after which the engine kills the flow run because it is unable to run it properly?
    a
    9 replies · 2 participants
  • a

    Adam Everington

    12/09/2021, 10:52 AM
    Is it possible to add an SSL certificate to a self-hosted prefect server on an azure vm?
    a
    k
    8 replies · 3 participants
  • w

    William Clark

    12/09/2021, 6:24 PM
    Hello, when using GitHub based storage is it possible to use other files located on the repo within tasks?
    k
    m
    10 replies · 3 participants
  • s

    Saurabh Indoria

    12/10/2021, 4:36 AM
    Hi, I was looking at custom job templates for Kubernetes run. In the documentation https://docs.prefect.io/orchestration/agents/kubernetes.html#custom-job-template, the link given for the default template is https://github.com/PrefectHQ/prefect/blob/master/src/prefect/agent/kubernetes/job_template.yaml, but https://github.com/PrefectHQ/prefect/blob/master/src/prefect/agent/kubernetes/job_spec.yaml appears to be more like the default file. 1. Can you please confirm which is the default file used by Prefect? 2. If I just have to modify the CPU limits, should I copy job_spec.yaml and modify the CPU section, leaving the rest as it is?
    k
    2 replies · 2 participants
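Editor's sketch for the second question above: one way to change only the CPU section of a Kubernetes Job manifest while leaving everything else untouched. It does not settle which file Prefect uses as its default. The function name `with_cpu_settings`, the example manifest, and the container name `flow` are assumptions for illustration (the pod status quoted earlier in this channel mentions a container called 'flow'); the nesting under `spec.template.spec.containers[].resources` is the standard Kubernetes Job layout.

```python
import copy
from typing import Any, Dict, Optional


def with_cpu_settings(
    job: Dict[str, Any],
    cpu_limit: Optional[str] = None,
    cpu_request: Optional[str] = None,
    container_name: str = "flow",  # assumed container name, adjust to your template
) -> Dict[str, Any]:
    """Return a copy of a Job manifest with CPU settings applied to one container."""
    updated = copy.deepcopy(job)  # never mutate the caller's template
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") != container_name:
            continue
        resources = container.setdefault("resources", {})
        if cpu_limit is not None:
            resources.setdefault("limits", {})["cpu"] = cpu_limit
        if cpu_request is not None:
            resources.setdefault("requests", {})["cpu"] = cpu_request
        return updated
    raise ValueError(f"no container named {container_name!r} in job manifest")


# Illustrative manifest only; it is not a copy of Prefect's template files.
example_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {"containers": [{"name": "flow"}], "restartPolicy": "Never"}
        }
    },
}
patched = with_cpu_settings(example_job, cpu_limit="2", cpu_request="500m")
```

The patched dictionary could then be written out as YAML and used as the custom job template, with the rest of the manifest unchanged.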
  • c

    Christopher Chong Tau Teng

    12/10/2021, 7:54 AM
    @Anna Geller @Kevin Kho is there a CLI command to retrieve the registered tenant, or to check whether a tenant is registered, from Prefect Server? Something like get_available_tenant in the Python lib.
    a
    2 replies · 2 participants
  • j

    jack

    12/10/2021, 7:21 PM
    How can we check whether all flow-run-logs have arrived at prefect cloud? When a flow-run is marked as failed, we wait 10 seconds, then query the flow-run-logs to see what happened. Sometimes 10 seconds isn't long enough of a delay, and instead of getting all of the logs, we only get the first N. Is there some flag we can wait for instead of waiting an arbitrary duration, so that we can be sure to get all the logs?
    a
    k
    11 replies · 3 participants
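Editor's sketch for the question above: in the absence of a known "all logs delivered" flag, a fixed 10-second sleep can be replaced by polling until the number of returned logs stops changing. This is a heuristic, not a guarantee that every log has arrived. `fetch_logs` is a hypothetical callable standing in for whatever query returns the flow-run logs; no Prefect API is used or implied here.

```python
import time
from typing import Callable, List


def wait_for_stable_logs(
    fetch_logs: Callable[[], List[str]],
    poll_interval: float = 2.0,
    stable_polls: int = 3,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the log count is unchanged for `stable_polls` polls or `timeout` passes."""
    deadline = time.monotonic() + timeout
    logs = fetch_logs()
    unchanged = 0
    while unchanged < stable_polls and time.monotonic() < deadline:
        sleep(poll_interval)
        latest = fetch_logs()
        if len(latest) == len(logs):
            unchanged += 1  # nothing new arrived since the previous poll
        else:
            unchanged = 0   # new logs arrived, start counting again
        logs = latest
    return logs
```

The `stable_polls` and `timeout` defaults are arbitrary; a longer stable window trades latency for a lower chance of returning before late logs arrive.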
  • p

    Pierre Monico

    12/11/2021, 2:36 PM
    Does anyone have a solution for the weird Azure Postgres username format (user@host) when used in a connection string? If I replace the @ with %40, the graphql service complains about an invalid interpolation. If I leave it in, the bit after the @ is parsed as the host (including the password etc.) by hasura… Is there any other format I can pass to --postgres-url?
    k
    j
    18 replies · 3 participants
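Editor's sketch of the encoding discussed above: percent-encoding the `@` inside the username so that a standard URL parser sees the intended host. It only demonstrates how the encoded URL parses; it does not address the interpolation error the graphql service raised for `%40`. The function name and all credential and host values are made up for illustration.

```python
from urllib.parse import quote, unquote, urlsplit


def build_postgres_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a connection URL with percent-encoded credentials."""
    # safe="" makes quote() also encode "@", ":" and "/" inside the credentials
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values in the Azure "user@host" style
url = build_postgres_url(
    "prefect@myserver", "s3cret", "myserver.postgres.database.azure.com", 5432, "prefect"
)
parts = urlsplit(url)
decoded_user = unquote(parts.username)  # the parser keeps the username encoded
print(url)
print(parts.hostname, parts.port, decoded_user)
```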
  • s

    Scarlett King

    12/13/2021, 1:22 PM
    Hi guys, has anyone used a ConfigMap to pass env vars to the Kubernetes agent via the Helm chart before? If so, how did you do it?
    a
    9 replies · 2 participants
  • p

    Payam Vaezi

    12/13/2021, 3:09 PM
    For some reason I’m seeing very different memory usage when running a mapped task on local Dask with Prefect Core versus running the same workflow with Prefect Server. With Prefect Core my workflow peaks at about 3GiB of memory, while running in the cloud with Prefect Server it goes above 16GiB and the job gets killed as a result. Any idea what may have caused this discrepancy in memory usage?
    k
    a
    31 replies · 3 participants
  • a

    Aleksandr Liadov

    12/13/2021, 5:11 PM
    Hello guys, does anyone have a problem with displaying logs from custom libs (they use the standard Python logger) in Prefect Cloud? However, if I run my flows using server as the backend, all logs are displayed correctly (both Prefect logs and my custom logs).
    k
    11 replies · 2 participants
  • w

    William Clark

    12/13/2021, 8:15 PM
    Hello again! Is it possible to define the storage within the flow itself? For example:
    k
    a
    7 replies · 3 participants
  • w

    William Clark

    12/13/2021, 8:16 PM
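    import json
    from typing import List

    import s3fs
    from prefect import Flow, task
    from prefect.run_configs import ECSRun


    @task(name="Create Task Definition")
    def create_task(task_info: List):
        """Build and upload a task definition file to S3 from Docker image and tag job parameters.

        Args:
            task_info (List): A list that contains the repository and tag strings

        Returns:
            dict: The updated task definition
        """

        fs = s3fs.S3FileSystem(use_ssl=False)

        # No trailing slash, so the joined paths below do not contain "//"
        bucket_path = 'prefect/task_definitions'

        with fs.open(f'{bucket_path}/model_template.json', 'rb') as f:
            task_definition = json.load(f)

        # Update the template in place; a chained assignment here would
        # replace the whole dict with the image string
        task_definition['containerDefinitions'][0]['image_name'] = task_info[0]

        with fs.open(f'{bucket_path}/{task_info[0]}_model_scoring.json', 'w') as f:
            json.dump(task_definition, f)

        return task_definition

    with Flow(name="ECS Task Defintion to Run Config") as flow:

        # task_info is assumed to be defined elsewhere (not shown in this message)
        task_definition = create_task(task_info)

        # Note: inside the Flow context this is a Task, not the dict returned at run time
        run_config = ECSRun(task_definition=task_definition,
                        run_task_kwargs=dict(cluster="Innovation-Garage-AI-Cluster"))

    flow.run_config = run_config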
  • l

    Liam England

    12/14/2021, 7:23 PM
    Hi folks, I'm trying to stand up a single-node deployment on AWS EC2 using docker-compose to poke around with and share with my team. Wondering if anyone has a sample config.toml or could point me in the right direction for exposing the UI, etc.
    a
    3 replies · 2 participants
  • d

    Daniel Komisar

    12/14/2021, 7:51 PM
    Hi everyone, I am trying to retrieve logs from prefect server. For a single flow run I am seeing log messages print out of order. They are being printed in timestamp order (and I’ve verified the wrongly ordered lines do not have the exact same timestamp). I am guessing the timestamps are created on the server? Is there any way to embed a sequence number in the log messages or a local timestamp from the agent, or do I have to put something in the log message itself? Thanks.
    k
    29 replies · 2 participants
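Editor's sketch of the last option raised above, putting a sequence number into the log message itself. It uses only the standard `logging` module, so the order of emission can be recovered regardless of how timestamps are assigned downstream. `SequenceFilter` and `make_sequenced_logger` are hypothetical names; whether the flow's logger accepts extra filters in a given Prefect setup is an assumption to verify.

```python
import itertools
import logging


class SequenceFilter(logging.Filter):
    """Prefix every record passing through the logger with an increasing counter."""

    def __init__(self) -> None:
        super().__init__()
        self._counter = itertools.count(1)

    def filter(self, record: logging.LogRecord) -> bool:
        record.seq = next(self._counter)
        # The prefix contains no "%" placeholders, so lazy formatting args still work
        record.msg = f"[seq={record.seq}] {record.msg}"
        return True  # never drop the record


def make_sequenced_logger(name: str) -> logging.Logger:
    """Attach the filter to the logger itself; call once per logger to avoid double prefixes."""
    logger = logging.getLogger(name)
    logger.addFilter(SequenceFilter())
    return logger
```

The filter is attached to the logger rather than to a handler so that each record is numbered once even when several handlers are present. The counter is local to one process, so sequence numbers from different processes are not comparable.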
  • m

    Michael Ulin

    12/14/2021, 9:45 PM
    Hi, I'm trying to use GCSResult as a result handler for my flow (we're using a Docker agent). Whenever I set the result handler, I get the error below (and whenever I do not specify a result handler, the flow runs fine). I've installed the Google Python library in the Docker image and set my GCP credentials as a Prefect Secret. Do you have any ideas about the issue here?
    k
    a
    75 replies · 3 participants
  • m

    Michael Ulin

    12/14/2021, 9:45 PM
    Unexpected error: ModuleNotFoundError("No module named 'google'")
    Traceback (most recent call last):
      File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
        new_state = method(self, state, *args, **kwargs)
      File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 926, in get_task_run_state
        result = self.result.write(value, **formatting_kwargs)
      File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 77, in write
        self.gcs_bucket.blob(new.location).upload_from_string(binary_data)
      File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 39, in gcs_bucket
        from prefect.utilities.gcp import get_storage_client
      File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/utilities/gcp.py", line 6, in <module>
        from google.oauth2.service_account import Credentials
    ModuleNotFoundError: No module named 'google'
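Editor's sketch for diagnosing the traceback above: the failing import runs from `/opt/conda/envs/coiled/...`, which suggests the code executed in an environment that may differ from the Docker image where the Google library was installed. A small standard-library check like the following, run in the same place the task runs, reports which modules are not importable there. The function name `missing_modules` is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package itself is absent
            found = False
        if not found:
            missing.append(name)
    return missing


# Example: the module the traceback failed on
print(missing_modules(["google.oauth2.service_account"]))
```

An empty list on the machine that builds the image but a non-empty list inside the running task would point to the two environments being different.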
  • r

    Raúl Mansilla

    12/14/2021, 10:28 PM
    Hello guys!! Does anyone have Prefect Server running in AWS on EC2 with an HA setup (main and backup)? I'm wondering what the best setup would be, using an ALB in front and an RDS database… Thanks!
    k
    a
    7 replies · 3 participants
  • r

    Raúl Mansilla

    12/15/2021, 11:15 AM
    Hi there, I guess not, but is Prefect using Log4j?
    a
    2 replies · 2 participants
  • w

    Will Skelton

    12/15/2021, 5:04 PM
    Hi all, I'm working through the getting-started tutorial but have run into an issue running the first basic flow. I'm trying to run the script in jupyter but the "import prefect" is failing. I went to my prefect virtual python environment and tried to get confirmation of the prefect version I have installed and I get an error when I run "prefect --version" (I guessed that this was how to see if prefect was installed successfully) . Here is the error. Any thoughts on beginning to troubleshoot this? I also tried "prefect server start" and got the same error. Thanks!
    ✅ 1
    k
    a
    4 replies · 3 participants
  • c

    Connor Martin

    12/15/2021, 5:11 PM
    Hey all, I'm probably making this much harder than it should be, but how do I restart a flow in the UI with the exact same parameters without any task result caching? I've turned checkpointing off on all of my tasks, and I've even expanded merge() to allow me to pass kwargs to the Merge constructor so I can turn off checkpointing for that task, but it still picks up in the middle of my flow at the task that failed. I can't have this happen because it requires previously downloaded data referenced in other tasks that gets cleaned up on failure/success. My question is: how can I restart a flow in the UI from the beginning with the exact same parameters without any task result caching?
    k
    8 replies · 2 participants
  • s

    Stéphan Taljaard

    12/15/2021, 8:27 PM
    Hi. Is there a GraphQL query I can do to get the default results location?
    k
    a
    6 replies · 3 participants
  • b

    Bogdan Bliznyuk

    12/16/2021, 7:13 AM
    Hello all! We're trying to add auto-scaling rules for the Prefect agents (e.g. local agent). Is it okay to add/remove Prefect agents dynamically? I can see in the Prefect Cloud UI that agents are not removed from the dashboard after stopping. Also, is it a valid use case for Prefect to run a flow task that spins up another agent and runs a subflow on that agent?
    a
    16 replies · 2 participants
  • p

    Payam Vaezi

    12/16/2021, 1:38 PM
    We have some very highly mapped jobs (around 10 jobs, each with 200,000 mapped tasks), which put strain on Prefect Server during task status updates and make apollo unavailable. Any recommendations to reduce the strain (aside from auto-scaling Prefect Server)?
    a
    2 replies · 2 participants
a

Anna Geller

12/16/2021, 1:47 PM
In general, you could consider migrating to Prefect Cloud, which was built for such scale, as this answer shows. If you need a quicker solution, perhaps you can split this into separate flows so that each handles a smaller number of tasks?
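Editor's sketch of the splitting idea in the reply above: dividing a large list of mapped inputs into fixed-size batches so that each batch can go to its own, smaller flow run. The helper name `chunked` and the batch size of 25,000 are arbitrary choices for illustration; how each batch is handed to a flow run is left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 200,000 inputs split into batches of 25,000 each
batches = list(chunked(range(200_000), 25_000))
print(len(batches), len(batches[0]))
```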
p

Payam Vaezi

12/16/2021, 1:55 PM
Thanks for the response!
🙌 1