prefect-server
  • c

    Charles Leung

    10/26/2020, 5:14 PM
    Hey everyone, has anyone tried fronting prefect ui using nginx? when prefixed with a different url than the root, the assets aren't properly fetched. is there any documentation on how to configure prefect behind a reverse proxy?
    n
    • 2
    • 6
  • c

    Charles Leung

    10/26/2020, 9:08 PM
    Hey guys, in building my prefect flow, is it an option to only have one task execute in Fargate? e.g., ET run using docker Agent, and then the last task runs on Fargate?
    k
    • 2
    • 2
  • h

    Hagai Arad

    10/27/2020, 8:16 PM
    Hello 👋 I’m getting the following error when trying to create an instance of the Client class:
    import prefect
    c = prefect.Client()
    Traceback attached in a comment below. Any ideas what went wrong? Thanks!
    j
    • 2
    • 4
  • i

    Ian

    10/27/2020, 8:17 PM
    Hi Prefect team, I’m evaluating Prefect for use at my company and had a few questions i couldn’t find answers in the docs: 1. Does anyone know of a k8s deployment option for prefect-server? All i can find online is the docker-compose option using
    prefect server start
    2. Is it possible to register an already-built (docker storage) flow? Our product involves deploying to many customer environments that are dynamic, so we would like to be able to build a flow image in CI and deploy it to our customer installations. The question is then, how do you register a flow in an arbitrary number of environments when the build step is before the register step? Does anyone know good patterns to follow here? Thanks!
    j
    • 2
    • 11
  • j

    Jins Kadwood

    10/28/2020, 3:30 AM
    Hi team! I'm exploring the use of Prefect too. I found the overall product proposition very good. However, I was confused by the setup and installation process for AWS ECS (Fargate) while using Prefect Cloud. As I understand it:
    1. Set up Prefect Cloud account ✅
    2. Docs say to set up a Prefect Agent? I tried to follow the guide but was confused about the setup: there is no container image provided for the agent (using AWS Console). How should I go about setting the agent up on Fargate?
    3. How many agents does Prefect require? 1 per job? Or is it more like Gitlab/Github runners, where you can have a few shared/pooled agents to execute the jobs?
    4. Any guidance on the compute size of the agents? I'll be looking to run frequent 10M record ETLs, so I wasn't sure what size the agents need to be to support something like that.
    Apologies for the simple / dumb questions
    c
    b
    • 3
    • 7
  • a

    ale

    10/28/2020, 8:46 AM
    Hi folks, maybe a dumb question but… how can we check Prefect Core version from Prefect UI?
    j
    • 2
    • 4
  • j

    James Cole

    10/28/2020, 9:05 AM
    Hi all! I'm getting started with Prefect. It's very pleasing. I have a question about runs of the same flow overlapping in time. For example, if there are several scheduled flow runs but no agent running, they might start to stack up. If an agent is then started, all the flow runs (for this single flow) will start at the same time. A similar scenario could also come about because a flow takes a long time to run and another starts before it's finished. Obviously I'm writing my flow so that runs don't interfere with other runs, but nevertheless I was wondering if there is a way to make this not happen, i.e. a flow run checks to see if there are other flow runs currently running, and skips itself if that's the case. I can't see that I can do that using State, or by using a different Schedule. Is there an idiomatic way to do this?
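The check described in this question can be sketched as plain logic, independent of any Prefect API. This is only an illustration: `should_skip`, `current_run_id`, and `running_run_ids` are hypothetical names, and how the list of currently running flow runs is obtained is deliberately left out because the thread does not say.

```python
from typing import Iterable


def should_skip(current_run_id: str, running_run_ids: Iterable[str]) -> bool:
    """Return True if some *other* run of the same flow is already running.

    running_run_ids is assumed to hold the IDs of all runs of this flow
    that are currently in a running state (it may include the current run).
    """
    return any(run_id != current_run_id for run_id in running_run_ids)


# The current run sees another run in progress, so it should skip itself.
print(should_skip("run-b", ["run-a", "run-b"]))
# The current run is the only one running, so it should proceed.
print(should_skip("run-a", ["run-a"]))
```

Note that a check like this is not race-free: two runs that start at the same moment can each see the other and both skip, so a real setup would likely need a tie-breaker or a lock.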
    j
    • 2
    • 2
  • j

    Josef Trefil

    10/28/2020, 11:25 PM
    Hi everyone! I got stuck at Scaling Out with Kubernetes tutorial (running Kubernetes on Docker Desktop). When I run my flow from the UI, Prefect Agent gives me
    Deploying flow run ...
    and that's it. When I then check
    kubectl get all
    , the prefect-agent pod and deployment both show
    READY 0/1
    When I check
    kubectl logs deployment/prefect-agent
    the very last line of the traceback says:
    prefect.utilities.exceptions.ClientError: Malformed response received from API.
    Any idea what I'm doing wrong? 🤷‍♂️ Thank you so much for any clues! 🙂
    c
    m
    • 3
    • 13
  • f

    Faris Elghlan

    10/29/2020, 11:05 AM
    Hello, I just started experimenting with Prefect and tried deploying the server on Azure. I think I have everything working now, but I did encounter two issues along the way that I want to share (it could make life easier for future Prefect users). I am using version 0.13.12.
    1. I'm using a managed Postgres database on Azure, which requires an '@' sign in the username (example: "user@server-name"). For the Hasura deployment I had to escape the '@' sign in the postgres connection string (example: "user%40server-name"), but this doesn't work for the GraphQL deployment (using a % sign is not supported); exception: "Error: invalid interpolation syntax in <connection string>". I did find a workaround for this issue: I'm now escaping the % sign for the GraphQL deployment (example: "user%%40server-name"). It would be nice if this could be done automatically. The full connection string now looks like this: "postgresql://<user%%40server-name>:<password>@<server-name>.postgres.database.azure.com:5432/<db-name>?sslmode=require"
    2. After deploying, the "default" tenant was not created. When there are no tenants, clicking on the "Dashboard" button in the UI does not load the dashboard page or produce any error (nothing happens). This was difficult to debug, as I had no idea what was wrong or why nothing happened. Eventually I did manage to find the issue by attempting to create a new project, which returned an HTTP response containing the text: "Variable "$tenantId" of non-null type "UUID!" must not be null". (I fixed it by manually creating a tenant through the Prefect CLI: "prefect server create-tenant ...") It would be nice if there was some error message or other indication of what is wrong when there are no tenants. I am not sure why the "default" tenant was not created. When I started the server for the first time there were some issues with the Postgres extensions (Prefect was unable to create them since it didn't have the right permissions, so I added them manually). Possibly this caused some issues while initializing the DB.
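The two levels of escaping described in item 1 can be sketched in Python. This is an illustration only: `raw_user` is the placeholder username from the message, and the doubling of `%` assumes the value passes through a config layer that treats `%` as an interpolation character (the error text suggests this, but the thread does not confirm the mechanism).

```python
from urllib.parse import quote

raw_user = "user@server-name"  # placeholder username from the message

# Percent-encode reserved characters so '@' cannot be mistaken for the
# user/host separator in the connection URL.
url_safe_user = quote(raw_user, safe="")

# Assumption: the value is later read through a layer where '%' starts an
# interpolation, so each literal '%' has to be doubled.
interpolation_safe_user = url_safe_user.replace("%", "%%")

print(url_safe_user)            # user%40server-name
print(interpolation_safe_user)  # user%%40server-name
```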
    👍 2
  • h

    Henry

    10/29/2020, 4:10 PM
    hello
  • h

    Henry

    10/29/2020, 4:11 PM
    experimenting with prefect and setting it up with local kubernetes setup - was wondering if there was some good examples and documentation to take a look at
    j
    • 2
    • 4
  • h

    Henry

    10/29/2020, 4:11 PM
    i noticed that the prefect local install is designed to be setup from python
  • c

    Charles Leung

    10/29/2020, 9:10 PM
    hey all, when loading up the UI the default graphql endpoint needs to be set. I docker exec'd into the prefect ui container but i don't see a "~/.prefect/config.toml" how should i change this for all my users so that it's readily set when they visit the UI?
    n
    • 2
    • 10
  • m

    Max

    10/30/2020, 6:17 PM
    Hi everyone! I'm trying to setup Prefect with Kubernetes on my local machines. I managed to start all the required pods (
    postgres, towel, hasura, graphql, apollo, ui
    ), but from the logs it appears that something is wrong with either hasura or graphql (the logs are in the thread). What could be the issue? How does one even debug these kinds of errors?
    d
    s
    • 3
    • 11
  • h

    Henry

    11/03/2020, 10:45 PM
    Adding linting to a prefect project and it looks like we're getting
    E0611: No name 'task' in module 'prefect' (no-name-in-module)
    with pylint - has anyone run into this before?
    s
    • 2
    • 3
  • t

    takahashi

    11/04/2020, 6:15 AM
    Hi everyone! I have a question about logs stored in postgres. Daily logs are stored on the postgres server, but if you continue to store them, the amount of data will be enormous. Are there any best practices for managing this log? I'm considering outputting the data for a certain period to a text file and deleting the target record from postgres, but is there any problem?
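The export-then-delete approach proposed in this question can be sketched as follows. This is a rough illustration, not a recommended procedure: it uses an in-memory SQLite database as a stand-in for Postgres so that it runs anywhere, and the table name `log`, its columns, and the helper `archive_old_logs` are assumptions made for the example rather than the confirmed Prefect Server schema.

```python
import os
import sqlite3
import tempfile


def archive_old_logs(conn, cutoff_iso, out_path):
    """Append rows older than cutoff_iso to out_path, then delete them."""
    rows = conn.execute(
        "SELECT id, timestamp, message FROM log "
        "WHERE timestamp < ? ORDER BY timestamp",
        (cutoff_iso,),
    ).fetchall()
    with open(out_path, "a", encoding="utf-8") as fh:
        for row_id, ts, message in rows:
            fh.write(f"{ts}\t{row_id}\t{message}\n")
    # Delete only after the archive file has been written and closed.
    conn.execute("DELETE FROM log WHERE timestamp < ?", (cutoff_iso,))
    conn.commit()
    return len(rows)


# Demo data: one old row and one recent row (ISO strings sort by time).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (id INTEGER PRIMARY KEY, timestamp TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO log (timestamp, message) VALUES (?, ?)",
    [
        ("2020-10-01T00:00:00", "old entry"),
        ("2020-11-03T00:00:00", "recent entry"),
    ],
)

archive_path = os.path.join(tempfile.mkdtemp(), "log_archive.tsv")
archived = archive_old_logs(conn, "2020-11-01T00:00:00", archive_path)
remaining = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(archived, remaining)
```

Writing the file before deleting means a failure during export leaves the database untouched; against a real server it would be worth checking for foreign keys or UI features that depend on the log rows first.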
    n
    • 2
    • 2
  • l

    Lukas N.

    11/05/2020, 2:25 PM
    Hello Prefect team 👋 . I think I have maybe found a bug with resolving Result location template string. I have a setup a hopefully reproducible example with 3 tasks and I think the catch is with having both data and state dependencies.
    n
    • 2
    • 5
  • d

    Dave

    11/07/2020, 3:03 PM
    Hey guys, has anyone experienced one flow running twice in the same execution? It seems to be due to multiple local agents configured at the exact same time, and it happens regardless of whether you trigger the flow manually or it's scheduled. Duplicate runs / duplicate executions. (Check Thread)
    j
    • 2
    • 3
  • s

    simone

    11/09/2020, 11:23 AM
    Hi, I am running Prefect on an HPC with HTCondor using DaskExecutor. Before the flow I start the dask cluster using dask.jobqueue. I went through the documentation, but it is still not clear to me whether I can use the UI to monitor my processes. What I have tested so far: when I start the processing by running a python script that ends with:
    executor = DaskExecutor(address=cluster.scheduler_address)
    ……………………….
    flow_state = flow.run(executor=executor)
    everything runs fine and the mapped functions run in parallel. I would like to use the UI to monitor the processing and make use of the great logging functions. If I run
    flow.register(project_name="test")
    flow.run_agent()
    flow_state = flow.run(executor=executor)
    the process can only be started from the UI (I guess because the run step is not executed). The process runs, but serially, I guess because flow.environment is not set and the default executor is used. If I run
    flow.environment.executor = executor
    flow.register(project_name="test")
    flow.run_agent()
    and start the flow from the UI, the flow starts but crashes with the following error:
    Unexpected error: ConnectionError(MaxRetryError('None: Max retries exceeded with url: /graphql (Caused by None)'))
    I guess that prefect cannot connect to the scheduler of the dask cluster. Can you please let me know if what I am trying is possible or is not implemented? If it is possible can you let me know which approach I should use? Thanks a lot! I also tested the code below as suggested in a thread but I got the same error reported above
    flow.environment = RemoteDaskEnvironment(cluster.scheduler_address)
    flow.register(project_name="test")
    flow.run_agent()
    SOLUTION I solved the issue by starting the agent outside the script.
    prefect agent local start --api http://Apollo_server_IP:4200
  • r

    Roey Brecher

    11/09/2020, 1:29 PM
    Hello, it seems that prefect 0.13.14 has a breaking change. We’re getting this error (we did not update our Prefect Server).
    prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: http://10.0.4.45:4200/graphql
    
    This is likely caused by a poorly formatted GraphQL query or mutation. GraphQL sent:
    
    query {
        mutation($input: create_flow_from_compressed_string_input!) {
                create_flow_from_compressed_string(input: $input) {
                    id
            }
        }
    }
    Is it expected that every minor change will require us to update our Server if we update the clients?
    g
    m
    • 3
    • 4
  • m

    M Taufik

    11/09/2020, 5:49 PM
    Hi, I'm trying to install Prefect Server in a k8s cluster using helm, and the server is up. But when installing the kubernetes agent I'm facing an issue; error log attached. Thank you for checking, appreciate it.
    c
    • 2
    • 9
  • d

    Dave

    11/10/2020, 10:08 AM
    Hi guys, I just found an issue when you change the state of your flow in the UI, e.g. if a flow failed and you change the state to success. After you mark it as success, it gets a new duration time. If you check my screenshot, you can see the job finished at 9:02 AM, I marked it as success around 53 minutes later, and it got a new duration. If it isn't a bug, then please tell me why you chose this approach, thanks.
    j
    • 2
    • 4
  • r

    Roey Brecher

    11/10/2020, 12:51 PM
    Hi, It took me awhile to figure out that
    env_vars
    in the Docker Storage, are variables that are added to the end of my Dockerfile. this should probably be mentioned in the documentation. Since the ENV command is added to the end of the file, you cannot actually pass any variables that are used in the build process itself.
    j
    j
    • 3
    • 6
  • b

    brett

    11/10/2020, 5:56 PM
    Hi all. Having some issues with running multiple copies of the same flow (via FlowRunTask) simultaneously within a larger flow, and wondering if anyone has any advice. I set up a flow that triggers jobs in Azure Data Factory and another flow that executes a stored procedure in MS SQL Server. Each of these requires a parameter to be passed in (ADF pipeline name or stored procedure name). I then create "master" flows that call each of these flows, specifying the parameter within the code. In one specific "master" flow, I call the ADF flow twice to trigger two separate runs, passing different parameters in each. However, it appears as though the ADF flow is only being triggered once. While it looks as if both flows ran independently, closer inspection of the actual task runs reveals the same run ID. Using LocalExecutor with Prefect Server. Perhaps I should run these in series instead of in parallel? Perhaps I should update to StartFlowRun? Appreciate any advice and happy to share more info.
    c
    • 2
    • 3
  • j

    Joseph Haaga

    11/10/2020, 8:48 PM
    If I run
    prefect server create-tenant --name default --slug default
    but I already have a
    default
    tenant, will it overwrite the existing db tables?
    j
    • 2
    • 1
  • j

    Josef Trefil

    11/11/2020, 10:10 AM
    Hi, I'm trying to register this tutorial flow
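    import prefect
    from prefect import Flow, task
    from prefect.environments.storage import Docker
    
    @task
    def hello_task():
        # The logger is provided through the Prefect context at run time.
        logger = prefect.context.get("logger")
        logger.info("Hello, Cloud!")
    
    flow = Flow("hello-flow", tasks=[hello_task])
    flow.storage = Docker()
    flow.register(project_name="myproject")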
    into Prefect running on the server backend + Docker Desktop with Kubernetes v1.19.3. Running the flow works flawlessly, but if I call flow.register() I get this:
  • j

    Josef Trefil

    11/11/2020, 10:10 AM
    Can anyone help? Thank you very much in advance! 🙂
    g
    j
    • 3
    • 3
  • j

    Josef Trefil

    11/11/2020, 10:10 AM
    C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\environments\storage\docker.py:351: UserWarning: This Docker storage object has no `registry_url`, and will not be pushed.
      self._build_image(push=push)
    [2020-11-11 11:09:22+0100] INFO - prefect.Docker | Building the flow's Docker storage...
    Step 1/9 : FROM prefecthq/prefect:0.13.14-python3.7
     ---> 93545f019d66
    Step 2/9 : RUN pip install pip --upgrade
     ---> Using cache
     ---> 04f7a3ff4972
    Step 3/9 : RUN pip show prefect || pip install git+https://github.com/PrefectHQ/prefect.git@0.13.14#egg=prefect[kubernetes]
     ---> Using cache
     ---> 19e43c078e99
    Step 4/9 : RUN pip install wheel
     ---> Using cache
     ---> e42095cd0936
    Step 5/9 : RUN mkdir -p /opt/prefect/
     ---> Using cache
     ---> 926010edf84c
    Step 6/9 : COPY hello-flow.flow /opt/prefect/flows/hello-flow.prefect
     ---> 70517ea739e4
    Step 7/9 : COPY healthcheck.py /opt/prefect/healthcheck.py
     ---> ce25a709678c
    Step 8/9 : ENV PREFECT__USER_CONFIG_PATH=/opt/prefect/config.toml
     ---> Running in 8e564fe71477
    Removing intermediate container 8e564fe71477
     ---> cae7074f0b02
    Step 9/9 : RUN python /opt/prefect/healthcheck.py '["/opt/prefect/flows/hello-flow.prefect"]' '(3, 7)'
     ---> Running in e0df9c491e35
    Beginning health checks...
    System Version check: OK
    Cloudpickle serialization check: OK
    Result check: OK
    Environment dependency check: OK
    All health checks passed.
    Removing intermediate container e0df9c491e35
     ---> e73382d6b699
    Successfully built e73382d6b699
    Successfully tagged hello-flow:2020-11-11t10-09-20-404746-00-00
    Traceback (most recent call last):
      File "C:\Program Files\JetBrains\PyCharm 2020.2\plugins\python\helpers\pydev\pydevd.py", line 1448, in _exec
        pydev_imports.execfile(file, globals, locals)  # execute the script
      File "C:\Program Files\JetBrains\PyCharm 2020.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "C:/Documentos/Coding/_trials/kubernetes/prefect/deployment_tutorial_flow.py", line 13, in <module>
        flow.register(project_name="myproject")
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\core\flow.py", line 1651, in register
        idempotency_key=idempotency_key,
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\client\client.py", line 788, in register
        retry_on_api_error=False,
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\client\client.py", line 281, in graphql
        retry_on_api_error=retry_on_api_error,
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\client\client.py", line 237, in post
        retry_on_api_error=retry_on_api_error,
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\client\client.py", line 413, in _request
        session=session, method=method, url=url, params=params, headers=headers
      File "C:\Documentos\Coding\_trials\kubernetes\venv\lib\site-packages\prefect\client\client.py", line 344, in _send_request
        raise ClientError(msg)
    prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: http://localhost:4200/graphql
    This is likely caused by a poorly formatted GraphQL query or mutation. GraphQL sent:
    query {
        mutation($input: create_flow_from_compressed_string_input!) {
                create_flow_from_compressed_string(input: $input) {
                    id
            }
        }
    }
    variables {
        {"input": {"project_id": "f89dd540-50d6-4abf-86f5-0119fa63bc73", "serialized_flow": "H4sIAFO4q18C/41TTW/bMAz9K4XPs2K3wYb1NmDracfehkHgZFoRIkuGPtIFgf/7SMW10yDFCvgg8VGPfI/0qXIwYPV4V+3QWl/31r9Un+6qdBxLdAzYo0pC+YCCQfE0Z0S1wy5bznLZWoqMEIgrYYgU+/WbWSDuy+VURZv1UkYyULe3CpUnc7nsVDLeiaf58EwYv/E5jTkxMT83Totv7siAcXP8NNENcvJSo8MACTuK9mAjEqCAepe9D2vv59Aej2sogV6ExL0ZpXcyjzEFhEFygMAUMhOmYLRGpjtVvXsj54xEAdbKmJXCGPtsudn9CwR9bnZaOjiANR0kf4sLnTYOxVViFA4PGGSOeIM1YApH2aGFC2UD/JUMGOTEhgWYAcnVNeXNVpRxMbmUVCnSJKRksBHtg2i31cQWYadx8Ys7DugUymUFSo47mODdgC4VfRb+oF1Q2h0gSVAgM4C+2svH++a+qduWvtQ2dfO1puu22X7Zfq4bujfV9H6PF6v20yuwPy5aKUZlJ5V3vdGrCe9TRTL+3N+pnLGTEGVUwYzpYtGKCHn7D4uoaAaL+HMq7Rxn/ldosVibyMPNwa4tM3uZ/mU1Ytz4MW3mRdqUpM2aIGagGDGfX6VfCR8h7T5i0KvX373a048xTf8AKiEd/mkEAAA=", "set_schedule_active": true, "version_group_id": null, "idempotency_key": null}}
    }
    python-BaseException
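The `serialized_flow` value in the request above begins with `H4sI`, which is what gzip data looks like once it is base64-encoded. Assuming the field is JSON that was gzip-compressed and then base64-encoded (the mutation name `create_flow_from_compressed_string` points that way, but the log does not state the exact scheme), a round trip can be sketched with the standard library. `compress` and `decompress` here are illustrative helpers, not Prefect APIs.

```python
import base64
import gzip
import json


def compress(obj) -> str:
    """JSON-encode obj, gzip it, and return the result as base64 text."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decompress(text: str):
    """Reverse compress(): base64-decode, gunzip, and parse the JSON."""
    raw = gzip.decompress(base64.b64decode(text))
    return json.loads(raw.decode("utf-8"))


payload = {"name": "hello-flow"}  # stand-in for a serialized flow
encoded = compress(payload)

print(encoded[:4])          # gzip magic bytes show up as "H4sI" in base64
print(decompress(encoded))  # the original payload
```

Under that assumption, a helper like `decompress` is a quick way to inspect what a client actually sent when debugging a 400 response such as this one.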
  • c

    Carlo

    11/11/2020, 5:49 PM
    Any ideas? Upgrading to 0.13.5,
    Could not upgrade the database
    I was on 0.13.3, killed the processes, pip installed 0.13.5, then restarted
    graphql_1   | Could not upgrade the database!
    graphql_1   | Error: Can't locate revision identified by '24f10aeee83e'
    j
    • 2
    • 7
  • j

    JC Garcia

    11/11/2020, 6:46 PM
    Hi! We are running a POC on kubernetes. I have a flow that we want to run in k8s:
    environment = KubernetesJobEnvironment(job_spec_file=f"{local_dir_path}/job_spec.yaml")
    ...
    with Flow("k8s-example-flow", storage=storage, environment=environment) as flow:
        ...
    
    flow.register(project_name="Test Project")
    However when the job is scheduled via the UI, the k8s job spec does not match the one I generated and filled in the
    job_spec_file
    . I understand that some properties will be overwritten, but no env vars or resource requests/limits are coming through. Any pointers?
    j
    • 2
    • 10
j

josh

11/11/2020, 7:34 PM
Hi @JC Garcia when the kubernetes agent picks up a flow to be run it first creates a
prefect-job
that is responsible for pulling your flow’s environment and then creating the Kubernetes job that you set as the flow’s environment. Are you running into an issue with the initial
prefect-job
? This is actually something we are phasing out with the addition of the new RunConfig pattern (docs aren’t fully written yet and it’s still experimental). reference
j

JC Garcia

11/11/2020, 8:10 PM
Not with the initial job with the k8s agent, but rather with the KubernetesJobEnvironment. We set the
job_spec_file
but it does not use it
j

josh

11/11/2020, 8:11 PM
Hmm could you open an issue on the repo with some more information? I don’t think that code has been touched in a long while 🤔
j

JC Garcia

11/11/2020, 8:12 PM
so what would be the current best-practice for running flows as k8s jobs? is it the run config you mentioned?
j

josh

11/11/2020, 8:15 PM
The current method is still using environments; however, the run config stuff is an experimental new way of doing it. Keep in mind that the KubernetesJobEnvironment isn’t necessary for running flows as k8s jobs unless you need fine-grained control over the job yaml spec (the default LocalEnvironment will still run flows that are shipped off by the kubernetes agent). If you want, you could always try out KubernetesRun (we believe the experience will be better), but it’s still experimental and is subject to change in the future 🙂
j

JC Garcia

11/11/2020, 8:16 PM
got it... yeah I got it running just fine in k8s for a simple flow without specifying the
job_spec_file
I'll dig through the code and see what the issue might be
j

josh

11/11/2020, 8:17 PM
Awesome yeah that sounds like it could be a bug if it’s not respecting the file!
j

JC Garcia

11/11/2020, 8:17 PM
thanks Josh!
@josh It's working fine, a dependency error was not letting the initial k8s job go through