prefect-server
  • l

    Lon Nix

    11/03/2021, 2:47 PM
    I built my own image for using Git storage with ssh by doing what the docs said
    FROM prefecthq/prefect:latest
    RUN apt update && apt install -y openssh-client
    However, now I get this on every run:
    Failed to load and execute Flow's environment: HangupException('Host key verification failed.\r')
    I created a known_hosts file by adding this to the Dockerfile but it didn't help.
    RUN mkdir ~/.ssh && ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
    Any thoughts?
    👀 1
  • s

    Santiago Gonzalez

    11/03/2021, 5:35 PM
    Hi. I have a prefect flow that has a task (among others) that executes several commands on an EC2 instance. The thing is that the task takes about 3 or 4 days to finish, but when it has been running for 12 hours, it crashes because of a timeout. Is this expected behavior? Do Prefect tasks support executions over that time? Is there a way to keep it running?
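    For reference, Prefect 1.x tasks have no timeout by default; a task-level timeout is opt-in via the task's timeout argument (in seconds), so a hard 12-hour cutoff usually comes from the execution layer rather than from Prefect itself. A minimal sketch of an explicit timeout, assuming Prefect 0.15.x (the task body is a placeholder):
    from prefect import Flow, task

    # Prefect tasks run with no timeout unless one is set explicitly.
    # 4 days expressed in seconds, purely illustrative.
    @task(timeout=4 * 24 * 60 * 60)
    def run_commands_on_ec2():
        ...  # long-running work on the EC2 instance

    with Flow("long-running-ec2") as flow:
        run_commands_on_ec2()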
  • l

    Lon Nix

    11/03/2021, 5:56 PM
    In my flow I define this job_template:
    job_template = '''
    apiVersion: batch/v1
    kind: Job
    spec:
      template:
        spec:
          containers:
            - name: flow
              volumeMounts:
                - name: ssh-key
                  readOnly: true
                  mountPath: "/root/.ssh"
          volumes:
            - name: ssh-key
              secret:
                secretName: prefect-ssh-key
                optional: false
                defaultMode: 0600
    '''
    but what shows up for the job is actually
    volumeMounts:
        - mountPath: /root/.ssh
          name: ssh-key
          readOnly: true
      volumes:
      - name: ssh-key
        secret:
          defaultMode: 384
          optional: false
          secretName: prefect-ssh-key
    It's not keeping the same permissions for 
    defaultMode
      and I think that's why I'm getting an error about 
    Failed to add the RSA host key for IP address '140.82.114.4' to the list of known hosts
     It is using the correct secret name though. Why would it not take the correct defaultMode?
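    A side note on the defaultMode above: 384 is simply 0600 written in decimal (Kubernetes stores and echoes the mode as a decimal integer), so the rendered job spec appears to carry the same permissions as the template; the host key error is likely a separate known_hosts issue. A quick check:
    # 0600 (octal) and 384 (decimal) are the same file mode.
    assert 0o600 == 384
    print(oct(384))  # -> 0o600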
  • s

    Santiago Gonzalez

    11/03/2021, 9:46 PM
    Hey. I have an issue. After clicking on a manual step, the flow has been stuck for 40 minutes. Is there anything we can do to wake the process?
  • j

    jack

    11/04/2021, 1:47 AM
    When running flows on Amazon ECS, must the container that runs the flow be based on the PrefectHQ/prefect docker image? I've been attempting to use ECSRun() with S3 storage and a docker image based on amazonlinux:2, but the ECS log shows this error each time:
    /bin/sh: prefect: command not found
    The Dockerfile used to build the image based on amazonlinux:2 runs
    pip install prefect[aws]
    , and when I run the image locally
    prefect
    is in the path.
  • c

    Chris Arderne

    11/04/2021, 10:07 AM
    I have a quick question about naming Flow runs. The docs talk about naming Task runs, and I found some stuff on GitHub about using a
    state_handler
    to name a flow run based on context, but haven't figured out how to name it based on passed parameters. E.g. if I ran a Flow with
    animal=cat
    I'd like that Flow run (as viewed in the UI) to be named
    run-cat
    or something…
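    One pattern for this, assuming a Prefect 1.x release that ships the RenameFlowRun task (prefect.tasks.prefect), is to rename the current run from inside the flow using a Parameter value; a minimal sketch with illustrative names:
    from prefect import Flow, Parameter, task
    from prefect.tasks.prefect import RenameFlowRun

    # RenameFlowRun targets the current run via the flow_run_id in context.
    rename_run = RenameFlowRun()

    @task
    def build_run_name(animal: str) -> str:
        return f"run-{animal}"

    with Flow("rename-from-parameter") as flow:
        animal = Parameter("animal", default="cat")
        rename_run(flow_run_name=build_run_name(animal))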
  • m

    Mike Lev

    11/04/2021, 2:59 PM
    hey once again when executing a
    LocalRun
    with
    LocalExecutor
    how can I do the equivalent of
    sys.path.append
    to the run config working dir? Currently flows are working without a backend, but when I start to run on my server I get an error with
    ModuleNotFound
    currently my structure is as such
    MainProject/
    |-coreLogicModule/
    |-PrefectStack/
        flows/    -need access to coreLogic
        runflows.py
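    One common approach, assuming Prefect 0.15.x and an illustrative install path, is to point LocalRun at the project root and put that root on PYTHONPATH so coreLogicModule resolves when the agent runs the flow; a minimal sketch:
    from prefect import Flow
    from prefect.run_configs import LocalRun

    with Flow("core-logic-flow") as flow:
        ...  # tasks that import coreLogicModule

    # working_dir controls where the agent executes the flow;
    # PYTHONPATH is the run-config equivalent of sys.path.append.
    flow.run_config = LocalRun(
        working_dir="/opt/MainProject",          # illustrative path
        env={"PYTHONPATH": "/opt/MainProject"},
    )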
  • l

    Lawrence Finn

    11/05/2021, 12:09 PM
    Why is prefect moving from a GraphQL API to a REST-based one?
    👀 2
  • l

    Lawrence Finn

    11/05/2021, 12:32 PM
    What would the graphql query look like to filter flow runs on parameters? More generally, how do you filter on a json field?
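    With Prefect Server's Hasura-backed GraphQL API, JSON columns such as flow_run.parameters can usually be filtered with the _contains operator; a sketch using the Python client, assuming the client is already pointed at your server and that the schema matches recent Server releases:
    from prefect import Client

    client = Client()

    # Hasura-style JSONB containment filter on flow_run.parameters.
    query = """
    query {
      flow_run(where: {parameters: {_contains: {animal: "cat"}}}) {
        id
        name
        parameters
      }
    }
    """

    print(client.graphql(query))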
  • e

    ek

    11/05/2021, 3:09 PM
    Is it possible to increase zombie killer from 10 mins to 15 mins for prefect-server?
  • l

    Lawrence Finn

    11/06/2021, 1:07 PM
    I’ve been playing with Orion, running into some issues: 1. lots of SQLite SQLAlchemy errors 2. temp local Dask doesn't work
    🙏 1
  • a

    Adam Everington

    11/08/2021, 4:15 PM
    Hey guys, I'm wanting to provision a Postgres DB in Azure to host my Prefect metadata. What would the minimum requirements be? It seems, based on what it's doing, that 1 vCore (2 GB RAM) + 50 GB storage should be sufficient?
  • l

    Lawrence Finn

    11/09/2021, 1:03 PM
    Another Orion question: I'm now playing with deployments but the runs don't seem to be executing correctly. I'm seeing
    08:01:52.893 | Submitting flow run 'f37fba9f-4280-431d-9f43-889e53192f24'
    08:01:52.894 | Completed submission of flow run 'f37fba9f-4280-431d-9f43-889e53192f24'
    08:01:52.901 | Finished monitoring for late runs.
    08:01:54.187 | Flow run 'f37fba9f-4280-431d-9f43-889e53192f24' exited with exception: KeyError('__main__')
  • a

    Adam Everington

    11/09/2021, 3:00 PM
    Hiya all, I'm trying to run Prefect Server through the provided helm chart, changing the postgres values for a provisioned postgres instance in Azure. My command is as below:
    helm install prefectprod prefecthq/prefect-server -n prod --set agent.enabled=true --set postgresql.postgresqlUsername=admin%40my-server --set postgresql.postgresqlPassword=p%40ssword123 --set postgresql.externalHostname=my-server.postgres.databases.azure.net --set postgresql.useSubChart=false
    when looking at my pods I'm getting the following
  • p

    Prasanth Kothuri

    11/09/2021, 6:30 PM
    Hello all, I am using the Prefect Docker agent; for the flow run containers I need to expose some ports. How can I do this? E.g.
    -p 12001:12001
  • p

    Prasanth Kothuri

    11/09/2021, 6:43 PM
    Is it with
    host_config
    of
    DockerRun
    ?
  • p

    Payam Vaezi

    11/09/2021, 9:32 PM
    After upgrading prefect core and server to
    0.15.7
    from
    0.14.22
    we are seeing this error on registration of a flow to the server. Any idea why this is happening? Has the API contract changed? Also tagging @David Harrington for visibility. Traceback in thread.
  • r

    Ryan Sattler

    11/10/2021, 12:34 AM
    Hi - my previously working kubernetes prefect server setup seems to have broken somehow. When I submit jobs via the UI, the agent logs say that the job has been successfully submitted (
    Completed deployment of flow run
    ), but no job container ever appears in k8s and the UI just hangs forever at “Submitted for execution”. Restarting the agent does not help. Does anyone know how to debug this?
  • e

    Ege Demirel

    11/10/2021, 2:25 PM
    Good morning all! I'm unfortunately facing a constraint where our Infra team is not supporting Docker or Kubernetes in any shape or form. To that end, I think we're pretty much locked into Local Agents. Has anybody had experience working under this constraint and can share it? Especially in environments where we want to have separate requirements per project.
  • p

    Pedro Martins

    11/10/2021, 6:32 PM
    Hey guys! Is there a way for the Task to know the
    flow_run_id
    during runtime? I will run this flow many times and for each run I’d like to save the
    flow_run_id
    associated with it along with other metadata.
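    The flow run id is available from prefect.context inside a running task (the same pattern shows up in the RunSimulationTask snippet further down this page); a minimal sketch, assuming Prefect 1.x with a backend attached:
    import prefect
    from prefect import Flow, task

    @task
    def record_metadata():
        # flow_run_id is populated in context when the flow runs against a backend.
        flow_run_id = prefect.context.get("flow_run_id")
        prefect.context.get("logger").info(f"flow_run_id: {flow_run_id}")
        return flow_run_id

    with Flow("metadata-flow") as flow:
        record_metadata()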
  • p

    Pedro Martins

    11/10/2021, 6:34 PM
    Another question: how can I spin up a Prefect server just to run unit tests? Is there any fixture that does this for me?
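    For many unit tests a backend is not strictly required: flow.run() executes the flow in-process with no server or agent involved. A minimal pytest-style sketch with illustrative names:
    from prefect import Flow, task

    @task
    def add(x, y):
        return x + y

    def test_add_flow():
        with Flow("test-flow") as flow:
            total = add(1, 2)

        # Runs locally; no Prefect server or agent is needed.
        state = flow.run()
        assert state.is_successful()
        assert state.result[total].result == 3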
  • k

    Kelby

    11/11/2021, 2:40 PM
    Good Morning. In Orion (Prefect 2.0a4), I’m starting to test out subflows. I’m noticing that when using a subflow as a future added to a task’s
    wait_for
    argument, Orion seems to ignore the wait_for and processes the tasks even when the subflow fails.
  • j

    jack

    11/11/2021, 5:26 PM
    Anybody else using
    FlowRunView
    to access the logs of a flow run? Sometimes it works for me, but other times I keep seeing stale logs.
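    FlowRunView is a snapshot of the backend taken when the view is created, so previously fetched data can look stale; fetching a fresh view before reading logs is one way around it. A sketch, assuming Prefect 0.15.x and an illustrative run id:
    from prefect.backend import FlowRunView

    flow_run = FlowRunView.from_flow_run_id("<flow-run-id>")  # placeholder id

    # The view is static; get_latest() returns a refreshed copy.
    flow_run = flow_run.get_latest()

    for log in flow_run.get_logs():
        print(log.message)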
  • s

    Sylvain Hazard

    11/12/2021, 7:56 AM
    Hi! Coming in to work today, it seems like our Prefect server is broken and I don't have the slightest clue as to why. We're running the server on Kubernetes with a PostgreSQL database. Logs that seem relevant are in the thread below.
  • s

    Sylvain Hazard

    11/15/2021, 9:47 AM
    Hello there! Is there any clean way to have a different executor configuration for some tasks of a given Flow? The use case is that we have flows with a few tasks being very resource-hungry, and we would like to reduce our overall resource consumption by running those hungry tasks on bigger nodes than the other tasks. I suspect it might not be possible out of the box and would require us to split tasks into different flows and then use a flow of flows, but that does not sound good. We are using a KubernetesRun with a DaskExecutor spawning a dask cluster on our k8s. Thanks!
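    One option that may help here, assuming Prefect 1.x with a DaskExecutor and Dask workers started with matching --resources, is tagging the heavy tasks with dask-resource annotations so they only land on the bigger workers; a minimal sketch:
    from prefect import Flow, task

    # Only Dask workers started with e.g. `dask-worker ... --resources "MEMORY=64"`
    # will be scheduled to run this task.
    @task(tags=["dask-resource:MEMORY=64"])
    def heavy_task():
        ...

    @task
    def light_task():
        ...

    with Flow("mixed-resources") as flow:
        heavy_task()
        light_task()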
  • a

    Aqib Fayyaz

    11/15/2021, 11:33 AM
    I have cloned the repo https://github.com/PrefectHQ/server, installed the dependencies, and then ran the command prefect server start; the Prefect server is running on localhost port 8080. Now how can I register a flow?
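    With a server running, registering usually amounts to pointing the client at the server backend, creating a project, and calling flow.register(); a minimal sketch, assuming Prefect 1.x defaults (UI on 8080, GraphQL API typically on 4200):
    # One-time CLI setup, for reference:
    #   prefect backend server
    #   prefect create project "my-project"
    from prefect import Flow, task

    @task
    def say_hello():
        print("hello")

    with Flow("hello-flow") as flow:
        say_hello()

    # Registers against the local Prefect Server API.
    flow.register(project_name="my-project")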
  • p

    Pedro Martins

    11/15/2021, 11:57 AM
    Hey guys! Can anyone help me address the following error in the Local Agent:
    raise KeyError(
    KeyError: 'Task slug RunSimulationTask-1 is not found in the current Flow. This is usually caused by a mismatch between the flow version stored in the Prefect backend and the flow that was loaded from storage.\n- Did you change the flow without re-registering it?\n- Did you register the flow without updating it in your storage location (if applicable)?'
    simulation_flow.py
        SimulationFlow = Flow(
            "RunSimulationFlowDemo-3",
            storage=Local(stored_as_script=True, path=__file__),
        )
    ----
    simulation_task.py
    class RunSimulationTask(Task):
        def __init__(
            self,
            simulation_run_parameters: SimulationRunParameters,
            simulation_gateways: SimulationGateways,
            factory_scheme_gateways: FactorySchemeGateways,
            map_gtw: MAPDataGatewayBase = None,
            **kwargs,
        ):
            super(RunSimulationTask, self).__init__(name="RunSimulationTask")
            self.simulation_run_parameters = simulation_run_parameters.to_dict()
            self.simulation_gateways = simulation_gateways
            self.factory_scheme_gateways = factory_scheme_gateways
            self.map_gtw = map_gtw or MAPDataGateway()
    
        def run(self) -> str:
            flow_run_id = prefect.context.get("flow_run_id")
            self.logger.info(f"flow_run_id: {flow_run_id}")
    
            simulation_controller = SimulationController(
                **self.factory_scheme_gateways.to_dict(),
                **self.simulation_gateways.to_dict(),
                map_gtw=self.map_gtw,
            )
    
            self.logger.info(f"Run simulation task. [{self.simulation_run_parameters}]")
            simulation_controller.run(
                **self.simulation_run_parameters, flow_run_id=flow_run_id
            )
    
            return flow_run_id
    
    ----
    simulation_manager.py
    
            task = RunSimulationTask(
                simulation_run_parameters=simulation_run_parameters,
                simulation_gateways=self._simulation_gateways,
                factory_scheme_gateways=self._factory_scheme_gateways,
                map_gtw=self._map_gtw,
            )
            SimulationFlow.add_task(task)
    
            SimulationFlow.run_config = self.flow_config.run_config
            SimulationFlow.executor = self.flow_config.executor
            SimulationFlow.state_handlers = self.flow_config.state_handlers
    I’m trying to register and run this flow using the
    client.register()
    and
    client.create_flow_run()
    . One weird thing I noticed is that the Flow is not being saved to my local
    ~/.prefect/flows
    directory. Could that be it?
  • l

    Lukas Brower

    11/15/2021, 3:43 PM
    Hey everyone, I have a quick question around mapped tasks. We have a few mapped tasks whose
    mapped
    field has recently started always showing up as false. Example response for one of our mapped tasks from the interactive API:
    "task": [
          {      
            "id": "ad65f438-062b-4f9d-ba8e-826edce68d74",
            "name": "<task_name>",
            "mapped": false,
            "task_runs": [
              {
                "name": null,
                "created": "2021-11-11T01:08:13.326713+00:00",
                "map_index": -1,
                "version": 2
              },
              {
                "name": "<task_run_0>",
                "created": "2021-11-11T01:31:55.062178+00:00",
                "map_index": 0,
                "version": 3
              },
              {
                "name": "<task_run_1>",
                "created": "2021-11-11T01:31:55.062178+00:00",
                "map_index": 1,
                "version": 3
              }
            ]
          },
    We have some logic which depends on the value of mapped. We could use
    map_index != -1
    in place of this I guess, but we’re trying to determine why the
    mapped
    field is always false despite there being mapped
    task_runs.
  • j

    jack

    11/15/2021, 6:26 PM
    Is it better to create the (ECS) Task Definition ahead of time? When I created 25 flow runs that run on ECS / Fargate, I got this error:
    Failed: "An error occurred (ThrottlingException) when calling the DeregisterTaskDefinition operation (reached max retries: 2): Rate exceeded"
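    One way to avoid per-run Register/DeregisterTaskDefinition calls (and the resulting throttling) is to reuse a pre-registered task definition, assuming a Prefect 1.x version whose ECSRun supports task_definition_arn; the ARN below is illustrative:
    from prefect import Flow
    from prefect.run_configs import ECSRun

    with Flow("ecs-flow") as flow:
        ...  # tasks

    flow.run_config = ECSRun(
        # Illustrative ARN of a task definition registered ahead of time.
        task_definition_arn="arn:aws:ecs:us-east-1:123456789012:task-definition/prefect-flow:3",
    )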
  • r

    Rob Fowler

    11/16/2021, 1:19 AM
    A project Orion question: I seem to recall there was going to be support for wrangling long running tasks? I have changed company and we have applications that are started and run for a day, sometimes 3, but have a range of dependencies, at the moment in some crazy complicated shell scripts that serially start things that can take a long time. I already wrote an app that starts many of them in parallel and monitors the output using filesystem watches, but there is quite a complicated DAG of things to run. Many of them fork and exit and I consider them done when I see a message in a file or a journald _comm watch message. (They are not actually done, but other processes can be run then). Ideally I'd like to express the existing 5000-line shell script as a prefect flow.
Maybe I am thinking about this wrong. Maybe a flow should be just the short-running processes, and the long-running processes are ended when my sentinel logic confirms they are running.

Kevin Kho

11/16/2021, 2:56 AM
Hey @Rob Fowler, congrats on the job change! On the first post, I feel like it can be done in current Prefect. We can have long running tasks. You can construct this as a Prefect flow using the ShellTask, reading the outputs of the ShellTask, and then doing those forks and exits. With regards to Orion, there is nothing there specifically supporting long running task execution. But it may still be a better fit because of the better handling of subflows, allowing you to have more modular code. That 5000-line shell script can likely be broken up much better. On the second comment, the Flow being a short running process, that can also work. On the Prefect side, we don’t care about execution time anyway. Just an idea, but you can maybe also use manual triggers to check on the status of Flows and trigger the continuation from day to day.
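For reference, a minimal sketch of the ShellTask pattern described above, assuming Prefect 0.15.x; the commands and names are placeholders:
from prefect import Flow
from prefect.tasks.shell import ShellTask

# return_all=True returns every line of output rather than only the last;
# stream_output=True forwards output to the Prefect logger as it arrives.
shell = ShellTask(return_all=True, stream_output=True)

with Flow("shell-pipeline") as flow:
    start = shell(command="./start_long_running_job.sh", task_args={"name": "start-job"})
    # Downstream step only runs once the first command has finished.
    shell(command="./check_sentinel.sh", task_args={"name": "check-sentinel"}, upstream_tasks=[start])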

Rob Fowler

11/16/2021, 3:13 AM
Thanks for the advice. I was very used to the flows in 'current' prefect but I would add, I often had people stumble on the way it worked in the context manager. The new system is more pythonic and it seems too easy now 🙂 Ye olde 'ShellTask' never survived more than the prototype and MCS but it was a good basis for lots of things. You are right about native subflows being a great improvement for this shell script replacement. I had many subflows before but they mandated having a full prefect server up and were never as simple as running things in the local command line for tests. The more I think about these long running processes the more I think they can be represented, at the very least, by two graphs, one for starting and one for stopping.
👍 2