  • Matt Alhonte

    4 months ago
    What does it mean when a Task has the Running state but also has an End Time and the Duration isn't ticking up? Example:
    Matt Alhonte
    Kevin Kho
    6 replies
  • Jonathan Mathews

    4 months ago
    Hi, to use GitLab storage, do I have to provide my personal access token, or can I use a repository-specific deploy token? As far as I can see, a personal access token gives read/write access to all of my repositories. Perhaps a project-level access token would work, but I see the docs refer to a personal access token: https://docs.prefect.io/orchestration/flow_config/storage.html#gitlab
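    The replies aren't captured here, but for context: in Prefect 1.x, GitLab storage doesn't take the token directly; it reads it from a Prefect Secret, so the question reduces to which token types GitLab's API will accept for reading the repo. A minimal sketch, where the Secret name, repo path, and flow file are all assumptions:
    ```python
    from prefect import Flow
    from prefect.storage import GitLab

    # All names below are illustrative placeholders.
    storage = GitLab(
        repo="mygroup/myproject",        # <group>/<project> path on GitLab
        path="flows/my_flow.py",         # flow file within the repo
        ref="main",                      # branch or commit to pull from
        access_token_secret="GITLAB_ACCESS_TOKEN",  # Prefect Secret holding the token
    )

    with Flow("gitlab-storage-example", storage=storage) as flow:
        ...
    ```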
    Jonathan Mathews
    Kevin Kho
    5 replies
  • James Phoenix

    4 months ago
    We are mapping over a series of requests to make multiple tasks within the same flow.
    • We want to set the task state to failed/pending for each task after it has run.
    • Then later we want to update the task state to succeeded via the GraphQL/REST API.
    • We are using Prefect Cloud as the API backend.
    • What Python methods/GraphQL should we use to update the state of individual task runs inside of flow runs? (Sketch below.)
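    The thread's answers aren't shown, but Prefect 1.x's Client exposes set_task_run_state, which wraps the set_task_run_states GraphQL mutation. A minimal sketch, where the task-run ID and state version are placeholders:
    ```python
    from prefect.client import Client
    from prefect.engine.state import Success

    client = Client()  # authenticates against Prefect Cloud with the configured key/token

    # Placeholders: task-run IDs come from the API (e.g. a GraphQL query or
    # get_task_run_info), and `version` must match the task run's current
    # state version for the update to be accepted.
    client.set_task_run_state(
        task_run_id="<task-run-id>",
        version=0,
        state=Success(message="Marked succeeded externally"),
    )
    ```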
    James Phoenix
    Anna Geller
    11 replies
  • James Phoenix

    4 months ago
    Or, how can I make a task run after a new blob arrives in a Google Cloud Storage bucket?
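    Again, the replies aren't shown; one common pattern (a sketch, not necessarily the answer given in the thread) is a small Cloud Function on the bucket's finalize event that creates a flow run through the Prefect client. The flow ID and parameter names are placeholders:
    ```python
    from prefect.client import Client

    def on_blob_created(event, context):
        """Background Cloud Function triggered by google.storage.object.finalize."""
        client = Client()
        client.create_flow_run(
            flow_id="<registered-flow-id>",          # placeholder flow ID
            parameters={
                "bucket": event["bucket"],           # bucket the object landed in
                "blob_name": event["name"],          # name of the new object
            },
        )
    ```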
    James Phoenix
    Anna Geller
    2 replies
  • David Evans

    4 months ago
    Hi, I'm looking to set up some Prefect-agents-as-a-service within my organisation. We're planning to use the hosted Prefect Cloud with some AWS ECS agents, and a GitHub Actions pipeline to publish new flows (probably using S3 as storage). This is all greenfield so we're pretty flexible on the details. Right now we're sticking to Prefect 1.x, but if there's a good reason to jump to 2.x I don't think that would be a problem (as long as it's safe/secure, we don't mind the occasional hiccough while it's still in beta).

    Most of that is fine: so far we're just using a local agent deployed on EC2, but I can see there's an ECS agent which we can presumably use easily enough to get any scalability we'll need, and we can use the prefect CLI to push flows from GitHub Actions. But where we're hitting problems is dependency management (both internal code shared between multiple tasks/flows, and external dependencies). From what I've seen, Prefect doesn't really support this at all (flows are expected to be self-contained single files), the implication being that the agent itself has to have any shared dependencies pre-installed - which in our case would mean that any significant change requires re-building and re-deploying the agent image: a slow process, and not very practical if we have long-lived tasks or multiple people testing different flows at the same time. I tried looking around for Python bundlers and found stickytape, but that seems a bit too rough-and-ready for any real use.

    This seems to be a bit of a known problem: 1, 2 and specifically I see:

        V2 supports virtual and conda environment specification per flow run which should help some

    And I found some documentation for this (which seems to tie it to the new concept of deployments), but I'm still a bit confused on the details (rough sketch below):
    • would the idea be to create a deployment for every version of every flow we push? Will we need to somehow tidy up the old deployments ourselves?
    • can deployments be given other internal files (i.e. common internal code), or is it limited to just external dependencies? Relatedly, do deployments live on the server or in the configured Storage?
    • is there any way to use zipapp bundles?
    • ideally we want engineers to be able to run flows in 3 ways: entirely locally; on a remote runner triggered from their local machine (with local code, including their latest local dependencies); and entirely remotely (pushed to the cloud server via an automated pipeline and triggered or scheduled - basically "push to production"). I'm not clear on how I should be thinking about deployments vs flows to make these 3 options a reality.
    I also wonder if I'm going down a complete rabbit hole and there is an easier way to do all of this?
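    For reference on the per-flow environments quoted above: at the time of this thread, the 2.0 beta expressed this as a DeploymentSpec plus a flow runner. That API was still changing, so treat this as a rough sketch only; the paths and names are assumptions:
    ```python
    from prefect.deployments import DeploymentSpec
    from prefect.flow_runners import SubprocessFlowRunner

    # Rough sketch against the Prefect 2.0 beta API: one spec per flow,
    # with the run executed inside a pre-built virtual environment.
    DeploymentSpec(
        name="etl-prod",                  # placeholder deployment name
        flow_location="./flows/etl.py",   # placeholder path to the flow file
        flow_runner=SubprocessFlowRunner(virtualenv="/opt/venvs/etl"),
    )
    ```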
    David Evans
    Anna Geller
    9 replies
  • David Evans

    4 months ago
    🧵 for dependency questions from above: Ideally we're looking to keep our flows minimal, so virtual environments would be preferable over full docker images, but that's not a strict requirement; if docker images will get us what we need, we can work with that. Presumably we'd have to create a docker image for each flow (based on prefecthq/prefect? or would it just need python:3?), and I guess these docker images would run pip install -r requirements.txt as a build layer. But if we can achieve this with a virtual environment instead, I think that would be preferable (I'm thinking in terms of the flow needed for an engineer to try something out by pushing it to the runner from their local machine). I can see the high-level concept here, but I'm struggling to see how it will look in practice for the various use-cases.
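    For the Docker route in 1.x specifically (a sketch of the mechanism, not necessarily what the thread settled on): Docker storage can bake both the external requirements and shared internal code into a per-flow image, defaulting to a prefecthq/prefect base. The registry, file paths, and env vars below are assumptions:
    ```python
    from prefect.storage import Docker

    storage = Docker(
        registry_url="123456789.dkr.ecr.us-east-1.amazonaws.com",  # hypothetical ECR registry
        base_image="prefecthq/prefect:latest",        # or a python:3 image with prefect installed
        python_dependencies=["pandas", "requests"],   # external deps, pip-installed at build time
        files={"/repo/shared/utils.py": "/modules/utils.py"},  # internal shared code copied in
        env_vars={"PYTHONPATH": "$PYTHONPATH:/modules"},        # make the shared code importable
    )
    ```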
    David Evans
    Anna Geller
    11 replies
  • davzucky

    4 months ago
    Did anyone try to use Yugabyte instead of Postgres with Prefect Server 1.x? Looking to get more resiliency and hot-hot availability.
    davzucky
    Anna Geller
    8 replies
  • Baris Cekic

    4 months ago
    hey @here, what is the best practice to generate flows dynamically at runtime and then register them with the server? We are using an in-house platform for designing ETL logic, but then, to orchestrate the tasks in it, we want to programmatically generate the flows.
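    Not from the thread, but as a sketch of the general pattern: a 1.x Flow is a plain Python object, so it can be assembled from an in-house ETL spec at runtime and registered with Flow.register. The spec format and task below are invented for illustration:
    ```python
    from prefect import Flow, task

    @task
    def run_step(name: str):
        print(f"running {name}")  # stand-in for a real ETL step

    def build_flow(spec: dict) -> Flow:
        # `spec` is a hypothetical in-house ETL description
        with Flow(spec["name"]) as flow:
            for step in spec["steps"]:
                run_step(step)  # each call adds a copy of the task to the flow
        return flow

    flow = build_flow({"name": "generated-etl", "steps": ["extract", "transform", "load"]})
    flow.register(project_name="etl")  # registers the generated flow with Server/Cloud
    ```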
    Baris Cekic
    Anna Geller
    10 replies
  • Chris Reuter

    4 months ago
  • xyzz

    4 months ago
    If you set up storage with Cloud connected, where are the secrets stored (e.g. the AWS secret access key)? On the local machine or in the cloud?
    xyzz
    Anna Geller
    +1
    6 replies