prefect-community

    Marwan Sarieddine

    05/27/2020, 2:02 PM
    Question regarding the Kubernetes agent - is there a programmatic way to “remove”/“de-register” an agent from Cloud? I would have expected to be able to do so using the prefect agent command in the CLI …

    Darragh

    05/27/2020, 3:43 PM
    Question regarding the level of granularity for Flows/Tasks - we’re moving our Flows to FargateTaskEnvironment, and we’re looking at using Prefect task mapping to split out partitions of work. I know these can be split out by tricks like using the Docker-based task, but those tasks would still be part of the Fargate Task that created them - ideally we want to be able to trigger these as either subflows [don’t think that’s available yet?] or as a Fargate Task in its own right. Do you know if anyone has done this? We want to utilise the map/reduce as much as possible, and to get the best use out of the semi-serverless aspect of Fargate, but right now the only way I can think of is essentially manually overriding a @task definition to be a @MyFargateTask. Is there any other method that users have tried?

    itay livni

    05/27/2020, 5:04 PM
    Hi - What is the suggested target pattern for a task that is called twice in the same flow but without mapping? I am currently mucking around with the pattern below and thought of using tags to differentiate them.
    duplicate_task_target = "{parameters[<A FLOW_PARAM>]}/{task_name}-{???}"
    But that would mean keeping track of duplicate tasks ... which is burdensome when working on multiple flows that then get updated. Any other thoughts? Aside: I think this goes to the param-based targeting mentioned in other threads.
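    A minimal sketch of one way to differentiate the two calls, assuming Prefect 0.x's task_args keyword (which lets a call-site override task attributes such as target); the flow, task, and target strings here are made up:
    from prefect import task, Flow
    from prefect.engine.results import LocalResult

    @task
    def transform(x):
        return x * 2

    with Flow("duplicate-target-example", result=LocalResult(dir="~/prefect_results")) as flow:
        # Each call copies the task with its own target template,
        # so the two invocations cache to different files.
        first = transform(1, task_args={"target": "{flow_name}/transform-first"})
        second = transform(2, task_args={"target": "{flow_name}/transform-second"})

    state = flow.run()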

    Alex Welch

    05/27/2020, 5:08 PM
    when using the docker container, is there a way to not jump into python when it is launched?

    itay livni

    05/27/2020, 6:55 PM
    Hi - I'd like to confirm that using task_run_id as a target is a bug in 0.11.3:
    from prefect import task, Flow
    from prefect.engine.results import LocalResult

    lcl_res = LocalResult(dir="~/prefect_guide/results/{flow_name}")

    # Target templated on task_run_id -- the template that appears broken
    @task(target="{task_name}/{task_run_id}")
    def return_list():
        return [1, 2, 3]

    # Mapped task targeted on map_index, for comparison
    @task(target="{task_name}/{map_index}.prefect")
    def mapped_task(x):
        return x + 1

    with Flow("blah", result=lcl_res) as flow:
        mapped_task.map(return_list)

    st = flow.run()
    flow.visualize(flow_state=st)

    Adam Roderick

    05/27/2020, 7:29 PM
    Does Docker storage attempt to authenticate with the docker registry? I'm getting a "no basic auth credentials" error from flow.register()
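    For context, a minimal sketch of the kind of setup involved, assuming Prefect 0.x's Docker storage (the registry URL and image names are placeholders); flow.register() builds the image and pushes it to registry_url using whatever credentials the local Docker daemon already has (e.g. from a prior docker login), which is typically where "no basic auth credentials" surfaces:
    from prefect import Flow
    from prefect.environments.storage import Docker

    storage = Docker(
        registry_url="123456789012.dkr.ecr.us-east-1.amazonaws.com",  # placeholder registry
        image_name="my-flow",
        image_tag="latest",
    )

    with Flow("docker-storage-example", storage=storage) as flow:
        pass

    # flow.register() triggers the storage build, i.e. the docker build + push step.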

    Adam Roderick

    05/28/2020, 4:51 AM
    Hi, I'm trying to work through flow registration with cloud using Docker storage

    Ivan Shumilin

    05/28/2020, 5:22 AM
    Hello! Does Prefect have a notion of priority, something like priority_weight in Airflow, so I could prioritize some flow runs?

    Arsenii

    05/28/2020, 6:09 AM
    Where can I submit small Cloud UI-related bugs like this? Clicking on Cloud Hooks here opens https://docs.prefect.io/orchestration/concepts/cloud-hooks.html while the correct URL is https://docs.prefect.io/orchestration/concepts/cloud_hooks.html (notice the _ instead of the -)

    Rafal

    05/28/2020, 1:40 PM
    Do you know where this error comes from?

    Marwan Sarieddine

    05/28/2020, 3:46 PM
    Hello everyone - I think I found two issues in prefect agent install kubernetes:
    • --namespace only applies to the RBAC in the generated manifest but not to the deployment - i.e. the agent always gets deployed to the default namespace
    • Can’t specify a --name in the CLI call, so the agent’s name always gets registered as agent - apparently this is done in prefect agent start - I stand corrected

    Michael Reeves

    05/28/2020, 4:18 PM
    Hi all, I've started using Prefect and I think it's a great API for managing workflows (I see it as a human-friendly wrapper around Dask orchestration). Thanks for the excellent documentation and a super useful tool! I'm trying to determine how best to use Prefect within an orchestration workflow design. The workflow is essentially three generic steps:
    1. Spin up various services (various web servers and processes)
    2. Manage these services: state, variables/properties, tasks (Prefect seems to fit this, although I may need to find a good way to store state/data)
    3. Detect failures/errors in services and trigger redundancy/recovery tasks
    I am torn on how to use Prefect with 1 and 3. I don't think using Prefect alone is the best solution, but I also think Prefect (Dask scheduler/executor) is a good solution for starting services (1) and handling triggers (3). However, I'm not sure Prefect is ideal for active monitoring. Is there a Prefect use case for monitoring log files/services, or is there a better tool for this job? I didn't see any mention of monitoring within Prefect Core. I'm not talking about monitoring specific Prefect tasks - obviously Prefect should monitor its own tasks - but rather monitoring errors/issues in the services themselves. If Prefect isn't ideal for active monitoring of services, I envision using something like Prometheus to watch data/logs from the services and trigger redundancy/recovery tasks in Prefect if Prometheus alerts or detects failures/issues. Are there any ideas for a better integration of Prefect into this workflow?

    Geoffrey Gross

    05/28/2020, 8:44 PM
    Hey everyone, got a question. I am currently working on building a flow via the Docker storage class. Right now my flow imports env vars from a config.py (found in the same directory as the file that holds my flow), which uses os.environ. I would like to set those environment variables at run time, but when the healthchecks run and deserialize my flow in the docker build step, they try to evaluate those environment variables. This causes an error since I don't have defaults set. Does anyone have any ideas of how to handle this without ignoring healthchecks or putting environment variables into the Dockerfile that gets generated?
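    A minimal sketch of one way around this, under the assumption that the values are only needed at run time: read os.environ inside the task body rather than at module import, so the build-time healthcheck can deserialize the flow without the variables being set (the variable and task names here are made up):
    import os

    from prefect import task, Flow

    @task
    def fetch_data():
        # Evaluated when the flow run executes in the container,
        # not when the module is imported during the Docker healthcheck.
        api_key = os.environ["MY_API_KEY"]  # hypothetical variable name
        return len(api_key)

    with Flow("lazy-env-example") as flow:
        fetch_data()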

    Scott Zelenka

    05/28/2020, 10:29 PM
    When using a KubernetesExecutor, is it possible to have dynamic resource allocation? We can currently use the job_spec file to specify the K8s resources. But we have a use case where, in most situations, the volume of work expected from our Flow fits within the specified resources, but occasionally an input Parameter to the Flow is such that it requires more resources. We're getting by today by setting the resources to the maximum value, but (in theory) that restricts the number of concurrent Flows we can execute in the same environment. Curious if there's an ability to adjust the memory resources required in the job_spec based on the value of an input Parameter to the FlowRun?

    Darragh

    05/29/2020, 10:22 AM
    Morning all! Curiosity question, anyone tried cross cloud platform usage? I.e. host Prefect server and agent on AWS and run a flow on Digital Ocean?

    Matthias

    05/29/2020, 12:39 PM
    Hey 🙂 I am using a reverse proxy for accessing the UI, so my UI is accessible at https://domain.local/prefect. Is there a setting to tell the UI to load the CSS from https://domain.local/prefect/css instead of trying to get it from https://domain.local/css?

    Chris Hart

    05/29/2020, 1:28 PM
    https://news.ycombinator.com/item?id=23349507 a thread rn that looks like a nice place to promote Prefect

    itay livni

    05/29/2020, 1:55 PM
    Hi - I have two flows that I am trying to merge into one using update. The flows are located in separate modules and have some duplicate parameters. Unfortunately the example pattern for joining flows with duplicate parameters only compiles if all the duplicated parameters are removed from their respective modules, which makes multiple-flow development difficult. Is there another pattern or solution where flows can be developed independently and then merged? --thanks https://stackoverflow.com/questions/60679595/how-does-one-update-a-prefect-flow-with-duplicate-parameters
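    For reference, a minimal sketch of the kind of merge being described, assuming Prefect 0.x's Flow.update; here the duplicate-parameter issue is sidestepped by defining the shared Parameter once (in a real project it would live in a common module imported by both flow modules). All names below are made up:
    from prefect import Flow, Parameter, task

    # Defined once and shared, so both flows reference the same Parameter object.
    run_date = Parameter("run_date", default="2020-05-29")

    @task
    def extract(run_date):
        return f"extracted for {run_date}"

    @task
    def load(run_date):
        return f"loaded for {run_date}"

    with Flow("extract-flow") as extract_flow:
        extract(run_date)

    with Flow("load-flow") as load_flow:
        load(run_date)

    # Merge the second flow's tasks and edges into a copy of the first.
    merged = extract_flow.copy()
    merged.update(load_flow)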

    Matthias

    05/29/2020, 2:00 PM
    I have two more questions running Prefect locally:
    • Is it possible to edit a schedule from the UI?
    • Is it possible to give tasks a name when they are called (not when the task class is instantiated) so I can distinguish them in the UI? (See the sketch below.)
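    A minimal sketch of the second point, assuming Prefect 0.x's task_args keyword, which lets a call-site override attributes such as the task's name (the task and flow names here are made up):
    from prefect import task, Flow

    @task
    def load(table):
        return f"loaded {table}"

    with Flow("per-call-names") as flow:
        # Each call copies the task with a distinct display name,
        # so the two runs can be told apart in the UI.
        load("users", task_args={"name": "load-users"})
        load("orders", task_args={"name": "load-orders"})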

    Noah Nethery

    05/29/2020, 2:55 PM
    Hello Prefect community, I’m using a K8s agent with a Dask K8s environment and I’m finding that about a minute passes when a run goes from “Scheduled” to “Running.” Are there any ways to speed up scheduling? Maybe by configuring the agent?

    Dan DiPasquo

    05/29/2020, 3:51 PM
    When two Flow runs share the same name, it appears that getting logs via the prefect CLI shows only the logs from the more recent run - is that right? Is there a way to access the logs from the earlier run with the same name via the CLI?

    Hassan Javeed

    05/29/2020, 4:21 PM
    Hi All, is there a way to configure the 'Flow Run name' for a flow that's set to run on a schedule?

    Will Milner

    05/29/2020, 6:05 PM
    Is it possible to trigger a flow run after a different flow has finished?

    Gridcellcoder

    05/29/2020, 7:50 PM
    Hi Team, how do I stop the Prefect docker images (prefecthq/ui:0.11.2, prefecthq/apollo:0.11.2, etc.) from starting on boot, i.e. uninstall them?

    Avi A

    05/30/2020, 10:56 AM
    Dask Scheduler: I’m using a LocalDaskExecutor with the default scheduler="threads", but it only runs one task at a time (mapped tasks). Am I missing some extra argument that allows tasks to run concurrently?
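    A minimal sketch of the setup in question, assuming Prefect 0.x's LocalDaskExecutor, with num_workers passed through to the local Dask scheduler (the task and flow names are made up):
    from prefect import task, Flow
    from prefect.engine.executors import LocalDaskExecutor

    @task
    def add_one(x):
        return x + 1

    with Flow("local-dask-example") as flow:
        add_one.map(list(range(10)))

    # An explicit worker count so mapped tasks can run concurrently
    # on the threaded scheduler.
    flow.run(executor=LocalDaskExecutor(scheduler="threads", num_workers=8))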

    Pedro Machado

    05/30/2020, 5:39 PM
    Hi everyone. I am trying to get a better understanding of mapped tasks and the memory implications/best practices when using a large number of them. I have a pipeline that:
    1) queries a DB for a list of app IDs - it usually gets about 25k
    2) calls an API for each app N times, where N is currently 14 different countries; the JSON response is not too big for each app - a single dictionary with several columns
    3) combines and stores the output of all tasks in S3
    This is currently implemented in Airflow with one branch per country. Each branch queries the API repeatedly for each app and stores the results in a single file for each country. If the task fails, all apps need to be reprocessed. What I am wondering is:
    • If I create a mapped task that gets the list of apps + countries, does it create all 350k (14 * 25k) child tasks in memory at once and put them in some sort of queue, or are they lazily created?
    • I suppose that if I did nothing special regarding caching the results to an external system like S3, it would hold all the data in memory until it gets to the reducer task that dumps the output to a file. This may require a lot of memory because the reducer won't start until all children finish. Correct?
    • Would this be alleviated if I use caching to S3? Would the memory be released once each task's results are persisted to S3?
    • Each child task's output would be pretty small, and it seems that having that many S3 files with a little data in each is not great. Would you recommend that instead of having the child task process a single app, it processes a small batch of apps, say 50? (See the sketch below.)
    • I suppose there is no garbage collection on persisted results. Is the recommendation to use S3 lifecycle rules to clear old task outputs?
    Thanks!
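    A minimal sketch of the batching idea from the fourth point, assuming plain Python chunking before the map (the task names and batch size are made up):
    from prefect import task, Flow

    @task
    def get_app_ids():
        return list(range(25_000))  # stand-in for the DB query

    @task
    def make_batches(app_ids, size=50):
        # One child task per batch of ~50 apps instead of one per app,
        # which cuts the number of mapped children (and tiny S3 objects).
        return [app_ids[i:i + size] for i in range(0, len(app_ids), size)]

    @task
    def process_batch(batch):
        return [app_id * 2 for app_id in batch]  # stand-in for the API calls

    with Flow("batched-mapping") as flow:
        batches = make_batches(get_app_ids())
        process_batch.map(batches)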

    Avi A

    05/30/2020, 10:00 PM
    Hey community! I’m having a problem with LocalDaskExecutor. I keep getting the following error messages, which are probably related:
    Error message: can't start new thread
    Error message: 'DummyProcess' object has no attribute 'terminate'
    BlockingIOError: [Errno 11] Resource temporarily unavailable

    jars

    05/30/2020, 10:43 PM
    Quick question - the docs (https://docs.prefect.io/api/latest/environments/storage.html#docker) say the Docker parameter secrets (List[str], optional) is "a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials." It's not clear how I can access these secrets inside my flow. I've tried exploring the prefect.context object but can't seem to find anything. Any examples or guidance?
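    For illustration, a minimal sketch of reading such a secret inside a task with Prefect 0.x's Secret class (the secret name is a made-up placeholder); Secret.get() is the usual accessor, and the names listed in the storage's secrets kwarg are the ones populated into prefect.context for the flow run:
    import prefect
    from prefect import task, Flow
    from prefect.client import Secret

    @task
    def use_credentials():
        # Resolved at run time; MY_API_TOKEN is a placeholder secret name.
        token = Secret("MY_API_TOKEN").get()
        prefect.context.get("logger").info("token has length %d", len(token))

    with Flow("secret-example") as flow:
        use_credentials()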

Sumant Agnihotri

05/30/2020, 11:55 PM
Hi all. I'm new to Prefect (and to development) and have a question. I understand that one can use Prefect to set up flows that run tasks in a particular order, among other things. Does it also implement queueing? So far in my projects I've only used Redis + Celery to run big tasks in a queue. PS, I've mostly worked in web dev.

Jeremiah

05/30/2020, 11:56 PM
Hi @Sumant Agnihotri, welcome! Prefect is not a queueing framework, in that it doesn’t offer you fine-grained control over its queueing mechanisms. It attempts to run work as close to its scheduled time as possible. However, work is queued in the sense that if you don’t have workers or resources available to handle a scheduled job, it will not be lost; once resources become available, your agents can pick it up.
So if the semantic you’re looking for is not losing work by maintaining a queue of tasks, Prefect will work for you. If you’re looking to build a queueing system with direct control of the queue itself, you may prefer a different framework (or to use your own queue to trigger a Prefect job via API)

Sumant Agnihotri

05/31/2020, 4:54 AM
@Jeremiah Thanks a lot for the detailed reply. Yeah, I don't need direct control of the queue. I have a web app with a flow as follows: downloading data from the user, running it through a bunch of machine learning models, and then generating graphs and stuff. The problem I'm facing right now is that if more than 5 instances of this flow run concurrently, it crashes the server. I think Prefect can help me out here.

Jeremiah

05/31/2020, 3:30 PM
Hi Sumant, yes, I think Prefect can help. In the near future, we’ll be adding more features around limiting flow concurrency through the API, which would help you even further in this instance (you could, for example, only allow 1 flow to run at a time)