  • Christopher Stokes · 2 years ago
    this is on the most recent release
  • Christopher Stokes · 2 years ago
    is that intended? if so, is there a way to skip the SKIP signals?
    Chris White · 9 replies
  • Tobias Schmidt · 2 years ago
    I feel like this must be pretty basic: Is there a way to combine/string together multiple flows?
    Zachary Hughes and 4 others · 11 replies
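    Prefect 0.x (current at the time of this thread) had no first-class "flow of flows"; one common workaround was a small driver that runs flows in sequence and stops at the first failure. A minimal sketch, assuming only Prefect 0.x's semantics that flow.run() returns a terminal state with an is_successful() method (the helper name is made up):

```python
def run_in_sequence(flows, parameters=None):
    """Run flows one after another, stopping at the first failure.

    Assumes Prefect 0.x semantics: flow.run() returns a terminal
    state object that exposes is_successful().
    """
    states = []
    for flow in flows:
        state = flow.run(parameters=parameters or {})
        states.append(state)
        if not state.is_successful():
            break  # don't start downstream flows after a failure
    return states
```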
  • Jeff Yun · 2 years ago
    How do I map over the product of two parameters? I consulted the docs and issues (notably https://github.com/PrefectHQ/prefect/issues/1311 and https://github.com/PrefectHQ/prefect/issues/674), but still couldn't figure it out. Some attempts:
    1) Passing a task into a "compose" task as a parameter, i.e. result = compose.map(ids, unmapped(generate_task), unmapped(dates)) --> got
    TypeError: generate_task() missing 2 required positional arguments:
    2) Calling a known task from a "compose" task --> got
    ValueError: Could not infer an active Flow context.
    3) Typecasting the parameter to a list --> Parameter is not iterable
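    For reference, the pattern that usually resolves this is to compute the Cartesian product inside a single upstream task at run time, then have the downstream task .map() over the resulting pairs; tasks can't be passed around as parameters, and a Parameter has no value until the flow runs. A sketch of just the product logic (the helper name is hypothetical, and the @task decorator is omitted so this stays plain Python):

```python
from itertools import product

def cross(ids, dates):
    """Build all (id, date) pairs.

    In Prefect 0.x this would be the body of a @task, so the product
    is computed at run time; a downstream task can then .map() over
    the list of pairs it returns.
    """
    return list(product(ids, dates))
```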
  • Jeff Yun · 2 years ago
    Related to my above question, how do I access the value of a parameter? For example, I want to use
    flattened_lists = itertools.chain(*nested_lists_param)
    but can't iterate through the Parameter itself.
    Chris White · 5 replies
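    The usual answer here: a Parameter has no value at flow-build time, so the chaining has to happen inside a task body, where the runtime value is an ordinary list. A sketch of the flattening logic (the @task decorator is again omitted so this stays plain Python):

```python
from itertools import chain

def flatten(nested_lists):
    """Flatten one level of nesting.

    Inside a Prefect task body the Parameter has already been
    resolved to a real list, so ordinary iteration works.
    """
    return list(chain.from_iterable(nested_lists))
```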
  • Jeff Yun · 2 years ago
    Dask question here: I am new to distributed systems, and want to run a large number of tasks (N = 10k+) in parallel. Afaik, Prefect+Dask is scalable to 10k+ parallel tasks (although I know it's ideal to batch many small tasks into fewer longer-running tasks if possible). However, trying small toy tasks (and various combinations of --nprocs and --nthreads on the workers), I consistently see:
    - Running Client() locally, tasks start immediately, as expected.
    - Running with one worker server, starting takes much longer as N increases. For large N:
    [2019-09-30 22:59:15,781] INFO - prefect.TaskRunner | Task 'stage_0': Starting task run...
    distributed.utils_perf - INFO - full garbage collection released 561.93 MB from 0 reference cycles (threshold: 10.00 MB)
    distributed.core - INFO - Event loop was unresponsive in Worker for 34.31s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
    - Running on multiple servers, the Dask scheduler also takes much longer than the expected ~1ms overhead/task before any activity happens (on the client servers or the Dask scheduler dashboard). Why is scheduling taking so long? How could I adjust the scheduling policy (https://distributed.dask.org/en/latest/scheduling-policies.html) to speed up running a large number of small tasks?
    Chris White and 3 others · 60 replies
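    As the question itself notes, the standard mitigation for per-task scheduler overhead is batching: map over a few hundred chunks instead of 10k+ single items. A minimal chunking sketch (the helper name is made up; in a flow it would sit inside a task whose output the worker task maps over):

```python
def make_batches(items, batch_size):
    """Split items into contiguous batches of at most batch_size.

    Mapping one task per batch (instead of per item) divides the
    number of tasks the Dask scheduler must place by batch_size.
    """
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```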
  • Gregor Müllegger · 2 years ago
    Hi folks, I'm trying Prefect for the first time and have a few questions about best practices that I couldn't find answers to in the documentation.
    1. Logging: I'm extracting ~10k IDs out of a database and returning them as a list from a task. The next task maps over the list and continues working on the IDs. However, the whole list is printed in the logs, and I don't see value in that big chunk of log output. Is there a way to disable printing a single task's result in the logs without disabling logging in general? Or am I misusing tasks, and should I return a generator instead of a list?
    2. How do I integrate a Prefect flow into a CLI? I currently have a Python CLI tool that takes a database connection string (the source of the IDs mentioned above) and processes the data in a plain loop, and I want to migrate it to a Flow. How do I call the flow from the CLI script so that the script behaves as a CLI should? Most importantly, the exit code should be non-zero when the flow state is not successful. Is there a guide on best practices for using Prefect in CLIs?
    emre · 3 replies
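    On question 2, one way to wire a flow into a CLI, sketched under the assumption of Prefect 0.x's API (flow.run() returning a state with is_successful(); the helper name is made up), is to translate the terminal state into an exit code at the entry point:

```python
import sys

def flow_exit_code(flow, parameters):
    """Run the flow and map its terminal state to a shell exit code."""
    state = flow.run(parameters=parameters)
    return 0 if state.is_successful() else 1

# in the CLI entry point, something like:
# sys.exit(flow_exit_code(flow, {"db_url": args.db_url}))
```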
  • Garrett Thomas · 2 years ago
    Hi guys! The project looks great so far, and my team and I are investigating using it instead of Airflow. I looked through the docs and skimmed the code, but could not find any authentication/access management in prefect-core like the kind Airflow has. Is this correct? And is there any plan to implement it in the open-source project? Thanks a lot for your time, and I am excited to see how the project evolves!
    1 reply
  • Jeff Yun · 2 years ago
    If a Parameter consists of e.g. a list of length 100k, at what stage and how may it be batched with dask.Bag?
    Chris White · 29 replies
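    A sketch of what the batching could look like in plain dask.bag, independent of Prefect: from_sequence accepts a partition_size, so a large list becomes a bag of fixed-size partitions, and work is scheduled per partition rather than per element. (Where exactly to hand a Parameter's value off to a Bag would still have to be inside a task at run time, since the Parameter has no value before then.)

```python
import dask.bag as db

# Partition a large sequence so Dask schedules work per partition,
# not per element.
items = list(range(10_000))
bag = db.from_sequence(items, partition_size=1_000)

squares = bag.map(lambda x: x * x)
# single-threaded scheduler just to keep the sketch deterministic
total = squares.sum().compute(scheduler="synchronous")
```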
  • emre · 2 years ago
    Hey all, I have been using local secrets in my production flows for a while. I want to use some other key-value store for my secrets, as that is the secure and intended way. I haven't found much in the docs: how would I tell Prefect Core where and how it should access this new Secrets store, say AWS Secrets Manager?
    emre and 2 others · 5 replies
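    For AWS Secrets Manager specifically, a common workaround at the time was to fetch the secret inside a task via boto3 rather than through Prefect's Secret class. A sketch with the client injected, so nothing here touches AWS until you pass a real boto3.client("secretsmanager"); the helper name is made up, and get_secret_value(SecretId=...) is the actual boto3 call:

```python
import json

def fetch_json_secret(name, client):
    """Fetch a secret by name and parse its JSON payload.

    `client` is expected to provide get_secret_value(SecretId=...),
    the boto3 Secrets Manager call; injecting it keeps this sketch
    testable without AWS credentials.
    """
    resp = client.get_secret_value(SecretId=name)
    return json.loads(resp["SecretString"])
```

    Inside a flow, a task body would build the client and call this helper at run time, keeping credentials out of the flow definition.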