prefect-community
  • v

    Varun Joshi

    04/08/2021, 5:46 PM
    I'm not able to see the logs for my flows, which I used to be able to see previously. Yes, my log_stdout is set to True in the task settings. Any reason why this is happening?
    k
    c
    18 replies · 3 participants
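For reference, a minimal sketch of the two logging paths being discussed, assuming Prefect 0.14.x (the flow name and messages are illustrative):
    import prefect
    from prefect import task, Flow

    @task(log_stdout=True)  # captures anything printed to stdout as log records
    def noisy_task():
        print("captured because log_stdout=True")
        prefect.context.get("logger").info("always goes to the Prefect logger")

    with Flow("logging-demo") as flow:
        noisy_task()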
  • j

    Joseph Ellis

    04/08/2021, 5:47 PM
    Has anyone got any recommendations on how best to deploy the Prefect agent in AWS? We’re leveraging the ECS agent to execute flow runs within our ECS cluster (great, that all works fine). But we’re contemplating whether the agent service itself is best run on EC2 or as a long-running container in Fargate. Any help appreciated 🙂.
    k
    s
    7 replies · 3 participants
  • b

    Brent Bateman

    04/08/2021, 7:46 PM
    Howdy, does Prefect have any recommended SIs (systems integrators) familiar with implementing Prefect with Snowflake (dbt a plus)?
    k
    4 replies · 2 participants
  • b

    Berty

    04/08/2021, 7:55 PM
    👋 I need a little help with the cloud UI
    ✅ 1
    n
    2 replies · 2 participants
  • d

    dh

    04/08/2021, 11:22 PM
    Howdy, when an Agent triggers a run of a registered flow, is there some standard way to ask the Agent to pass some arguments into the flow? e.g. the Agent would do:
    registered_flow.run(**runtime_args_passed_by_agent)
    Or is this disallowed by design? (A registered flow shall be self-sufficient for reproducibility.) Context: we have a flow that depends on a value that changes quite often (e.g. a dependency package version number). We don’t want to register a new flow each time we update the version number; rather, we'd have one flow and have the agent run it, expecting the package version information to be provided at run time. We thought about using an env var, but not sure if it’s the best way…
    k
    12 replies · 2 participants
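The usual way to feed run-time values into an already-registered flow in Prefect 0.14.x is a Parameter rather than agent-side arguments; a minimal sketch (names and the default value are illustrative):
    import prefect
    from prefect import task, Flow, Parameter

    @task
    def install_and_run(package_version):
        prefect.context.get("logger").info(f"using package version {package_version}")

    with Flow("versioned-flow") as flow:
        version = Parameter("package_version", default="1.0.0")
        install_and_run(version)

    # a run can override the default without re-registering, e.g.:
    # flow.run(parameters={"package_version": "1.2.3"})
Parameters can also be set per run from the UI or API, so the flow only needs to be registered once.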
  • r

    Rob Fowler

    04/09/2021, 4:26 AM
    damn, seems Dask thread executors are broken again, back to processes and everything is good
    t
    5 replies · 2 participants
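For reference, the switch being described (threads back to processes) is a one-line executor change in 0.14.x; the worker count below is illustrative:
    from prefect.executors import LocalDaskExecutor

    flow.executor = LocalDaskExecutor(scheduler="processes", num_workers=8)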
  • b

    Brian Keating

    04/09/2021, 5:45 AM
    I've written a ResourceManager to create and terminate EC2 instances. I test it out with a bare-bones workflow:
    @task
    def do_something_on_instance(instance_id):
        prefect.context.get('logger').info(f'Do something on instance {instance_id}')

    with Flow('hello-ec2') as flow:
        with EC2Instance('t2.micro') as instance_id:
            do_something_on_instance(instance_id)  # instance_id is a string
    This works correctly when using GitHub storage, but when I switch to S3, the flow fails with TypeError: cannot pickle 'SSLContext' object. Anyone know what's going on here? Note that the value returned by EC2Instance.setup is a str.
    k
    8 replies · 2 participants
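A common cause of the SSLContext pickling error with S3 storage is a boto3 client kept as an attribute on the resource manager (S3 storage cloudpickles the flow, including the manager instance), whereas GitHub storage stores the script as text. A sketch that keeps clients out of instance state, assuming Prefect 0.14.x and a hypothetical AMI id:
    import boto3
    from prefect import resource_manager

    @resource_manager
    class EC2Instance:
        def __init__(self, instance_type):
            self.instance_type = instance_type  # picklable attributes only

        def setup(self) -> str:
            ec2 = boto3.client("ec2")  # created locally, never stored on self
            resp = ec2.run_instances(
                ImageId="ami-0123456789abcdef0",  # hypothetical AMI
                InstanceType=self.instance_type,
                MinCount=1,
                MaxCount=1,
            )
            return resp["Instances"][0]["InstanceId"]

        def cleanup(self, instance_id: str):
            boto3.client("ec2").terminate_instances(InstanceIds=[instance_id])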
  • j

    Joe McDonald

    04/09/2021, 5:52 AM
    Anyone had this show up when using the ECS agent? We are running an ECS cluster with three agents, and Prefect Server in a separate ECS cluster running three instances of all services. It seems the deprecated Fargate agent could create a new family for each run, but the ECS agent doesn’t do that; it reuses the same family for the task definition and increments the revision, so when running a lot of flows at the same time we run into this:
    An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Too many concurrent attempts to create a new revision of the specified family.
    k
    2 replies · 2 participants
  • x

    xyzz

    04/09/2021, 8:54 AM
    I'm a bit confused by the information on https://docs.prefect.io/orchestration/faq/dataflow.html#gotchas-and-caveats , which claims to be an exhaustive list of the data that might end up in the Prefect database. It doesn't mention anything about flow metadata like the names of flows and tasks, or their configuration such as run_config, schedules, or storage. So I assume this list isn't actually exhaustive?
    e
    k
    3 replies · 3 participants
  • n

    Noah Holm

    04/09/2021, 9:41 AM
    I’m using S3 storage for flows running with an ECS agent. By default the flow’s tasks use S3Result, as outlined here. Is there any way to disable task results for individual tasks or all tasks in the flow, while keeping the S3Storage on the flow?
    a
    d
    10 replies · 3 participants
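A minimal sketch of one way to do this in Prefect 0.14.x, using the per-task checkpoint flag (the bucket name is illustrative); checkpointing can also be disabled flow-wide via the PREFECT__FLOWS__CHECKPOINTING environment variable:
    from prefect import task, Flow
    from prefect.storage import S3

    @task(checkpoint=False)  # this task's return value is not persisted as a Result
    def transform(x):
        return x + 1

    with Flow("no-results") as flow:
        transform(1)

    flow.storage = S3(bucket="my-flow-bucket")  # flow storage is unaffected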
  • m

    Marko Jamedzija

    04/09/2021, 10:54 AM
    Hello 🙂 I recently deployed Prefect Server 0.14.15 on a test k8s cluster. After playing around with it I noticed that the k8s agent only runs the flows for the first tenant I created. For the projects I created from other tenants, all their flows are just stuck in a scheduled state (becoming late runs). Is this a bug, or is there a way of configuring the k8s agent to work with more than one tenant? If so, how? Thanks 🙂 (I deployed the services using the prefecthq helm chart.)
    k
    x
    6 replies · 3 participants
  • b

    Brent Bateman

    04/09/2021, 1:45 PM
    Who's the admin for this workspace? I want to leave it, not just sign out of it.
    d
    4 replies · 2 participants
  • j

    Joe McDonald

    04/09/2021, 1:50 PM
    So what is the chance of getting official PrefectHQ docker images hosted on https://public.ecr.aws/ so that ECS services can avoid the docker hub rate limits? Would be nice to convince Hasura to do the same thing…
    k
    5 replies · 2 participants
  • t

    Trevor Kramer

    04/09/2021, 1:55 PM
    I'm confused about how to use @resource_manager effectively. I am using it to create an RDS instance in AWS. I can create it fine in setup and destroy it in cleanup, but I don't understand how to get the endpoint out of it. It seems to wrap it in a ResourceManager, so I can't access any attributes of my class from the flow. How can I use the client as instantiated here: with OracleRDS() as client?
    k
    13 replies · 2 participants
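The name bound by the with block is whatever setup() returns, so one option is to return the endpoint (or a small dict of connection details) from setup rather than reaching into the manager. A sketch under that assumption, assuming Prefect 0.14.x and hypothetical identifiers and sizing:
    import boto3
    from prefect import task, Flow, resource_manager

    @resource_manager
    class OracleRDS:
        def setup(self) -> str:
            rds = boto3.client("rds")
            rds.create_db_instance(
                DBInstanceIdentifier="demo-oracle",  # hypothetical identifier
                DBInstanceClass="db.t3.medium",
                Engine="oracle-se2",
                AllocatedStorage=20,
                MasterUsername="admin",
                MasterUserPassword="change-me",
            )
            rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="demo-oracle")
            desc = rds.describe_db_instances(DBInstanceIdentifier="demo-oracle")
            return desc["DBInstances"][0]["Endpoint"]["Address"]

        def cleanup(self, endpoint: str):
            boto3.client("rds").delete_db_instance(
                DBInstanceIdentifier="demo-oracle", SkipFinalSnapshot=True
            )

    @task
    def run_query(endpoint):
        print(f"connecting to {endpoint}")

    with Flow("rds-demo") as flow:
        with OracleRDS() as endpoint:  # bound to setup()'s return value, not the manager
            run_query(endpoint)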
  • o

    Oussama Louati

    04/09/2021, 2:16 PM
    Hello 🙂, I deployed Prefect Server 0.14.14. Everything works fine locally. I am a bit confused, so if anyone can help me: • I have a workload of machine learning tasks that I would like to execute inside a Docker container • I have everything packaged inside a container • I have an agent running on my machine: `prefect agent docker start --show-flow-logs -l agent-name --no-pull` • When I do docker run my_image, I get this error each time: Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/root/'") Should I run the agent inside the container? Thank you
    k
    87 replies · 2 participants
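For context, with a Docker agent the flow is normally baked into an image via Docker storage and then launched by the agent, rather than started with a bare docker run. A minimal sketch of that pattern, assuming Prefect 0.14.x and hypothetical registry, image, and project names:
    from prefect import task, Flow
    from prefect.storage import Docker

    @task
    def train():
        print("training...")

    with Flow("ml-flow") as flow:
        train()

    flow.storage = Docker(registry_url="my-registry.example.com", image_name="ml-flow")
    flow.register(project_name="ml")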
  • p

    Paul Prescod

    04/09/2021, 3:00 PM
    This may be outside of the design criteria for Prefect. The use-case consists of two tasks. Task A is CPU-bound. Task B is network bound. Task A generates data that Task B uploads to a slow server. I want a pool of Task A workers to feed a pool of Task B workers. For now, they are all subprocesses of a single parent process. The number of Task A processes should be the same as the number of Cores (not including "hyperthreads"). I'm more flexible about the number of Task B processes.
    k
    8 replies · 2 participants
  • h

    Hygor Knust

    04/09/2021, 3:32 PM
    Is there a way to deploy custom Kubernetes CRDs using the K8s task library? I'm trying to deploy a SparkApplication CRD, but it seems that there is no task that accomplishes this. Should I use the Kubernetes Python client inside a Python task instead? Thanks
    k
    8 replies · 2 participants
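If the Kubernetes Python client route is taken, a custom resource can be created from inside an ordinary task; a sketch assuming the Spark operator's usual group/version/plural and a hypothetical manifest path:
    import yaml
    from kubernetes import client, config
    from prefect import task, Flow

    @task
    def deploy_spark_application(manifest_path: str):
        config.load_incluster_config()  # or config.load_kube_config() outside the cluster
        with open(manifest_path) as f:
            body = yaml.safe_load(f)
        client.CustomObjectsApi().create_namespaced_custom_object(
            group="sparkoperator.k8s.io",
            version="v1beta2",
            namespace="default",
            plural="sparkapplications",
            body=body,
        )

    with Flow("deploy-crd") as flow:
        deploy_spark_application("spark-app.yaml")  # hypothetical manifest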
  • j

    Jonathan Buys

    04/09/2021, 3:55 PM
    Is there a way to setup SSO with Office365 in Prefect Cloud?
    d
    4 replies · 2 participants
  • x

    xyzy

    04/09/2021, 4:04 PM
    I tried writing a SQLiteConnection ResourceManager as a demo, but when executing with a DaskExecutor it leads to problems with having different threads:
    ProgrammingError('SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 8032 and this is thread id 2308.')
    Is there a way to work around this other than using a LocalExecutor? e.g. a parameter for the ResourceManager to tell it to keep everything that uses it in the same thread?
    d
    12 replies · 2 participants
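One possible workaround, sketched here on the assumption that the flow itself serializes access (e.g. tasks write one at a time), is to open the connection with check_same_thread=False; another is to have setup() return just the database path and let each task open its own connection:
    import sqlite3
    from prefect import resource_manager

    @resource_manager
    class SQLiteConnection:
        def __init__(self, db_path: str):
            self.db_path = db_path

        def setup(self) -> sqlite3.Connection:
            # allows use from other threads; the caller must still serialize access
            return sqlite3.connect(self.db_path, check_same_thread=False)

        def cleanup(self, conn: sqlite3.Connection):
            conn.close()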
  • j

    Josh

    04/09/2021, 4:27 PM
    Not sure if this should be considered a feature request or a discussion idea, but I would hope for patch-version updates not to introduce breaking changes. I understand we’re still in initial development (0.y.z), but it is disheartening to log in to monitor flows and see that everything has failed in the last 24 hours because I haven’t updated the flows to the newest release. Specifically, 0.14.15 introduced things like: • terminal_state_handlers on Flow objects, which will cause a flow to fail if they aren’t present even though the 0.14.15 version defaults to None. • log_output on the ExecuteNotebook class, which defaults to False on the 0.14.15 version, but execution fails if a flow on a previous version is run without the attribute specified.
    c
    m
    28 replies · 3 participants
  • k

    Kevin Kho

    04/09/2021, 5:28 PM
    Hello everyone! Just wanted to announce that we now have a #random channel for those who want to chat about other stuff.
    👍 4
    j
    1 reply · 2 participants
  • s

    Sean Harkins

    04/09/2021, 5:33 PM
    We would like to utilize Dask worker plugins (specifically PipInstall, https://distributed.dask.org/en/latest/plugins.html#distributed.diagnostics.plugin.PipInstall) to install dependencies on our workers at worker creation time. We are using a temporary cluster model as described here: https://docs.prefect.io/orchestration/flow_config/executors.html#using-a-temporary-cluster. Is it possible to somehow obtain a reference to the `DaskExecutor`’s distributed client (https://github.com/PrefectHQ/prefect/blob/master/src/prefect/executors/dask.py#L209) from within our Flow code (or Task code) so that we can call register_worker_plugin? I’ve browsed through the codebase but I’m not sure where in the Flow execution process the `DaskExecutor`’s client is yielded and used.
    k
    6 replies · 2 participants
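One approach that avoids touching the executor's client directly is to call distributed.get_client() from inside a task running on the cluster and register the plugin there (this mirrors the get_client() pattern tested by hand in the 04/11 thread below). A sketch, with the upstream dependency wired in explicitly:
    import distributed
    from distributed.diagnostics.plugin import PipInstall
    import prefect
    from prefect import task, Flow

    @task
    def install_worker_deps():
        # runs on a Dask worker; get_client() returns a client bound to the same scheduler
        client = distributed.get_client()
        client.register_worker_plugin(PipInstall(packages=["xarray"]))

    @task
    def do_work():
        prefect.context.get("logger").info("dependencies are available here")

    with Flow("plugin-demo") as flow:
        do_work(upstream_tasks=[install_worker_deps()])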
  • t

    Trevor Kramer

    04/09/2021, 8:50 PM
    I'm getting throttling exceptions when trying to submit a batch of jobs to run. What are the limits I need to stay within? An error occurred (ThrottlingException) when calling the RunTask operation (reached max retries: 2): Rate exceeded.
    k
    m
    4 replies · 3 participants
  • k

    KIRYL BUCHA

    04/10/2021, 11:51 AM
    Hi all, I'm trying to use Prefect as an orchestrator for AWS Glue jobs. Has anyone tried starting Glue jobs and monitoring their execution from a flow?
    k
    d
    6 replies · 3 participants
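A minimal sketch of one way to do this with plain boto3 inside a task (the job name and poll interval are illustrative):
    import time
    import boto3
    from prefect import task, Flow

    @task
    def run_glue_job(job_name: str) -> str:
        glue = boto3.client("glue")
        run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
        while True:
            state = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]["JobRunState"]
            if state == "SUCCEEDED":
                return run_id
            if state in ("FAILED", "STOPPED", "TIMEOUT", "ERROR"):
                raise RuntimeError(f"Glue job {job_name} ended in state {state}")
            time.sleep(30)

    with Flow("glue-orchestration") as flow:
        run_glue_job("my-glue-job")  # hypothetical job name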
  • t

    Trevor Kramer

    04/10/2021, 1:42 PM
    Can concurrency tagging be used with resource_managers? We are creating an RDS instance as a @resource_manager and due to AWS limits there can only be 10 running at a time. The resource is used across several tasks so I don't think tagging the tasks would work. Is there another way to limit concurrency to avoid creating more than 10 of the resources protected by @resource_manager at a time?
    k
    1 reply · 2 participants
  • s

    Slackbot

    04/10/2021, 5:44 PM
    This message was deleted.
    k
    1 reply · 2 participants
  • s

    Sean Harkins

    04/11/2021, 11:17 PM
    We have a somewhat edge case where an external library is generating Prefect Flows (https://github.com/pangeo-data/rechunker/blob/master/rechunker/executors/prefect.py#L42) that we register. Prior to registration we need to apply a decorator to each of the Flow’s tasks (in this case to register Dask WorkerPlugins). To test this initially I applied our decorator to a test task:
    def register_plugin(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            client = distributed.get_client()
            plugin = PipInstall(packages=["xarray"])
            client.register_worker_plugin(plugin)
            result = func(*args, **kwargs)
            return result
        return wrapper


    @task
    @register_plugin
    def say_hello():
        logger = prefect.context.get("logger")
        logger.info("Hello, Cloud")
        return "hello result"
    This works as expected and the Dask worker logs show the plugin use:
    [2021-04-11 23:11:49+0000] INFO - prefect.CloudTaskRunner | Task 'say_hello': Starting task run...
    distributed.worker - INFO - Starting Worker plugin pip
    distributed.diagnostics.plugin - INFO - Pip installing the following packages: ['xarray']
    [2021-04-11 23:11:50+0000] INFO - prefect.say_hello | Hello, Cloud
    [2021-04-11 23:11:50+0000] INFO - prefect.CloudTaskRunner | Task 'say_hello': Finished task run for task with final state: 'Success'
    However, we don’t have access to the underlying tasks, so instead we need to access them from an existing flow, wrap them, and replace them within the flow prior to flow registration with:
    for flow_task in flow.tasks:
        wrapped_task = register_plugin(flow_task)
        flow.replace(flow_task, wrapped_task)
    But using this approach seems to alter the task’s execution, as the worker logs do not report the plugin use or the Prefect logger statements:
    distributed.core - INFO - Starting established connection
    [2021-04-11 22:43:26+0000] INFO - prefect.CloudTaskRunner | Task 'Constant[function]': Starting task run...
    [2021-04-11 22:43:27+0000] INFO - prefect.CloudTaskRunner | Task 'Constant[function]': Finished task run for task with final state: 'Success'
    Am I missing something obvious in my task wrapping approach and replacement in the flow? Is there a better approach for accomplishing this?
    m
    13 replies · 2 participants
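A hedged guess at the wrapping step: applying register_plugin to the Task object itself hands a plain function back to the flow (hence the Constant[function] task in the second log), whereas wrapping the task's run method keeps the Task and its Prefect metadata intact. A sketch under that assumption, reusing the register_plugin decorator from the message above:
    for flow_task in flow.tasks:
        flow_task.run = register_plugin(flow_task.run)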
  • s

    Svyat

    04/12/2021, 12:02 AM
    Got a general question about configuring a flow to execute on an AWS ECS EC2 instance instead of ECS Fargate. I followed this guide to deploy everything using the latter, but have discovered I need more resources than what’s allowed for Fargate task definitions. Is there an example/documentation for creating tasks that will execute on a custom EC2 instance? Thanks much!
    k
    5 replies · 2 participants
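A sketch of the run-config side of this in Prefect 0.14.x: ECSRun forwards run_task_kwargs to the ECS RunTask call, so the launch type can be overridden there (the ECS agent also accepts a launch-type option when started); values below are illustrative:
    from prefect.run_configs import ECSRun

    # forwarded to ECS RunTask, so the flow runs on EC2 capacity instead of Fargate
    flow.run_config = ECSRun(run_task_kwargs={"launchType": "EC2"})

    # the agent side would be started with something like:
    #   prefect agent ecs start --launch-type EC2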
  • r

    Rob Fowler

    04/12/2021, 3:49 AM
    Has anyone else used LocalDaskExecutor in k8s? It seems the dask parent management process does not actually work under k8s; I'm getting all sorts of wacky errors. If it is not supported, are parallel processes inside k8s an anti-pattern for Prefect?
    m
    3 replies · 2 participants
  • r

    Rob Fowler

    04/12/2021, 9:49 AM
    docker run -it prefecthq/prefect:latest-python3.6  /bin/bash
    root@c281a603b87e:/# prefect version
    0.14.12
    g
    3 replies · 2 participants

Greg Roche

04/12/2021, 10:09 AM
(.venv) C:\Users\groche\source> docker run -it prefecthq/prefect:latest-python3.6  /bin/bash
Unable to find image 'prefecthq/prefect:latest-python3.6' locally
latest-python3.6: Pulling from prefecthq/prefect
75646c2fb410: Already exists
62342603b9a2: Already exists
0bdd7747fb18: Pull complete
c8ecc0b9e8c5: Pull complete
e51cfb6f5145: Pull complete
e813c6a2bec6: Pull complete
56280ee759f3: Pull complete
Digest: sha256:b663f215c0dccedf2ee432ebaafa8d636e4aa424fa6e35e42c516385fa6718d6
Status: Downloaded newer image for prefecthq/prefect:latest-python3.6
root@2838bac2f4a5:/# prefect version
0.14.15
Looks like your docker image is outdated.
✅ 2
r

Rob Fowler

04/12/2021, 11:14 AM
aha, thanks, that's probably it. I never thought I'd pulled it at all, as I built it.
Can't live without the 0.14.15 goodness, even on ye olde redhat cluster