  • Ketan Bhikadiya

    2 years ago
    Hi, I have a Prefect server and agent running on the system. Even after registering the flow (flow.register()), I do not see any info in the UI (I am not using Prefect Cloud). Also, I do not see a config.toml file in my Prefect home directory. Not sure what the issue is here.
    5 replies
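A point worth checking here: the file is `config.toml` (not `config.taml`), it lives under `~/.prefect/`, and Prefect 0.x does not create it automatically; you write it yourself to override defaults. A minimal sketch pointing it at a local server follows; the exact key names may differ between 0.x versions, so treat them as assumptions to verify against your install.

```toml
# ~/.prefect/config.toml -- created by hand; Prefect does not generate it.
# Key names are assumptions for Prefect 0.x and may vary by version.
backend = "server"

[server]
host = "http://localhost"
port = "4200"
```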
  • Manuel Mourato

    2 years ago
    Hello again! I am trying to perform some multiprocessing tests with a DaskExecutor. This is how I set up the remote environment for multiprocessing:
    env = RemoteEnvironment(
        executor="prefect.engine.executors.DaskExecutor",
        executor_kwargs={
            "local_processes": True
        }
    )
    The execution DAG is in the attached image. My understanding is that, using multiprocessing, Task 2 and Task 4 would execute at the same time after Task 1, but in different processes, because they do not depend on each other. But the behaviour I see is them executing sequentially, as if wait() were being called between each process. Is my understanding incorrect?
    10 replies
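The expected behaviour can be sanity-checked outside Prefect with a plain-Python sketch: once a shared upstream finishes, two tasks with no edge between them can be submitted to a parallel executor together. (Threads are used here purely for illustration; `local_processes=True` in DaskExecutor uses processes, but the scheduling idea is the same. The function names mirror the DAG in the question and are otherwise hypothetical.)

```python
from concurrent.futures import ThreadPoolExecutor

def task1():
    # Shared upstream: both downstream tasks need its output.
    return "data"

def task2(upstream):
    return f"task2({upstream})"

def task4(upstream):
    return f"task4({upstream})"

data = task1()  # upstream completes first

# Task 2 and Task 4 do not depend on each other, so a parallel
# executor is free to run them concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    f2 = pool.submit(task2, data)
    f4 = pool.submit(task4, data)
    results = sorted([f2.result(), f4.result()])

print(results)
```

If the two independent tasks still run back-to-back under a real DaskExecutor, the executor configuration (rather than the DAG shape) is the place to look.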
  • Zviri

    2 years ago
    hey everyone, I noticed very high memory consumption when using mapped tasks in conjunction with the CloudTaskRunner, but not with the plain TaskRunner (using a Dask deployment). I was observing that during the "mapping" procedure, the worker doing the actual mapping used more and more memory. That seemed reasonable, since mapping involves copying the mapped task. However, I noticed that when using the CloudTaskRunner, memory consumption is much higher during this step. To be specific, mapping over a list that contained only approximately 8000 elements ate up more than 4 GB of memory on the worker. I did some debugging and found that the same mapped task has a serialized size of 15 200 bytes using TaskRunner, but 122 648 bytes using CloudTaskRunner. This is almost a 10-fold increase, which makes the mapping function pretty much unusable for me. The increased size ultimately comes from pickling this function: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/engine/task_runner.py#L788 and I think the serialized size of the CloudTaskRunner class is the cause of the different sizes. Is this behavior known? Or is it worth a bug report? I will stick to the plain TaskRunner for now, which will, unfortunately, prevent me from using the cloud UI, which I really like. It would be great if this could be fixed. I'm using the latest Prefect (v0.10.7).
    5 replies
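The size comparison described above is easy to reproduce for any object with the stdlib alone (Prefect/Dask actually use cloudpickle, which behaves similarly for this measurement). The two runner classes below are stand-ins to illustrate how extra instance state inflates the payload; they are not Prefect's classes.

```python
import pickle

def serialized_size(obj) -> int:
    """Size in bytes of an object's pickle payload."""
    return len(pickle.dumps(obj))

class PlainRunner:
    """Stand-in for a lightweight runner."""
    def __init__(self):
        self.task_name = "my_task"

class HeavyRunner(PlainRunner):
    """Stand-in for a runner that drags extra state into every copy."""
    def __init__(self):
        super().__init__()
        # This dict rides along in every serialized copy of the runner,
        # so mapping N times multiplies the overhead by N.
        self.metadata = {f"key_{i}": "x" * 100 for i in range(50)}

small = serialized_size(PlainRunner())
large = serialized_size(HeavyRunner())
print(small, large)
```

Measuring the two runners this way on a real flow would confirm whether the extra bytes come from the runner class itself, as suspected.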
  • Adrien Boutreau

    2 years ago
    Hello! Quick question: we are running a job in a Docker image, and we would like to automate it with Prefect. Do you know where I can find a tutorial?
    2 replies
  • Matthias

    2 years ago
    Hi! How do I programmatically run a registered flow, so that I can see the execution in the UI (local, not cloud)?
    3 replies
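For a local Prefect Server, flow runs can be created through the server's GraphQL endpoint, which makes the run appear in the UI. A sketch of building and sending such a request follows; the endpoint URL, mutation name, and variable type follow Prefect Server 0.x conventions but should be verified against your version, and the flow id shown is a placeholder.

```python
import json
from urllib import request

# Default local-server GraphQL endpoint (an assumption; adjust as needed).
SERVER_URL = "http://localhost:4200/graphql"

def build_create_flow_run_payload(flow_id: str) -> bytes:
    """Build a GraphQL payload for the create_flow_run mutation
    (mutation name as in Prefect Server 0.x; verify for your version)."""
    query = """
    mutation($flowId: UUID!) {
      create_flow_run(input: { flow_id: $flowId }) { id }
    }
    """
    return json.dumps({"query": query, "variables": {"flowId": flow_id}}).encode()

def run_flow(flow_id: str) -> dict:
    """POST the mutation to the local server and return the decoded response."""
    req = request.Request(
        SERVER_URL,
        data=build_create_flow_run_payload(flow_id),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_create_flow_run_payload("00000000-0000-0000-0000-000000000000")
print(payload)
```

In Prefect 0.x the same thing can be done from Python with `prefect.Client().create_flow_run(flow_id=...)`, which wraps this mutation.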
  • Matthias

    2 years ago
    How does the Prefect agent get notified of new flows? I am running the agent in a Docker container; it is active and connected to Apollo, and I mapped the .prefect directory as a volume. But the agent does not pick up new flow runs (when clicking Run in the UI). When I run prefect agent start in my main Docker container, the process picks up the flows.
    25 replies
  • David Ojeda

    2 years ago
    Hello there, I don't know exactly how to search for this particular strange problem I have encountered lately: I have a flow with a simple pattern: generate N elements, map over them, then collect all results and send a Slack notification. I execute this flow using a DaskKubernetesEnvironment and a Kubernetes agent on a GCP k8s cluster that has auto-scaling (I am not sure if this is relevant). The strange situation is that, sometimes, the flow stays in Running indefinitely because the last 1 to 3 tasks of the map are "still running". I will try to collect some logs when this happens, but it does not look like the workers are stuck; it seems more like they were killed a bit too soon and could not inform the Prefect server (or the Dask scheduler?) that they finished. Has anyone encountered this strange situation?
    11 replies
  • Matias Godoy

    2 years ago
    Hello! I'm using GraphQL to run flows. The problem is that the API requires the flow_id, which changes every time I register a new version of said flow. Is there a way to run a flow by name instead of by id using GraphQL?
    3 replies
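One way around the changing id is to resolve the flow name to its latest version's id with a first GraphQL query, then pass that id to the run mutation. A sketch of the lookup payload follows; the filter and ordering fields follow Prefect Server's Hasura-style schema and may differ between versions, so verify them against your server (in 0.x, `create_flow_run` also accepts a `version_group_id`, which is stable across re-registrations).

```python
import json

def build_flow_id_lookup(flow_name: str) -> bytes:
    """GraphQL payload that resolves a flow name to the id of its latest
    unarchived version. Field and filter names follow Prefect Server's
    Hasura-style schema (an assumption; verify against your version)."""
    query = """
    query($name: String!) {
      flow(
        where: { name: { _eq: $name }, archived: { _eq: false } },
        order_by: { version: desc },
        limit: 1
      ) { id }
    }
    """
    return json.dumps({"query": query, "variables": {"name": flow_name}}).encode()

payload = build_flow_id_lookup("my-flow")
print(json.loads(payload)["variables"])
```

POSTing this payload to the server's /graphql endpoint returns at most one flow, whose `id` can then be fed into the run mutation.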
  • Manuel Mourato

    2 years ago
    Hello again! Quick question: I have a flow1 with 3 sequential tasks,
    task_sequence = [load_data, task1, task2]
    test_flow1.chain(*task_sequence)
    which I saved to a file locally and then loaded via the test_flow2 = Flow.load(path) method. Now I want to add a new task3 to this flow, with load_data as an upstream dependency of the new task, like this:
    test_flow2.set_dependencies(
        task=task3,
        upstream_tasks=[load_data])
    But I get the error:
    A task with the slug "Task1" already exists in this flow
    It seems to complain about load_data already being defined in the flow, which it is. But what I want is to say that load_data is a dependency of task3. What am I doing wrong?
    8 replies