Powered by Linen
prefect-community
  • p

    Pranit

    09/01/2022, 12:12 PM
    I had done the deployment, and the initial execution of jobs was successful. But no job was executed as per the hourly schedule. I started the work queue and it's giving me the error below. Can anybody help?
    ✅ 1
    r
    • 2
    • 7
  • o

    Oscar Björhn

    09/01/2022, 12:15 PM
    I'm having some issues embedding my flow in a docker image using 2.3.0. I've been looking at the source code for a while but I'm not sure why it's not being embedded. This is what I'm running: prefect deployment build orchestration/flows/test_curated.py:default -n "Test Curated (dev)" -q "transformation-dev" -t dev -t transformation --infra docker-container --override image=my:image -o test_curated-deployment.yaml
    ✅ 1
    d
    • 2
    • 3
  • s

    Slackbot

    09/01/2022, 12:32 PM
    This message was deleted.
  • d

    David Hlavaty

    09/01/2022, 12:33 PM
    Hi, great stuff in adding support for flows inside a Docker image. It's a much appreciated feature, thanks. Now a bit of feedback: my Docker images are based on
    python:3.9-slim
    and I then install all my dependencies including Prefect from a lock file. I prefer this over using the official Prefect images as it means there is a single source of truth for which version of Prefect I am using. I had my image "misconfigured" by setting working directory to
    /opt/prefect/flows
    . This then resulted in a cryptic message from
    shutil
    when running my flows as it tried to copy files from the flow source directory (
    /opt/prefect/flows
    by default) to a working directory (
    /opt/prefect/flows
    ) which are the same. It would be good if Prefect detected this and did not attempt to copy the files, or at least checked whether the working directory is the same and failed with a helpful error message.
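The check suggested above could look something like the following minimal sketch. It is not Prefect's implementation: `copy_flow_files` is a hypothetical helper name, and the code only illustrates comparing the resolved source and destination directories before copying.

```python
import shutil
from pathlib import Path


def copy_flow_files(source: str, destination: str) -> bool:
    """Copy source into destination unless both are the same directory.

    Hypothetical helper for illustration only.
    Returns True if files were copied, False if the copy was skipped.
    """
    src = Path(source).resolve()
    dst = Path(destination).resolve()
    if src == dst:
        # Source and working directory are identical, so there is
        # nothing to copy and no reason to call shutil at all.
        return False
    # dirs_exist_ok (Python 3.8+) allows copying into an existing directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return True
```

Resolving both paths first means that relative paths and symlinks pointing at the same directory are also treated as equal.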
    a
    • 2
    • 4
  • d

    Dennis Hinnenkamp

    09/01/2022, 1:53 PM
    Hi there, is it possible to see the logging messages from a task executed by a subflow in the Prefect UI? E.g. I use a flow to orchestrate two flows that trigger Airbyte connections in parallel, and after that it triggers one dbt flow. Currently I can't see the logging from the underlying tasks in my Prefect UI. Thanks in advance.
    ✅ 1
    j
    • 2
    • 2
  • v

    Venkat Ramakrishnan

    09/01/2022, 1:59 PM
    message has been deleted
    ✅ 1
    j
    • 2
    • 1
  • v

    Vlad Tudor

    09/01/2022, 2:09 PM
    Please help a person in despair. Hello, I'm trying to set up Prefect inside a
    docker-compose
    that runs on a remote VM. However, I cannot configure the correct URL for the
    graphql
    (when opening the UI, I get the error
    Couldn't connect to Prefect Server at <http://localhost:4200/graphql>
    ) I tried to configure it from the
    config.toml
    file with the URL of my machine:
    [server]
      [server.ui]
        graphql_url = "http://<<MACHINE_PUBLIC_IP>>:4200/graphql"
    but it still tries to access localhost. Any help will be thoroughly appreciated. It's been 12 hours...
    ✅ 1
    1️⃣ 1
    r
    j
    • 3
    • 31
  • p

    Pranit

    09/01/2022, 2:09 PM
    I got https://docs.prefect.io/api-ref/prefect/exceptions/#prefect.exceptions.ScriptError while running a flow through the Cloud UI. When I run the script directly as a simple Python flow it doesn't fail, but running it via a deployment gives a ScriptError. Prefect has documented the error but has not said what the probable solution might be.
    ✅ 1
    j
    • 2
    • 14
  • y

    Youssef Ben Farhat

    09/01/2022, 2:42 PM
    Hello guys, I have a few questions to ask: 1- How can I see the schematic, for example with Orion? 2- When I want to see the dashboards, how can I change the Orion IP address to myserveraddress/4200 instead of http://127.0.0.1:4200/? Thank you so much.
    ✅ 1
    a
    • 2
    • 1
  • a

    Alexander Kloumann

    09/01/2022, 2:53 PM
    Hi all, I have a general question about Prefect 1.0 vs 2.0. Is Prefect 1.0 not going to be supported much longer? I've been working with others on a project that hasn't been deployed yet and uses Prefect 1.0, and I'm wondering if we should not be spending much effort trying to implement it using 1.0. For instance, I was interested in using control flow "case" in 1.0 but am thinking this might conflict with the new case switcher in Python 3.10, and I don't see any use of "case" in 2.0. I realize this question is kind of vague, and any advice would be appreciated!
    ✅ 1
    a
    j
    • 3
    • 4
  • c

    Clint M

    09/01/2022, 2:53 PM
    Hi… I have an old PDF with the diff between Prefect Server and Prefect Cloud for v1.0. Is there an updated version for Prefect 2.0?
    ✅ 1
    a
    j
    • 3
    • 8
  • s

    Sam Garvis

    09/01/2022, 3:35 PM
    Idk if this is realistic, but I have my current Git branch shown in terminal. With 2.0 being so terminal based, having a little shortcut to put in .zshrc... that shows the current prefect profile you're using would be HUGE. Would likely reduce errors longterm when accidentally working in the wrong workspace.
    ✅ 1
    a
    j
    • 3
    • 5
  • b

    Blake Stefansen

    09/01/2022, 4:41 PM
    Hi everyone, what is the right format for passing parameters to my deployments using the Python library? I can't seem to get my parameters to show up in Orion. Version: 2.3.0
    ✅ 1
    r
    • 2
    • 5
  • a

    Adam Brusselback

    09/01/2022, 5:17 PM
    Hmm... in some cases prefect works fine with my domain name as the PREFECT_API_URL and in other cases it doesn't work...quite confused.
    ✅ 1
    j
    • 2
    • 6
  • v

    Venkat Ramakrishnan

    09/01/2022, 5:31 PM
    Does anyone know of a bug where even though the frequency of the schedule is once in an hour, multiple runs are created for that hour? The flows keep increasing every minute. I now have 5 runs for the same hour! Here is my deployment code:
    deployment = Deployment.build_from_flow(
        flow=print_pipeline_hour,
        name="Hourly Pipeline Deployment", version="1", tags=["Iris"],
        schedule={'rrule': 'FREQ=HOURLY;UNTIL=20220912T040000Z', 'timezone': "Asia/Kolkata"},
        work_queue_name="hour-work-queue",
    )
    ✅ 1
    j
    m
    • 3
    • 9
  • h

    Henning Holgersen

    09/01/2022, 5:52 PM
    Has anyone used Prefect together with Meltano? I am able to get it to work locally with an agent and a shell task, but I have not been able to build a docker image containing both the meltano project and the prefect agent - although Meltano has built-in functionality for containerisation.
    👀 1
    ✅ 1
    r
    • 2
    • 5
  • k

    Krishnan Chandra

    09/01/2022, 6:34 PM
    Hey folks! I’m wondering what would be a good pattern to set up long-running Prefect agents in Prefect 2.0. It looks like the
    DockerContainer
    and
    KubernetesJob
    create new containers/jobs per flow run, but what I’d ideally like to do is run a fleet of long-running workers that process from the queue. What’s the best infrastructure to choose in Prefect 2.0 to achieve this?
    m
    • 2
    • 6
  • k

    Kevin Grismore

    09/01/2022, 6:36 PM
    is there anything preventing the GitHub block from working with non-GitHub repos, like GitLab?
    ✅ 1
    n
    • 2
    • 2
  • p

    Parwez Noori

    09/01/2022, 6:43 PM
    Hi everyone! We are using Prefect 2.0 with Azure. Using the traditional df.to_sql is not fast for ingesting data into our Azure SQL DB. Timings with 50,000 rows: local Dask cluster with to_sql, ~6 min; local Dask cluster with to_sql and fast_executemany, ~4 min; Data Factory, ~15 sec. What would you do to increase ingestion speed? Would you recommend using a real Dask executor? Any help or ideas are appreciated!
    ✅ 1
    h
    r
    • 3
    • 8
  • k

    kiran

    09/01/2022, 9:56 PM
    Hi all. In Prefect 2, is the default task runner the
    Concurrent
    or
    Sequential
    ? In one section, the docs say the default is
    Concurrent
    but then in another section, they say “Make sure you use
    .submit()
    to run your task with a task runner. Calling the task directly, without
    .submit()
    , from within a flow will run the task sequentially instead of using a specified task runner.”
    which seems to imply that the default is actually
    Sequential
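The distinction the quoted docs draw between calling a task directly and calling `.submit()` resembles the one in Python's standard `concurrent.futures` module. The sketch below is only an analogy built on `ThreadPoolExecutor`, not Prefect code, and `double` is a made-up function: a direct call runs in the caller and returns the value, while `submit()` hands the work to the executor and returns a future.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def double(x: int) -> int:
    return x * 2


# Direct call: runs immediately in the calling thread and returns the value.
direct_result = double(21)

# submit(): the executor runs the call and hands back a future.
with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(double, 21)
    submitted_result = future.result()  # block until the value is ready
```

Read this way, a configured runner only comes into play for work that is explicitly submitted to it, which is one possible reading of the two doc passages rather than a confirmed answer.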
    ✅ 1
    k
    t
    • 3
    • 9
  • b

    Blake Hamm

    09/01/2022, 11:23 PM
    Prefect 2.3 is exciting for sure! The speed of development is wild and I'm trying to keep up! I'm wondering if
    KubernetesJob
    blocks can interact with (the new)
    DockerContainer
    block. Is there a way to deploy with
    --ib
    as a
    DockerContainer
    as well as use the
    KubernetesJob
    ? Or is there any roadmap to connect the two together (like how the
    DockerContainer
    block can access a
    DockerRegistry
    block)? I imagine this isn't feasible, but generally, I really like the ability to pass in a manifest file to a
    KubernetesJob
    block. Specifically, I'm using EKS on AWS Fargate and really like the ability to define the resources. On the other hand the new
    DockerContainer
    block seems really handy to manage environments for specific flows. Right now I have one image on ECR with all the dependencies and it's much heavier than it needs to be. From a CI/CD perspective it would be great to have an action creating
    DockerContainer
    blocks based on the individual flows and another action creating
    KubernetesJob
    blocks based on different manifest files. In an ideal world, I would love a way to deploy using a
    DockerContainer
    block as the image inside the
    KubernetesJob
    block. A current (hacky) solution could be to loop through the flow-level docker files, register them to ECR based on their flow name and create all the necessary
    KubernetesJob
    blocks for each individual flow. I could use some kind of "resource" tag to pick the necessary manifest file. This would create a distinct
    KubernetesJob
    block for every flow even though I might only be using 3 distinct manifest files (just lots of unique containers for each flow). This would also require an --sb (unlike the new standalone
    DockerContainer
    block).
    ✅ 1
    m
    • 2
    • 3
  • v

    Venkat Ramakrishnan

    09/02/2022, 4:27 AM
    I'm referring to https://docs.prefect.io/concepts/deployments/, where the description of blocks is not clear. There seem to be two types of blocks, 'storage block' and 'infrastructure block', but I am not sure. The documentation refers to 'block', but it is not clear about: 1. What is a storage block and what is it used for? 2. What is an infrastructure block and what is it used for? 3. When blocktype/blockname is specified, 'blockname' is referred to as 'local-file-system' while creating the block. But where is the block created? It is not shown in the document. 4. Is remote storage a substitute for -sb, or are they mutually exclusive? Totally confused. If someone can explain in detail, it would be helpful. The docs are not helping.
    ✅ 1
    t
    • 2
    • 6
  • a

    Andreas Nord

    09/02/2022, 10:04 AM
    Hi! I followed the tutorial on https://github.com/anna-geller/prefect-docker-deployment. One step of the docker file is to copy the flows:
    ADD flows /opt/prefect/flows
    Which seems to be consistent with the deployment
    storage: null
    path: /opt/prefect/flows
    entrypoint: flows\healthcheck.py:healthcheck
    It seems that the deployment worked, but I can't find this path (/opt/prefect/flows) locally. I'm on Windows
    a
    • 2
    • 6
  • k

    Klemen Strojan

    09/02/2022, 12:04 PM
    Hey all - there has been a lot of discussion and a few great resources for deploying Prefect 2.0 on AKS. https://discourse.prefect.io/t/how-to-deploy-a-prefect-2-0-agent-to-an-azure-kubernetes-cluster-and-connect-to-azure-blob-storage/1128 The tutorial mentions another tutorial that will describe connecting a Prefect agent on AKS with a Prefect Cloud instance. Is this already available?
    ✅ 1
    r
    p
    • 3
    • 3
  • r

    Ross Teach

    09/02/2022, 1:21 PM
    Using Prefect V2 Cloud, noticed a few of the following errors (8AM EST + 9AM EST). Seems to be intermittent. Is there a known issue?
    prefect.exceptions.PrefectHTTPStatusError: Server error '500 Internal Server Error' for url '<https://api.prefect.cloud/api/accounts/043b2649-9d07-4c5e-8225-521ba2275e68/workspaces/689b139b-a725-4c2b-b167-86a705b8789d/task_runs/>'
    Response: {'exception_message': 'Internal Server Error'}
    For more information check: <https://httpstatuses.com/500>
    ✅ 1
    j
    k
    • 3
    • 14
  • j

    José Duarte

    09/02/2022, 1:47 PM
    Hey yall, is there a complete guide on how to deploy Prefect open source or not really?
    👀 1
    ✅ 1
    r
    • 2
    • 2
  • s

    Seth Goodman

    09/02/2022, 2:25 PM
    Hi All - I am migrating from Prefect 1.0 to 2.0 and am specifically dealing with a mapped task parallelized using Dask. From the 2.0 docs it sounds like you need to call Task.submit() when using a task runner like Dask, but I am unclear how that is applied when using a map. Any guidance or an example would be appreciated. Thanks!
    ✅ 1
    k
    r
    • 3
    • 27
  • k

    kwmiebach

    09/02/2022, 2:46 PM
    Hello 🙂 I believe I ran into a threading problem yesterday. I am converting some data extraction pipelines to prefect 2 flows, which works fine for most of them. I just add the decorators and some logging. But one of the data pipelines uses sqlite for intermediate storage, and this is the error message I receive:
    sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 140193261803264 and this is thread id 140192735962880.
    I can also paste the part of the code where sqlite is called. I am also trying to guess the reason behind the error. There is an sqlite object created in a function. Within the function I define another function which uses this object. But neither Python function is a Prefect flow or task. They live within a bigger Prefect flow. So here comes my first question: does Prefect 2 create a different thread for each nested function inside a flow or a task? Otherwise I cannot explain why the 2 parts of the code would run in different threads.
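The error itself can be reproduced without Prefect, because a `sqlite3` connection opened with default settings refuses to be used from any thread other than the one that created it. The following is a minimal sketch of that behaviour and of the `check_same_thread=False` workaround. It says nothing about how Prefect assigns threads, and `use_in_other_thread` is a hypothetical helper written for this illustration.

```python
import sqlite3
import threading


def use_in_other_thread(conn: sqlite3.Connection) -> list:
    """Run a query on conn from a new thread and return any errors raised."""
    errors = []

    def worker():
        try:
            conn.execute("SELECT 1").fetchall()
        except sqlite3.ProgrammingError as exc:
            errors.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return errors


# Default connection: bound to the thread that created it.
strict = sqlite3.connect(":memory:")
strict_errors = use_in_other_thread(strict)
strict.close()

# check_same_thread=False lifts the check; the caller is then
# responsible for serializing access (for example with a lock).
shared = sqlite3.connect(":memory:", check_same_thread=False)
shared_errors = use_in_other_thread(shared)
shared.close()
```

Another option is to open the connection inside the function that actually uses it, so that creation and use always happen in the same thread.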
    ✅ 1
    k
    r
    • 3
    • 27
  • s

    Sam Garvis

    09/02/2022, 2:48 PM
    I'm on Prefect 2.3.0
    prefect deployment build flowname.flowname -n flowname_dev --work-queue=dev-wq-1 -ib kubernetes-job/dev-k8s-job -sb gcs/dev --override image_pull_policy=Always
    When I run this, the YAML created has this. Why does it still include the manifest.json? I thought that was deprecated
    ### DO NOT EDIT BELOW THIS LINE
    ###
    flow_name: Colorado Watcher
    manifest_path: colorado_watcher-manifest.json
    ✅ 1
    k
    a
    • 3
    • 11
  • j

    Josh Paulin

    09/02/2022, 4:55 PM
    I’m trying to understand why my flow keeps crashing partway through, at a seemingly consistent spot. I have a parent flow that runs the first subflow fine, but halfway through the second subflow the parent fails with the exception posted in the thread. The UI still shows the second subflow as running. This is on EKS by the way. Locally things look to run until completion.
    ✅ 1
    m
    • 2
    • 17
j

Josh Paulin

09/02/2022, 4:55 PM
I’m trying to understand why my flow keeps crashing partway through, at a seemingly consistent spot. I have a parent flow that runs the first subflow fine, but halfway through the second subflow the parent fails with the exception posted in the thread. The UI still shows the second subflow as running. This is on EKS by the way. Locally things look to run until completion.
✅ 1
Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 587, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 56, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "activity_processor/flow/parent.py", line 43, in api_activity_processor
    process_hourly_api(normalized_process_time, dry_run, validate_results)
  File "/usr/local/lib/python3.9/site-packages/prefect/flows.py", line 384, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 160, in enter_flow_run_engine_from_flow_call
    return run_async_from_worker_thread(begin_run)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 136, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.9/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/prefect/client.py", line 104, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 435, in create_and_begin_subflow_run
    flow_run.state.data._cache_data(await _retrieve_result(flow_run.state))
  File "/usr/local/lib/python3.9/site-packages/prefect/results.py", line 38, in _retrieve_result
    serialized_result = await _retrieve_serialized_result(state.data)
  File "/usr/local/lib/python3.9/site-packages/prefect/client.py", line 104, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/results.py", line 34, in _retrieve_serialized_result
    return await filesystem.read_path(result.key)
  File "/usr/local/lib/python3.9/site-packages/prefect/filesystems.py", line 149, in read_path
    raise ValueError(f"Path {path} does not exist.")
ValueError: Path /root/.prefect/storage/4762715b441f4b3c8011b92dc1d5361f does not exist.
m

Michael Adkins

09/02/2022, 4:57 PM
It looks like your subflow is attempting to retrieve its state from a prior run instead of creating a new run
At
create_and_begin_subflow_run
we can see that it attempts to do
flow_run.state.data._cache_data(await _retrieve_result(flow_run.state))
— this call is rehydrating the result on the state
That means that we’re in this case: https://github.com/PrefectHQ/prefect/blob/main/src/prefect/engine.py#L430-L442
TL;DR: it looks like the second subflow thinks it is the first one. We’ll need a minimal example to fix this.
j

Josh Paulin

09/02/2022, 6:26 PM
I think I see what might be causing some of this. Looks like the pod runs out of memory.
Is there any way to understand the memory profile of a flow? It jumps as high as 60 GB for something that should be pretty benign. 😕
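One general-purpose way to get a rough picture is Python's built-in `tracemalloc` module. The sketch below is not a Prefect feature, and `measure_peak` and `build_payload` are hypothetical names used only for illustration. It records the peak of Python-level allocations during a single call, which will not match the pod's total memory usage but can help locate the heaviest steps.

```python
import tracemalloc


def measure_peak(func, *args, **kwargs):
    """Return (result, peak_bytes) for one call of func."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_payload(n_bytes: int) -> int:
    # Stand-in for a task body that allocates a large buffer.
    buffer = bytearray(n_bytes)
    return len(buffer)


size, peak = measure_peak(build_payload, 10_000_000)
print(f"peak traced memory: {peak / 1e6:.1f} MB")
```

Wrapping individual task functions this way while running locally is one way to see which of them accounts for most of the growth; memory allocated outside Python's allocator is not counted.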
m

Michael Adkins

09/02/2022, 8:55 PM
Is it possible that all of your tasks together are consuming that much memory? We do not release memory eagerly yet
j

Josh Paulin

09/02/2022, 9:02 PM
As in when a task finishes the memory isn’t released until the flow completes?
m

Michael Adkins

09/02/2022, 9:10 PM
For tasks that are submitted as futures, no. We can’t know if you’ll need the result of the future downstream.
We might also hold onto the data for normal task calls too, as we track the state of all task runs created in a flow. With upcoming work on result handling, we can optimize that though.
j

Josh Paulin

09/02/2022, 9:51 PM
Things get better if I don’t submit any tasks as futures, but we’re still getting up to ~37GB. I also don’t see any real drop in memory between when each subflow finishes (blue lines are where a new subflow starts).
m

Michael Adkins

09/02/2022, 9:54 PM
Are you returning a value from your subflow? If not, we default to a state that bundles all the task states from within the subflow
j

Josh Paulin

09/02/2022, 9:56 PM
No returns, so sounds like I should?
m

Michael Adkins

09/02/2022, 9:57 PM
Yeah that should help since it'll release all those tasks on completion.
j

Josh Paulin

09/02/2022, 10:15 PM
That definitely works to clear out some resources between the subflows. The biggest hog definitely looks like the futures. Is there any upcoming work to optimize that?
m

Michael Adkins

09/02/2022, 10:59 PM
Yeah there is, I'll be working on result handling over the next month.
💚 1