prefect-community #prefect-community

I’m trying to sort out how to run prefect agents as background processes. Running an agent alone interactively works well (connects with work queue perfectly). But then I tried to run as a background process using supervisor via systemctl, and the agent is running, but it’s not connected to the work queue previously created - the work queue in the app.prefect.cloud UI is in an unhealthy state. Any troubleshooting tips?

✅ 1

Ilya Galperin

12/07/2022, 8:19 PM

Is there an easy to way to order or otherwise somehow identify tasks based on runtime i.e. if we want to further evaluate long-running tasks in a flow? The only sorting mechanism in the UI is by age it looks like.

Zachary Loertscher

12/07/2022, 9:17 PM

Hi all, I'm running into an issue where Prefect is not detecting new packages I add to the requirements.txt (specifically the prefect

sql_server

extra and

pyodbc

package). We are running Prefect 1.2.0. Deployment is successful, and the flow is successfully registered (the imports in my

.py

script run successfully), but Prefect Cloud can't find the package:

Failed to load and execute flow run: ImportError('Using prefect.tasks.sql_server requires Prefect to be installed with the "sql_server" extra.')

Is there a way to re-sync prefect-cloud with my docker container? Just seems like Prefect cloud isn't finding the packages I have installed on my container

Michael Cody

12/07/2022, 10:02 PM

I'm trying to update the logging formatter datefmt: I tried

prefect config set PREFECT_LOGGING_FORMATTERS_SIMPLE_DATEFMT="%Y-%m-%d %H:%M:%S"

which return

"Unknown setting name 'PREFECT_LOGGING_FORMATTERS_SIMPLE_DATEFMT'"

. If I try

prefect config set PREFECT_LOGGING_HANDLERS_CONSOLE_FORMATTER=json

from this this message from a month ago, I get the same error. This should be simple but I think I'm missing something. Editing the logging config works, but I'd rather have it in the profile instead of copying the logging.yml file. Thanks. https://prefect-community.slack.com/archives/CL09KU1K7/p1666798948425739?thread_ts=1666794116.134359&cid=CL09KU1K7

Prefect 2.0 Architecture Overview

Paige Gulley

12/07/2022, 10:39 PM

Hi! I'm in the process of trying to make an argument for prefect to a team- I'm wondering it the prefect developers have published an 'architecture overview' or anything, to help assess the technical costs of the tool vs other similar products like airflow

Mike Grabbe

12/07/2022, 11:31 PM

I've noticed an odd logging behavior in Prefect 2 where sometimes strings that contain brackets

aren't being logged correctly. Has anyone else noticed this? Details in 🧵

✅ 1

wonsun

12/08/2022, 4:51 AM

Hi all~ I'm trying to migrate to prefect2.0. But i don't know how connect the local to cloud. Following the description for how to start the prefect 2.0 from mail sended by Prefect Developer Experience, they said Create the API Key for login after creating workspace. But i couldn't see the API Keys menu at my UI. Is it because I'm alone in the workspace? Actually, What I really need is "*Configure a local execution environment to use Prefect Cloud.*" When i tried to start the orion server without API keys, IP address and default port are already in use. (attached the capture at this thead.)

radar 1

✅ 1

Mahesh

12/08/2022, 6:22 AM

Hi Team, How to get flow logs using command line in prefect2? In prefect1 we can use

prefect get

command to get logs.

✅ 1

Olivér Atanaszov

12/08/2022, 11:27 AM

Hi, what is the recommended way of adding a handle for flow run interruption in Prefect 1.0 (similarly to

on_handle

argument of

Flow

✅ 1

1️⃣ 1

Sunjay

12/08/2022, 2:17 PM

Hi Team, I created a custom block with some functionality an registered the block on the UI. I made a few changes to the block and registered this block again but I am still seeing old functionality. So is there a way to delete an existing block from the workspace or to overwrite an existing block.(custom block)

✅ 2

Kelvin DeCosta

12/08/2022, 2:28 PM

Hey everyone! Was wondering if there's a

flake8

(or any other linter) plugin to check for

prefect

specific code issues. The main issue I find myself making is calling a

task

from another one. I don't think a static type checker can detect this type of issue. There are probably more issues that are waiting to be discovered and I think it would benefit the community if there was a linting tool that accounted for

prefect

👀 2

Nathan R

12/08/2022, 2:37 PM

Hi All, can you pass deployment parameters to the create flow run API when you transition it to scheduled? http//<IP>4200/api/deployments/<deployment_id>create_flow_run body -> {"name":"curl","state":{"type":"SCHEDULED"}}

Babak

12/08/2022, 5:20 PM

Hi team, I had some deployment, Flow in my Prefect Cloud. I restart my laptop and they all gone. do you have any idea?

Ashley Felber

12/08/2022, 5:25 PM

Hello, I have created a deployment using an ECStask block. The ECStask block is configured to run an existing task definition. When I try to run the deployment I get the following error: Submission failed. KeyError: "No class found for dispatch key 'ecs-task' in registry for type 'Block'." This error shows up under details of the flow run but there are no other logs.

What is an ECS Task in Prefect

Chris Gunderson

12/08/2022, 5:51 PM

Hi Team - Is the ECS Task basically the agent listening to the work queue?

Kyle McChesney

12/08/2022, 5:59 PM

Is there any more in-depth documentation for the Lazarus retry in v1 besides: https://docs-v1.prefect.io/orchestration/concepts/services.html#lazarus. I am seeing some weird behavior where I have a flow run that got restarted, and according to the logs the entire thing re-ran, but according to the flow run visual display, it tried to pick up where it left off. It caused a number of downstream tasks to fail, instead of receiving their inputs from upstream tasks which ran before the restart, they seemingly got passed

None

Slackbot

12/08/2022, 6:22 PM

This message was deleted.

merlin

12/08/2022, 6:22 PM

Prefect profile: changing profile in one virtual environment is reflected in other virtual environments on the same machine. I'm confused about using profiles. Details in thread.

✅ 1

Sean Conroy

12/08/2022, 7:44 PM

Seeing the following

run_migrations

error using Prefect 2.7.0...full traceback in the reply. Anyone familiar with this?

Shruti Hande

12/09/2022, 9:13 AM

I am trying to executed concurrent tasks of flow from multiple agents with same work queue, but only single agent is picking the tasks. Is there a way to run these tasks on multiple agent with same queue? Using prefect - 2.6.4 #prefect-community #prefect-contributors

Clovis

12/09/2022, 10:11 AM

Hi everyone ! Is it possible to save/update only part of a

JSON

block ? I did see nothing like that in documentation. To give some context, I got multiple concurrent tasks depending on a common block and actualizing only different part of it. As I don’t want to lose data and because of the concurrency, I want to avoid loading the

JSON

block in each tasks, saving it just after with my updates and risking to overwrite it with missing value. So my question here is can I save only part of the

JSON

block ? Or maybe there is another solution (like lock or something) to prevent block data loss when it comes to async parallel task ?

Vadym Dytyniak

12/09/2022, 10:34 AM

Hi. After upgrading agent to 2.7.1 getting the error provided in thread.

Thomas Fredriksen

12/09/2022, 1:14 PM

Hello everyone. I am experimenting a bit with the dask taskrunner, and I am running into some issues related to priority. I have the following test-flow:

Copy code

from typing import List, Tuple

import dask
from prefect import flow, get_run_logger, task
from prefect.context import get_run_context
from prefect_dask import get_dask_client


def is_prime(number: int) -> Tuple[int, bool]:
    if number == 2 or number == 3:
        return number, True
    if number % 2 == 0 or number < 2:
        return number, False

    for i in range(3, int(number ** 0.5) + 1, 2):
        if number % i == 0:
            return number, False

    return number, True


@task
def get_primes_from_split(min_number, max_number) -> List[int]:

    if min_number % 2 == 0:
        min_number += 1

    with get_dask_client() as client:
        futures = [client.submit(is_prime, n) for n in range(min_number, max_number, 2)]

        maybe_primes = [future.result() for future in futures]

    return [value for value, flag in maybe_primes if flag]


@flow(name="example_prime_number_search")
def main(max_number: int = 1_000_000, split_size=10_000):
    log = get_run_logger()
    context = get_run_context()

    <http://log.info|log.info>("Task Runner: %s", context.task_runner.name)
    <http://log.info|log.info>("Searching for primes from up to %d", max_number)

    futures = [get_primes_from_split.submit(x, x + split_size) for x in range(0, max_number + 1, split_size)]
    primes = [value for future in futures for value in future.result()]

    if len(primes) > 10:
        <http://log.info|log.info>("Found %d primes: %s, ...", sorted(primes)[::-1][:10])
    else:
        <http://log.info|log.info>("Found %d primes: %s", sorted(primes)[::-1][:10]

When running this with the

DaskTaskrunner

, the task

get_primes_from_split

is scheduled first, then the dask-future

is_prime

. Since

get_primes_from_split

is scheduled first it gets higher priority, which causes the dask-execution to lock up, as it is waiting for the task to complete before executing anything else.

get_primes_from_split

naturally is waiting for

is_prime

to complete, which unfortunately will not execute at this point. Toying around with priorities, I managed to get

is_prime

to execute:

Copy code

from typing import List, Tuple

import dask
from prefect import flow, get_run_logger, task
from prefect.context import get_run_context
from prefect_dask import get_dask_client


def is_prime(number: int) -> Tuple[int, bool]:
    if number == 2 or number == 3:
        return number, True
    if number % 2 == 0 or number < 2:
        return number, False

    for i in range(3, int(number ** 0.5) + 1, 2):
        if number % i == 0:
            return number, False

    return number, True


@task
def get_primes_from_split(min_number, max_number) -> List[int]:

    if min_number % 2 == 0:
        min_number += 1

    with get_dask_client() as client:
        futures = [client.submit(is_prime, n, priority=100) for n in range(min_number, max_number, 2)]

        maybe_primes = [future.result() for future in futures]

    return [value for value, flag in maybe_primes if flag]


@flow(name="example_prime_number_search")
def main(max_number: int = 1_000_000, split_size=10_000):
    log = get_run_logger()
    context = get_run_context()

    <http://log.info|log.info>("Task Runner: %s", context.task_runner.name)
    <http://log.info|log.info>("Searching for primes from up to %d", max_number)

    with dask.annotate(priority=0):
        futures = [get_primes_from_split.submit(x, x + split_size) for x in range(0, max_number + 1, split_size)]
        primes = [value for future in futures for value in future.result()]

    if len(primes) > 10:
        <http://log.info|log.info>("Found %d primes: %s, ...", sorted(primes)[::-1][:10])
    else:
        <http://log.info|log.info>("Found %d primes: %s", sorted(primes)[::-1][:10])

This causes dask to schedule a few instances of

get_primes_from_split

, which in turn schedules all its instances of

is_prime

is_prime

executes properly and starts returning its results, but it doesn't seem like

get_primes_from_split

picks up execution. I really don't understand what is going on here. Can anyone provide some insight into how do this kind of execution without reaching a deadlock like above?

✅ 1

Jelle Vegter

12/09/2022, 2:42 PM

Hi all, I’m looking how to migrate over to Prefect 2.0 and am designing my deployment pattern. Is there an equivalent to the extract_flow_from_file function in Prefect 2.0?

Rio McMahon

12/09/2022, 2:44 PM

Within prefect 1.0 is it possible to have multiple slack webhook URLs for different slack channels? e.g. we want to delineate between data engineering and data science prefect alerts (different alerting channels for each). Looking at the docs: https://docs-v1.prefect.io/core/advanced_tutorials/slack-notifications.html#installation-instructions it seems like there is only one secret per account called

"SLACK_WEBHOOK_URL"

✅ 1

Kendall Bailey

12/09/2022, 4:00 PM

Question regarding resource management in Prefect 2.x when all tasks are async (i.e. “submitted”) and some tasks are optional, in 🧵

Chris Gunderson

12/09/2022, 4:40 PM

Hi all - Not sure why this error is occurring. I've have included s3fs, gcsfs, and adlfs in the Poetry toml file used to create the Docker image.

Denis Sh

12/09/2022, 5:24 PM

Hi all, how do you guys check if Block exists inside a Flow? I tried wrapping with

Copy code

storage_name = f"{self.account.name}-session"
try:
    self.session = JSON.load(storage_name)
except ValueError as e:
    self.logger.error(e)
    self.session = JSON(value={})
    self.session.save(name=storage_name)
    <http://self.logger.info|self.logger.info>(f"created session storage ({storage_name=})")

but flow keeps failing on this exception.. How to gracefully handle it? ADDED: seems the problem was capital letters in block name! otherwise code works as intended. fixed by modifying type in model definition for account.name to constr(to_lower=True)

✅ 1

Edmund Tian

12/09/2022, 5:58 PM

Could someone help me understand if a Prefect Flow supports my use case? I’m building something similar to Plaid. I’m planning on using a Prefect Flow to orchestrate my per-user data pipeline. Here’s a simplified version of what it looks like: 1. Extract the user’s bank transactions. This involves logging into their bank with their submitted credentials, scraping their bank transactions, and saving this raw data to bucket storage. 2. Transforming the bank transactions. This involves downloading the raw data from bucket storage, transforming the data into our standardized format, and saving this data to a postgres table. This Flow must be run on a per-user basis. There are two instances in which this Flow is ran: 1. When a user is first created 2. During a daily job to refresh all user’s data Thus, if I have 1M users, then I would need something that supports 1M concurrent Flow Runs. Is this achievable with Prefect?

Joshua Grant

12/09/2022, 6:04 PM

I'm having trouble registering my flows with S3. I saw on the prefect discourse that there is an undocumented

upload_options

argument to the

S3

block, which is great because I need to specify

ServerSideEncryption

, however, I'm receiving

AccessDenied

Copy code

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

✅ 1