https://prefect.io

ciaran

about 4 years ago
Hey folks! Does anyone have an example repo they could share that's handling the deployment of a Prefect Agent with AKS? Currently starting off with spinning up a cluster with Terraform, but my k8s skills are sub-par, so some useful starters would be handy. For reference, I've CDK'd up a Prefect Agent and respective cluster in ECS on AWS before, but that didn't involve k8s.

Sergei

7 months ago
@Marvin how do I disable the task caching system?
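
A minimal sketch of one way to do this, assuming Prefect 3.x, where a NO_CACHE policy is available (in Prefect 2.x, tasks only cache when a cache_key_fn is set, and refresh_cache=True on the task decorator forces recomputation):

from prefect import flow, task
from prefect.cache_policies import NO_CACHE

@task(cache_policy=NO_CACHE)
def always_recompute(x: int) -> int:
    # with NO_CACHE this task never writes or reads a cache entry
    return x * 2

@flow
def my_flow() -> int:
    return always_recompute(21)

Setting the policy per task keeps caching available elsewhere; check the release notes for your Prefect version, since the caching API changed between 2.x and 3.x.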

William Jamir

over 1 year ago
Morning! I have an issue with the latest Prefect release (2.16.0); version 2.15.0 works fine. I have this minimal working async code:
import asyncio

from prefect import flow
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

# get_db_url() and Experiment are application code: a helper returning the
# database URL and an ORM model, respectively

@flow
async def myflow(experiment_id: int):
    # asyncpg statement caches disabled via connect args
    engine = create_async_engine(
        url=get_db_url(),
        connect_args={
            'statement_cache_size': 0,
            'prepared_statement_cache_size': 0,
        },
    )

    async with AsyncSession(engine) as session:
        await session.get(Experiment, experiment_id)

if __name__ == '__main__':
    asyncio.run(myflow(189))
But if I use a task, I get the following error:
RuntimeError: Task <Task pending name='Task-27' coro=<AsyncSession.close() running at /Users/william/.pyenv/versions/myenv/lib/python3.11/site-packages/sqlalchemy/ext/asyncio/session.py:1016> cb=[shield.<locals>._inner_done_callback() at /Users/william/.pyenv/versions/3.11.5/lib/python3.11/asyncio/tasks.py:881]> got Future <Future pending cb=[Protocol._on_waiter_completed()]> attached to a different loop
Full stack trace in the thread. Should I open a GitHub ticket?
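
One hedged workaround (a sketch, not a confirmed fix for the 2.16.0 regression): asyncpg connections are bound to the event loop they were created on, and tasks may execute on a different loop than the flow, so constructing the engine inside the task keeps the connection and its consumers on a single loop:

from prefect import flow, task
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

@task
async def load_experiment(experiment_id: int):
    # create the engine inside the task so the asyncpg connection is
    # created on the same event loop the task actually runs on
    engine = create_async_engine(
        url=get_db_url(),
        connect_args={
            'statement_cache_size': 0,
            'prepared_statement_cache_size': 0,
        },
    )
    async with AsyncSession(engine) as session:
        return await session.get(Experiment, experiment_id)

@flow
async def myflow(experiment_id: int):
    await load_experiment(experiment_id)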

Patrick Barker

almost 2 years ago
Hello friends, is it possible to pause a Prefect workflow, have it not consume resources, and then turn it back on later?
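
A minimal sketch of the two options, assuming Prefect 2.x (import paths differ in 3.x, where these live in prefect.flow_runs): pause_flow_run blocks in place, so the worker or pod keeps running, while suspend_flow_run exits the process and frees its infrastructure until the run is resumed from the UI or API:

from prefect import flow, suspend_flow_run

@flow
async def my_flow():
    first_half()  # hypothetical; make this idempotent or cache its tasks,
                  # since a resumed run re-executes the flow from the top
    # stops the run and releases its infrastructure; resuming reschedules it
    await suspend_flow_run()
    second_half()  # hypothetical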

Egil Bugge

about 3 years ago
Hey all! I've been playing around with setting up a Kubernetes agent in Google Kubernetes Engine which can spin up an ephemeral Dask cluster on demand. This all seems to work rather smoothly (thanks to the amazing work done by the Prefect team and others), but I'm having some issues getting the autoscaler to remove the nodes after the flow has run. I get the following error messages on my Kubernetes cluster after my flow has run:

"Pod is blocking scale down because it's not backed by a controller"
"Pod is blocking scale down because it doesn't have enough Pod Disruption Budget (PDB)"

I'm pretty inexperienced with Kubernetes, so I was wondering if anyone has any pointers on how I might configure the KubeCluster so that it works with autoscaling? We're thinking of using the cluster to hyperparameter-tune a model. We don't use Kubernetes for anything else and have no need for the resources between training runs, so getting the node pool to autoscale down to zero (the agent will stay in a different node pool) would save us some money. My run code is below:
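
A hedged sketch of one commonly suggested fix, assuming the classic dask_kubernetes KubeCluster API: the GKE cluster autoscaler refuses to evict bare pods (ones not backed by a controller) unless they carry the safe-to-evict annotation, so adding it to the worker pod template lets the node pool scale back down:

from dask_kubernetes import KubeCluster

pod_template = {
    "kind": "Pod",
    "metadata": {
        # tell the cluster autoscaler it may evict these bare worker pods
        "annotations": {"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"}
    },
    "spec": {
        "restartPolicy": "Never",
        "containers": [
            {
                "name": "dask-worker",
                # placeholder image; in practice match your flow's image
                "image": "daskdev/dask:latest",
                "args": ["dask-worker", "--nthreads", "1", "--memory-limit", "4GB"],
            }
        ],
    },
}

cluster = KubeCluster(pod_template=pod_template)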

Stephen Lloyd

about 2 years ago
When attempting to use DbtCoreOperation we are receiving the following error... This suggests that our dbt CLI profile block, which we have defined in dbt Cloud, does not have a name, target, or target_configs, but it does.
pydantic.error_wrappers.ValidationError: 3 validation errors for DbtCoreOperation
dbt_cli_profile -> name
  field required (type=value_error.missing)
dbt_cli_profile -> target
  field required (type=value_error.missing)
dbt_cli_profile -> target_configs
  field required (type=value_error.missing)
I was able to use the profile_dir parameter to use my local credentials, but this only verifies there aren't obvious problems with the flow code.
prefect-dbt==0.3.1
prefect-snowflake==0.26.0
Any ideas?
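
A hedged guess, sketched below: this validation error is what pydantic raises when dbt_cli_profile receives something other than a fully loaded DbtCliProfile block (for instance a block slug or an unresolved reference). Loading the block explicitly before constructing the operation may help; the block name "dbt-cli-profile" is a placeholder:

from prefect_dbt.cli import DbtCliProfile, DbtCoreOperation

# load the saved profile block by name; "dbt-cli-profile" is a placeholder
dbt_cli_profile = DbtCliProfile.load("dbt-cli-profile")

DbtCoreOperation(
    commands=["dbt run"],
    dbt_cli_profile=dbt_cli_profile,
    overwrite_profiles=True,  # regenerate profiles.yml from the block at runtime
).run()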

Ofir

about 2 years ago
What's the best practice for a data retention policy on Prefect deployment runs? Just as a reference, here is how it is implemented for Apache Airflow, as yet another garbage-collector DAG: https://stackoverflow.com/questions/66580751/configure-logging-retention-policy-for-apache-airflow I'm sure that Prefect has either a built-in mechanism for this or encourages a common idiom for rotating/archiving/deleting artifacts from old runs.

Context: we have persistent storage on Azure Blob Storage (the S3 equivalent) where we store artifacts (e.g. output files and images) from a machine learning (Kedro) run. The space can pile up pretty quickly across runs and we would run out of storage, rendering our Prefect deployments inoperable. What kind of policies are recommended for evicting data from old runs? I don't want to run out of space, and I want the Prefect pipelines to remain operational.

I know that some of you would say "it depends", so for the sake of this example let's imagine that I have a dedicated 256 GB of storage. Should I set a threshold (e.g. 70% full) that acts as a trigger for evicting (removing) artifacts from old runs? Also, when should this run: as the first (prerequisite) subflow in my bigger flow, or as yet another deployment in Prefect on a recurring schedule? Thanks!
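
For the Azure Blob side, a minimal sketch of the recurring-deployment approach, assuming the azure-storage-blob SDK; the container name, retention window, and get_connection_string() helper are placeholders:

from datetime import datetime, timedelta, timezone

from azure.storage.blob import ContainerClient
from prefect import flow

@flow
def evict_old_artifacts(retention_days: int = 30):
    container = ContainerClient.from_connection_string(
        conn_str=get_connection_string(),  # hypothetical secret lookup
        container_name="ml-artifacts",     # placeholder container
    )
    cutoff = datetime.now(timezone.utc) - timedelta(days=retention_days)
    # delete every blob whose last modification predates the cutoff
    for blob in container.list_blobs():
        if blob.last_modified < cutoff:
            container.delete_blob(blob.name)

Running this as its own scheduled deployment keeps retention decoupled from the training flow, so a slow cleanup never delays a run. Azure Blob Storage also offers built-in lifecycle management rules that can delete blobs by age without any code.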

Matias

over 5 years ago
I'm wondering if and how Prefect could be used to transfer large amounts of data between different servers/clouds. Basically, I'd need to move 10-100 gigabyte CSV/JSON files from an SFTP server to ADLS, and later on between other sources and sinks. Moving this amount of data as one gigantic in-memory string between tasks does not seem like a very sound approach, for many reasons. So how would you actually do that?
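
A hedged sketch of the streaming approach, assuming paramiko for SFTP and azure-storage-file-datalake for ADLS Gen2 (host, credentials, and paths are all placeholders): the source file is opened as a stream and the SDK uploads it in chunks, so no task ever holds the whole file in memory:

import paramiko
from azure.storage.filedatalake import DataLakeFileClient

def sftp_to_adls(sftp_host, sftp_user, sftp_password, remote_path,
                 account_url, file_system, dest_path, credential):
    # open the source file as a stream over SFTP
    transport = paramiko.Transport((sftp_host, 22))
    transport.connect(username=sftp_user, password=sftp_password)
    sftp = paramiko.SFTPClient.from_transport(transport)

    file_client = DataLakeFileClient(
        account_url=account_url,
        file_system_name=file_system,
        file_path=dest_path,
        credential=credential,
    )

    # sftp.open returns a file-like object; upload_data reads from it in
    # chunks, so the transfer uses a bounded buffer, not one giant string
    with sftp.open(remote_path, "rb") as src:
        file_client.upload_data(src, overwrite=True)

    transport.close()

Wrapped in a @task, a function like this passes only paths and credentials between tasks, with the streaming confined to the task's own process.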

Mark NS

11 months ago
Hi, all of our flow runs have been failing for the last 12 hours with the error
Submission failed. RuntimeError: Cannot put items in a stopped service instance.
Does anyone know what might be causing this? cc @Marvin

Nicolas Ouporov

over 1 year ago
Hello Prefect community, how can I retry a task on specific error codes (500) with the same inputs, but using conditional logic based on the number of retries? So if the previous error code was 500 and the current retry count is 2, follow this particular path; otherwise follow the standard path. Our OCR pipeline has a long-running task that often returns 500 if the file cannot be read. When we receive this error, we want to do postprocessing on the PDF (converting it to an image) and rerun the task.
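
A hedged sketch of one way to wire this up, assuming Prefect 2.x's retry_condition_fn task option; Http500Error, convert_pdf_to_image, and call_ocr_service are hypothetical stand-ins for the real OCR client:

from prefect import task
from prefect.context import get_run_context

class Http500Error(Exception):
    """Placeholder for whatever the OCR client raises on a 500."""

def retry_on_500(task, task_run, state) -> bool:
    # retry only when the failure was a 500 from the OCR service
    try:
        state.result()
    except Http500Error:
        return True
    except Exception:
        return False
    return False

@task(retries=3, retry_delay_seconds=10, retry_condition_fn=retry_on_500)
def run_ocr(pdf_path: str) -> str:
    # run_count is 1 on the first attempt and increments on each retry
    if get_run_context().task_run.run_count >= 2:
        # after a 500, fall back to rasterizing the PDF before OCR
        pdf_path = convert_pdf_to_image(pdf_path)
    return call_ocr_service(pdf_path)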

Prefect Community

Bring your towel and join one of the fastest growing data communities. Welcome to our second-generation open source orchestration platform, a completely rethought approach to dataflow automation.