Jan Rouš
01/14/2021, 3:27 AM
LocalExecutor
the implicit in-memory caching results in the ETL eventually getting oom-killed 😕 I'm using checkpointing, and some of the tasks explicitly write their results to disk where subsequent stages pick them up, so the in-memory caching is not strictly necessary. I have not found a way to tell Prefect not to bother with this, or at least not to eat all available memory while doing so. Any pointers on how I could solve this problem would be really helpful!
Sasha Mikhailov
01/14/2021, 10:12 AM
import prefect
from prefect import Flow, Parameter, task

@task(log_stdout=True)
def print_context():
    print('print context from task:', prefect.context.get('db_env'))

DB_ENV = Parameter("database to connect to", default='local')
prefect.context['db_env'] = DB_ENV  # ← doesn't work in cloud

with Flow('testing_context') as flow:
    print_context(upstream_tasks=[DB_ENV])

flow.run()
Then I register the flow with Docker storage and push it to the cloud.
It doesn't work when I run the flow via the Cloud UI: it shows None instead of the parameter value.
Rolf Schick
01/14/2021, 12:55 PM
Failed to load and execute Flow's environment: UnpicklingError("invalid load key, '{'.")
Dolor Oculus
01/14/2021, 2:21 PM
Linnea Sahlberg
01/14/2021, 2:51 PM
Charles Leung
01/14/2021, 3:08 PM
Mitchell Bregman
01/14/2021, 4:19 PM
x is a string and I want to .lower() it.
flow = Flow("some flow")
with flow:
    x = Parameter("x")
    y = Parameter("y", default=x.foo())
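A note on the snippet above: as far as I know, a Parameter's default is evaluated when the flow is built, so it cannot depend on the runtime value of another Parameter. One possible workaround is to leave the dependent value optional and resolve it inside a function that receives both values at run time. The sketch below is plain Python; `resolve_y` and `resolve_end_date` are hypothetical names, and wrapping them with Prefect's `@task` decorator and feeding them the `x` and `y` Parameters inside the flow is an assumption about how this would be wired in.

```python
from datetime import date, timedelta
from typing import Optional


def resolve_y(x: str, y: Optional[str] = None) -> str:
    """Return y if it was supplied, otherwise derive it from x."""
    # Hypothetical helper: the fallback is computed at run time,
    # when the actual value of x is known.
    return y if y is not None else x.lower()


def resolve_end_date(start: str, end: Optional[str] = None, window_days: int = 1) -> str:
    """Return end if supplied, otherwise start plus window_days (ISO dates)."""
    if end is not None:
        return end
    return (date.fromisoformat(start) + timedelta(days=window_days)).isoformat()


lowered = resolve_y("SOME VALUE")
explicit = resolve_y("SOME VALUE", "override")
derived_end = resolve_end_date("2021-01-31")
```

In a flow, `y` would then be declared as a non-required Parameter and passed to the resolver together with `x`, so the "default" lives in the task rather than in the Parameter definition.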
The use case for this is date specification: building dynamic dates based on a main driving parameter.
Josh
01/14/2021, 4:39 PM
1. List S3 files
2. Filter to S3 files I want to copy
3. Download S3 files
4. Upload to GCP
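The four steps above could be captured in one reusable function whose I/O operations are passed in as arguments, so a change of source or destination touches a single place. This is only a sketch: `copy_matching_files` and its parameter names are hypothetical, the in-memory dictionaries stand in for real S3 and GCP client calls, and whether to expose it as one Prefect task or as a function that wires up several tasks is left open.

```python
from typing import Callable, Dict, Iterable, List


def copy_matching_files(
    list_files: Callable[[], Iterable[str]],
    keep: Callable[[str], bool],
    download: Callable[[str], bytes],
    upload: Callable[[str, bytes], None],
) -> List[str]:
    """List, filter, download, and upload; return the keys that were copied."""
    copied = []
    for key in list_files():          # 1. list source files
        if not keep(key):             # 2. filter to the files we want
            continue
        payload = download(key)       # 3. download from the source
        upload(key, payload)          # 4. upload to the destination
        copied.append(key)
    return copied


# Stand-ins for the real storage clients (assumptions for illustration only).
source: Dict[str, bytes] = {"raw/a.csv": b"1", "raw/b.txt": b"2", "tmp/c.csv": b"3"}
dest: Dict[str, bytes] = {}


def fake_upload(key: str, payload: bytes) -> None:
    dest[key] = payload


copied = copy_matching_files(
    list_files=lambda: sorted(source),
    keep=lambda k: k.startswith("raw/") and k.endswith(".csv"),
    download=lambda k: source[k],
    upload=fake_upload,
)
```

Swapping the ingestion source then means passing different `list_files` and `download` callables, without touching the steps themselves.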
I’m wondering if I can make a compound task that parameterizes the steps above. Or is the Prefect idiom to make a dependent flow?
Part of the motivation is laziness, but if we ever change where or how we start our data ingestion, it’d be great not to have to change it in multiple places.
Kyle Flanagan
01/14/2021, 5:21 PM
Raphaël Riel
01/14/2021, 5:33 PM
jcozar
01/14/2021, 5:53 PM
Joseph
01/14/2021, 7:41 PM
Jeremy Phelps
01/14/2021, 8:18 PM
delete_flow_run GraphQL mutation, thinking that without the flow run's DB entry, Prefect would stop trying to execute it.
That removed the flow run from the UI, but if I delete the associated Kubernetes pod, Prefect still recreates it.
I found no subcommand in the prefect CLI tool that can solve this problem, either.
Is there any way to stop Prefect from doing this?
Aiden Price
01/14/2021, 10:59 PM
Could not serialize object of type Success problem that I've seen a few times on this Slack. From my reading of https://github.com/PrefectHQ/prefect/issues/3625 I suspect it's because I'm using resource_manager classes to hold database connection and HTTP connection pools while I map over everything in my Dask cluster. Is it not possible to share a connection pool across a Dask mapped pipeline (thinking about it now, it would be difficult)? If not, how should I go about mapped reading & writing? Thanks.
Marwan Sarieddine
01/15/2021, 12:23 AM
matta
01/15/2021, 1:06 AM
flatten at some point? Like if you pull a bunch of IDs and it takes the form of a list of lists, and then use those as the basis for a bunch of other calls?
Sonny
01/15/2021, 3:11 AM
Joël Luijmes
01/15/2021, 10:33 AM
Vipul
01/15/2021, 2:10 PM
Raphaël Riel
01/15/2021, 2:29 PM
J. Martins
01/15/2021, 3:35 PM
liren zhang
01/15/2021, 4:00 PM
Matthew Blau
01/15/2021, 4:17 PM
Nate Lambeth
01/15/2021, 5:20 PM
Loic M
01/15/2021, 5:36 PM
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Is there something I am missing here? Should I include the Google credentials when building my flow's Docker image?
Josh
01/15/2021, 5:38 PM
Hui Zheng
01/15/2021, 5:44 PM
Prefect cli?
Riley Hun
01/15/2021, 6:13 PM
Joseph
01/15/2021, 8:38 PM
Lucas Kjaero-Zhang
01/15/2021, 8:40 PM
agent | Service token file does not exists. Using out of cluster configuration option.
I’ve confirmed that the service account, role, and rolebinding all exist on the server. Here’s a screenshot of the pod settings; I'm attaching the rest in a thread.