Darragh05/01/2020, 10:20 AM
Zviri05/01/2020, 12:09 PM
) the memory consumption of my Dask Workers gradually grows until they run out of memory. When I run the same flow from a script everything finishes correctly and there are no memory issues whatsoever. I am using the latest
packages. Has anyone experienced something similar? Is there anything that could be done about it? Thanks a lot.
Jacques05/01/2020, 1:46 PM
Jacob (he/him)05/01/2020, 4:00 PM
John Ramirez05/01/2020, 6:00 PM
Troy Köhler05/01/2020, 6:01 PM
John Ramirez05/01/2020, 6:19 PM
tiz.io05/01/2020, 6:49 PM
Crawford Collins05/01/2020, 7:18 PM
I'm trying to test a @task that returns a dataframe, but the assert statement errors on
with Flow("test_target_transformer") as f:
    te = fit_target_transformer(problem, target, train_data)
    df = target_encoder_transform(te, imputed_categorical_df)
    assert df.isna().sum() == 0
How do I get this back as a DataFrame and not a task?
AttributeError: 'FunctionTask' object has no attribute 'isna'
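Inside a `with Flow(...)` block, calling a @task only builds the graph and hands back the Task object, which is why `df.isna()` raises this AttributeError. A minimal stand-in (no Prefect required; `add_one` is purely illustrative) showing the distinction, and why calling `.run()` on the task is the usual way to unit-test the wrapped function directly:

```python
# Illustrative stand-in for Prefect's @task behaviour: in a flow context,
# calling the task returns the Task object itself; the wrapped function
# only executes via .run() (or when the flow runs).

class FunctionTask:
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args, **kwargs):
        # In a real flow context this registers an edge and returns the task
        return self

    def run(self, *args, **kwargs):
        # Executes the wrapped function directly -- handy in unit tests
        return self.fn(*args, **kwargs)

def task(fn):
    return FunctionTask(fn)

@task
def add_one(x):
    return x + 1

node = add_one(41)        # returns the FunctionTask, not a number
result = add_one.run(41)  # runs the underlying function: 42
```

So in the test above, calling `target_encoder_transform.run(...)` outside the flow context should yield the actual DataFrame to assert on.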
Matt05/02/2020, 7:22 AM
David Haines05/02/2020, 12:25 PM
has been integrated within the imperative api yet?
Florian05/03/2020, 12:31 PM
leads to a blocking process until the flow has finished. Is there a way of submitting jobs to the dask cluster in an almost fire-and-forget scenario? Especially since my outputs are written to a file or DB, I really don’t need the return/result object. I basically want to dynamically create workflows, submit them to the dask cluster, and then have them finish based on available resources. Could be that I am overlooking something super simple or using Prefect for something it is not built for 🤷♂️ but would be nice to know 🙂
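For what it's worth, dask.distributed does ship this directly: `fire_and_forget(client.submit(fn, ...))` tells the scheduler to keep the task alive even after the local future is garbage-collected. The underlying submit-without-waiting pattern, sketched here with the stdlib executor so it runs anywhere (the flow names are illustrative):

```python
# Sketch of fire-and-forget submission: submit work, discard the future,
# let the pool finish on its own schedule.
from concurrent.futures import ThreadPoolExecutor

completed = []

def run_flow(name):
    # stand-in for flow.run(); real outputs would go to a file or DB
    completed.append(name)

executor = ThreadPoolExecutor(max_workers=2)
for name in ["flow-a", "flow-b"]:
    executor.submit(run_flow, name)  # returns immediately; future is discarded

executor.shutdown(wait=True)  # only here so the script can exit cleanly;
                              # true fire-and-forget would skip this wait
```

With dask, `fire_and_forget` replaces the shutdown/wait step: the cluster owns the task and the submitting process is free to move on.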
Pierre CORBEL05/03/2020, 12:44 PM
in the UI but not in the logs tab (see attached images n°2 and n°3). For info, I'm using RemoteEnvironment with the LocalDaskExecutor on Prefect 0.10.5.
Last State Message
Nate Atkins05/04/2020, 12:37 AM
Chris O'Brien05/04/2020, 3:56 AM
that is yesterday UTC time. I would love to use the Prefect context to do this, and this worked for me, but is there a better way?
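"Yesterday in UTC" needs nothing beyond the stdlib; a small sketch (a Prefect-context version would read `scheduled_start_time` from `prefect.context` instead of taking the current time):

```python
# Compute yesterday's date relative to a UTC "now" (or a supplied moment).
from datetime import datetime, timedelta, timezone

def yesterday_utc(now=None):
    # default to the current moment in UTC; subtract one day, keep the date
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=1)).date()

print(yesterday_utc(datetime(2020, 5, 4, tzinfo=timezone.utc)))  # 2020-05-03
```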
Thomas La Piana05/04/2020, 6:41 AM
Adisun Wheelock05/04/2020, 1:16 PM
prefect auth login -t my_generated_token
has anyone else run into this? The server startup brought up all services just fine.
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: <http://localhost:4200/graphql/alpha>
Avi A05/04/2020, 2:45 PM
Ketan Bhikadiya05/04/2020, 4:38 PM
Pedro Machado05/04/2020, 6:39 PM
. I am trying to figure out how I could implement this in Prefect. It seems to me that I could have a single parametrized flow but I am not sure how the parameters would be provided and the flow scheduled. Thanks!
tiz.io05/05/2020, 12:38 AM
Brad05/05/2020, 2:14 AM
Pedro Machado05/05/2020, 3:53 AM
Sainath05/05/2020, 11:31 AM
Thomas La Piana05/05/2020, 11:35 AM
locally, it automatically defaults to the local executor instead of using the environment that I set for it, but when I register it and a kube runner picks it up, it runs with the correct environment. Does anyone know why?
Avi A05/05/2020, 7:45 PM
itay livni05/05/2020, 10:06 PM
error using Docker Storage (local image), prefect cloud and the
NoCredentialsError('Unable to locate credentials')
when the flow is run locally. There are no credential issues when initializing the handler
To resolve the issue I created a Prefect Secret called
s3_handler = S3ResultHandler(bucket='some-bucket')
What is the best way to resolve aws credential error? Thanks
s = Secret("AWS_CREDENTIALS")
aws_scrts = s.get()
s3_handler = S3ResultHandler(bucket='some-bucket', aws_credentials_secret=aws_scrts)
David Ojeda05/06/2020, 9:17 AM
I could come up with a plumbing hack like this:
ValueError: A task with the slug "limit" already exists in this flow.
which works, but it seems very hackish and far from elegant. Is there a cleaner alternative to this? (other than renaming the parameter, of course)
flow = Flow(name)
local_flow = build_local_flow()      # a function that returns a flow
quetzal_flow = build_quetzal_flow()  # idem

# plumbing: both local and quetzal flows have a limit parameter
limit_parameter = local_flow.get_tasks(name='limit')
other_parameter = quetzal_flow.get_tasks(name='limit')
for edge in quetzal_flow.edges:
    if edge.upstream_task == other_parameter:
        edge.upstream_task = limit_parameter
quetzal_flow.tasks = set(t for t in quetzal_flow.tasks if t != other_parameter)

flow.update(local_flow)
flow.update(quetzal_flow)
...