Christopher Stokes
09/27/2019, 8:40 PMTobias Schmidt
09/28/2019, 1:57 PMJeff Yun
09/30/2019, 6:27 PMTypeError: generate_task() missing 2 required positional arguments:
2) Calling a known task from a "compose" task
--> got ValueError: Could not infer an active Flow context.
3) Typecasting parameter to list --> Parameter is not iterableJeff Yun
09/30/2019, 6:47 PMflattened_lists = itertools.chain(*nested_lists_param)
but can't iterate through the Parameter itself.Jeff Yun
09/30/2019, 11:15 PM--nprocs
--nthreads
on workers), it seems that consistently
- Running Client() locally starts immediately, as expected.
- Running with one worker server, starting takes much longer as N increases
For large N:
[2019-09-30 22:59:15,781] INFO - prefect.TaskRunner | Task 'stage_0': Starting task run...
distributed.utils_perf - INFO - full garbage collection released 561.93 MB from 0 reference cycles (threshold: 10.00 MB)
distributed.core - INFO - Event loop was unresponsive in Worker for 34.31s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
- Running on multiple servers, the Dask scheduler also takes much way longer than the expected ~1ms overhead/task, before any activity happens (on the client servers or the Dask scheduler dashboard).
Why is scheduling taking so long? How could I adjust scheduling policy (https://distributed.dask.org/en/latest/scheduling-policies.html) to speed up running large number of small tasks?Gregor Müllegger
10/02/2019, 8:18 AMGarrett Thomas
10/02/2019, 11:17 AMJeff Yun
10/02/2019, 8:40 PMemre
10/03/2019, 10:53 AMSecrets
store, say AWS Secrets ManagerRui Loureiro
10/03/2019, 6:16 PMJeff Yun
10/04/2019, 12:04 AMMitchell Bregman
10/04/2019, 1:44 PMAiden Price
10/05/2019, 10:47 AMJason
10/05/2019, 3:09 PMJason
10/05/2019, 3:22 PMAlex Cano
10/05/2019, 9:45 PMitay livni
10/07/2019, 2:24 PMflow
did run correctly? any suggestions?Chris Eberly
10/08/2019, 12:55 PMJulio Azeredo
10/08/2019, 3:45 PMChris Eberly
10/08/2019, 5:07 PMArgemiro Neto
10/08/2019, 5:32 PM[2019-10-08 17:26:55,377] ERROR - prefect.TaskRunner | Unexpected error: TooManyRequestsException('An error occurred (TooManyRequestsException) when calling the Invoke operation (reached max retries: 4): Rate Exceeded.')
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
new_state = method(self, state, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 869, in get_task_run_state
self.task.run, timeout=self.task.timeout, **raw_inputs
File "/usr/local/lib/python3.7/site-packages/prefect/utilities/executors.py", line 79, in timeout_handler
return fn(*args, **kwargs)
File "/Users/aneto/Documents/github/data-platform/scheduler/main.py", line 244, in run_sync_loads
Payload=json.dumps(load)
File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 661, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.TooManyRequestsException: An error occurred (TooManyRequestsException) when calling the Invoke operation (reached max retries: 4): Rate Exceeded.
The function has a 15min timeout and it is being called as:
client = boto3.client('lambda')
sync_results = client.invoke(
FunctionName='myfunction',
Payload=json.dumps(load)
)
The code following this invocation is not being called. I noticed that even before the Prefect retry the function is being called more than once.Aiden Price
10/09/2019, 9:43 AMmerge
two conditional branches back together, if the result of the previous step was a Pandas dataframe you get these errors; [2019-10-09 09:27:02,948] INFO - prefect.TaskRunner | Task 'Merge': Starting task run...
[2019-10-09 09:27:02,961] ERROR - prefect.TaskRunner | Unexpected error: ValueError('The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().')
...
[2019-10-09 09:27:02,965] INFO - prefect.TaskRunner | Task 'Merge': finished task run for task with final state: 'Failed'
Aiden Price
10/09/2019, 9:44 AMMerge
Task comparing the result to the NoResult
object.Aiden Price
10/09/2019, 9:45 AMmerge
, perhaps my own task to determine which of a variadic number of results I want to pass on?itay livni
10/10/2019, 8:50 PMpass
on a if else branch?Tobias Schmidt
10/11/2019, 7:16 AMJoe Howarth
10/13/2019, 2:59 PMBrad
10/14/2019, 8:44 PMmatta
10/15/2019, 1:01 AMitay livni
10/15/2019, 2:34 AMFlow
I am looking to: (1) run a set of tasks
(2) do something with the result in another `task`i.e. munge data. What is the suggested way to implement this? Is it using set_upstream
? the example I see
say_hello.set_upstream(second_add, flow=flow)
waits for a specific task to finish not a set of tasks. Does this get handled under the hood?itay livni
10/15/2019, 2:34 AMFlow
I am looking to: (1) run a set of tasks
(2) do something with the result in another `task`i.e. munge data. What is the suggested way to implement this? Is it using set_upstream
? the example I see
say_hello.set_upstream(second_add, flow=flow)
waits for a specific task to finish not a set of tasks. Does this get handled under the hood?Chris White
10/15/2019, 10:49 AMitay livni
10/15/2019, 5:49 PM