#prefect-community

David Elliott

10/06/2022, 4:29 PM
Hey, I’m finding `logger` doesn’t seem to output logs either to terminal or to the cloud UI when using the `DaskTaskRunner()` - I’m calling `get_run_logger()` within the task - code attached in 🧵. Any ideas / am I missing something? (works fine with sequential / concurrent task runners, just not the Dask one)
Untitled.py
run local output.sh
Prefect version: 2.4.2 prefect-dask==0.2.0

Zanie

10/06/2022, 4:31 PM
This is a known issue unfortunately

David Elliott

10/06/2022, 4:33 PM
Ah, ok! Do you think it’ll be worked on for a future version? (and does it affect Dask on kubernetes as well as locally?)

Zanie

10/06/2022, 4:47 PM
Sorry got distracted in the middle of my follow-up message 😄
Basically: logging is configured per-process (this is just how Python does it), and we need to ensure that your logging configuration has been applied to the given process at the start of each run. However, you can’t configure logging more than once or weird things happen. So we’re in this awkward place where we store a global variable indicating whether logging has already been configured, and we just ignore future calls to apply the logging config if it’s set.
I suspect that this global variable is getting passed to Dask workers somehow so logging is not reconfigured on a new process.
I’m not really sure though, it’s on the end of a long todo list to investigate.
It’ll definitely work in a future version. I think this affects any Dask setup.
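
A minimal sketch of the "configure once" guard described in this message, using only the Python standard library. This is not Prefect's actual code: the flag `_LOGGING_CONFIGURED` and both function names are hypothetical, and the stale-flag scenario only illustrates the suspected (unconfirmed) cause, where a process sees the flag already set even though no handlers were ever attached in it.

```python
import logging

# Hypothetical module-level flag standing in for the global variable
# described above; this is an assumption, not Prefect's implementation.
_LOGGING_CONFIGURED = False


def setup_logging_once(logger_name: str = "demo") -> bool:
    """Attach a handler on the first call only; return True if config was applied."""
    global _LOGGING_CONFIGURED
    if _LOGGING_CONFIGURED:
        # Later calls are ignored so handlers are not attached twice.
        return False
    logger = logging.getLogger(logger_name)
    logger.addHandler(logging.StreamHandler())
    logger.setLevel(logging.INFO)
    _LOGGING_CONFIGURED = True
    return True


def simulate_worker_with_stale_flag(logger_name: str = "worker-demo") -> int:
    """Simulate a process that received the flag already set to True.

    Returns the number of handlers on the logger after setup is attempted.
    """
    global _LOGGING_CONFIGURED
    _LOGGING_CONFIGURED = True  # flag arrived set, but no handlers exist here
    setup_logging_once(logger_name)  # skipped because of the flag
    return len(logging.getLogger(logger_name).handlers)
```

In the simulated case the logger ends up with no handlers of its own, so records sent to it would not reach any configured output, which matches the symptom reported at the top of the thread.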

David Elliott

10/06/2022, 4:53 PM
Cool, thanks for explaining! It might be a bit of a blocker for us migrating 1.0 -> 2.0 if we can’t log our task logs to cloud, so hoping you are able to get to the bottom of it at some stage..! (we’re quite near the upper limit of 3000 tasks in 1.0 so were hoping to migrate in the coming weeks… 😬)

Zanie

10/06/2022, 5:54 PM
What do you need Dask for?

David Elliott

10/06/2022, 6:10 PM
So we can execute a lot of SQL scripts in parallel (not concurrently) - atm with 1.0 we have Dask on k8s (4x worker pods, 16 threads each) running up to 64 SQL scripts in parallel. Just noticed the Ray task runner which I’ve not heard of - any idea if there are similar logger issues there? (and whether it works on k8s?). Not wedded to dask if there’s an alternative that can allow us to run tasks in parallel on k8s…

Robert Hales

10/12/2022, 12:30 PM
Hi there, may have some info that will be helpful when looking into this. Dask does not seem to log when the worker is running locally; however, if you move it to a different machine (even a Docker container), logs begin to work.

David Elliott

10/12/2022, 1:25 PM
Ohh interesting - I’m actually just about to start testing on k8s (running within a docker container) so will see how it looks there - thanks..!

Robert Hales

10/12/2022, 1:27 PM
I was happily surprised when my logs were coming through when running in ECS, I thought it was going to block me!