Tim-Oliver (11/14/2022, 1:57 PM):
[message missing from the export]

Tim Galvin (11/14/2022, 2:00 PM):
I am using the DaskTaskExecutor and I am seeing logs produced in my tasks (provided a prefect.get_run_logger is used to get an appropriate logger) in the Prefect UI and flow run without any issues.
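For reference, my setup is roughly this shape (a minimal sketch, assuming the DaskTaskRunner from prefect-dask; the task and flow names here are made up):
```
from prefect import flow, task, get_run_logger
from prefect_dask import DaskTaskRunner

@task
def say_hello(name):
    # get_run_logger() returns a logger bound to the current task run,
    # so its records are shipped back and show up in the Prefect UI
    logger = get_run_logger()
    logger.info("Hello %s", name)

@flow(task_runner=DaskTaskRunner())  # ad hoc local dask cluster
def hello_flow():
    say_hello.submit("world")

if __name__ == "__main__":
    hello_flow()
```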
Tim-Oliver (11/14/2022, 2:12 PM):
I am using dask_jobqueue.SLURMCluster as executor, and the logs are not getting back from the workers. The DaskTaskExecutor (on the local machine) works.
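Roughly this configuration (a sketch; the cluster_kwargs values are placeholders for my actual site settings):
```
from prefect import flow
from prefect_dask import DaskTaskRunner

# DaskTaskRunner instantiates the cluster class from its dotted path,
# so tasks execute on SLURM-launched dask workers rather than locally
slurm_runner = DaskTaskRunner(
    cluster_class="dask_jobqueue.SLURMCluster",
    cluster_kwargs={
        "cores": 4,            # placeholder values
        "memory": "16GB",
        "walltime": "01:00:00",
    },
)

@flow(task_runner=slurm_runner)
def my_flow():
    ...
```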
Tim Galvin (11/14/2022, 3:01 PM):
I hit something similar with a task that used subprocess.run to issue a singularity command. Said command would run an MPI application within the container, and an appropriate srun would get the command to run across many compute nodes. This command produced a large set of outputs on stdout that I would capture through the normal subprocess result object and then manually logger.info them. I found that these logs were often not captured, even though everything ran successfully.
Eventually I converged towards having a subprocess.run("sleep 5") type command just after my logger.info calls. When I did that, my missing logs were always captured appropriately. My hunch is a simple one: the process managing the main Orion server does not persist long enough to retrieve all outputs, and this subprocess sleep command blocks long enough for the exchange to carry out. At the time I worked through this I was not managing my own Prefect Orion server; I was relying on the main process to fire one up for the lifetime of the flow.
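Schematically the task looked something like this (the srun/singularity command line below is a stand-in for my real one):
```
import subprocess
from prefect import task, get_run_logger

@task
def run_mpi_app():
    logger = get_run_logger()
    # stand-in for the real srun + singularity invocation
    result = subprocess.run(
        ["srun", "singularity", "exec", "image.sif", "mpi_app"],
        capture_output=True,
        text=True,
    )
    # forward the captured stdout through the prefect run logger
    for line in result.stdout.splitlines():
        logger.info(line)
    # the workaround: block long enough for the log records to be
    # shipped back before the surrounding process moves on
    subprocess.run(["sleep", "5"])
```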
Tim-Oliver (11/14/2022, 4:19 PM):
This happens when tasks are submitted with a_task.submit() or a_task.map(some_list). I tested it without submit, and then the logs get through properly, as you said.
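i.e. schematically (a_task here is a placeholder):
```
from prefect import flow, task

@task
def a_task(item):
    ...

@flow  # with the Dask/SLURM task runner attached, as above
def my_flow(some_list):
    a_task.map(some_list)      # submitted to dask workers: logs go missing
    for item in some_list:
        a_task(item)           # direct call runs in the flow process: logs arrive
```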