Dmitry Dorofeev

08/31/2019, 9:20 PM
Hi all, suppose I have a config file with an unknown number of jobs defined, and each job should run in parallel. I can easily define a Prefect Flow for each job, but how can I fire all Flows from one Python process? What if I want to fire each Flow periodically with a cron scheduler? Currently I fork() a separate Python process for each Flow; is that right?

Alex Cano

08/31/2019, 9:42 PM
Is there any reason you’d like one flow per config file as opposed to one flow with many tasks? Are all tasks the same? You could map across each config job if they are

Dmitry Dorofeev

09/01/2019, 7:06 AM
One flow would be fine, except when some tasks take longer than expected. If the flow is scheduled, say, every minute, a long-running task will prevent the next scheduled flow from running. So all tasks (in the next scheduled flow) will wait for the slow task (in the running flow) to finish.

Jeremiah

09/01/2019, 1:57 PM
Hi @Dmitry Dorofeev, the flow.run() convenience function only supports one run at a time, as you noticed. Running separate Python processes is completely fine; you could also use multiprocessing or Dask to gain that asynchronous parallelism if you wanted.
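
[Editor's sketch] The "one process per job" approach mentioned above could look roughly like the following. This is a minimal sketch, not Prefect code: run_job is a hypothetical stand-in for "build the flow for one config entry and call flow.run()", and the job names and durations are made up. It uses the "fork" start method to mirror the fork() approach from the question, which assumes a POSIX system.

```python
import multiprocessing as mp
import time


def run_job(name, seconds):
    """Hypothetical stand-in for building one flow and calling flow.run()."""
    time.sleep(seconds)  # pretend the flow does some work


def run_all_in_processes(jobs):
    """Start one child process per job and return {job name: exit code}."""
    ctx = mp.get_context("fork")  # assumption: POSIX; mirrors the fork() approach
    procs = {}
    for name, seconds in jobs.items():
        proc = ctx.Process(target=run_job, args=(name, seconds), name=name)
        proc.start()  # all children start before we wait on any of them
        procs[name] = proc

    exit_codes = {}
    for name, proc in procs.items():
        proc.join()  # a slow job only delays its own join, not the other jobs
        exit_codes[name] = proc.exitcode
    return exit_codes


if __name__ == "__main__":
    # Made-up jobs: name -> simulated run time in seconds.
    print(run_all_in_processes({"job_a": 0.1, "job_b": 0.2}))
```

An exit code of 0 means the child finished without raising; a non-zero code signals a failure in that job only.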

Dmitry Dorofeev

09/01/2019, 2:25 PM
Is it safe to use threads instead of good old fork()?

Jeremiah

09/01/2019, 2:27 PM
We believe that Prefect itself is generally thread-safe (as long as you design your tasks to be thread safe!). For example, the shared context is a ThreadLocal object, and we unit test a threaded executor. However, we have observed problems when people use libraries like the Google Cloud clients which seem not to be thread safe.
So we would say — cautious yes?
We just wouldn’t advise people to assume that Prefect instantly makes their custom code thread-safe.
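
[Editor's sketch] To illustrate what a thread-local object buys you, here is a small example using Python's standard threading.local. It is only an analogy for the shared context mentioned above, not Prefect's actual implementation; the names context and flow_name are hypothetical.

```python
import threading

# Stand-in for a shared context object; NOT Prefect's own context.
context = threading.local()


def demo(names):
    """Each thread writes its own name, then reads it back after all have written."""
    barrier = threading.Barrier(len(names))
    seen = {}

    def worker(name):
        context.flow_name = name  # visible only to the current thread
        barrier.wait()  # every thread has written before any thread reads
        seen[name] = context.flow_name

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(demo(["flow_a", "flow_b", "flow_c"]))
    # The main thread never set the attribute, so it does not see it.
    print(hasattr(context, "flow_name"))
```

Each thread reads back the value it wrote, even though all threads use the same context object. State stored anywhere else (module globals, client objects shared between tasks) gets no such isolation, which is where non-thread-safe libraries cause trouble.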

Dmitry Dorofeev

09/01/2019, 8:20 PM
Yes, that’s understandable. I guess calling different flow.run() in separate threads is safe then.
I mean one flow per thread

Jeremiah

09/02/2019, 1:06 AM
In all honesty, I don’t know if we can declare it to be safe — tasks that aren’t thread-safe will bite you even if each one is run in a flow in a different thread. However, if you’re sure your tasks are safe, I think you can do this.
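
[Editor's sketch] Assuming the tasks really are thread-safe, "one flow per thread" could be arranged along these lines. The runners argument is a hypothetical mapping from a job name to a zero-argument callable; in real use each callable might be a flow's run method, but nothing here depends on Prefect.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all_in_threads(runners):
    """Run each callable in its own thread and return {name: (status, value)}."""
    outcomes = {}
    # One worker per job, so a slow job never queues behind another one.
    with ThreadPoolExecutor(max_workers=max(1, len(runners))) as pool:
        futures = {name: pool.submit(fn) for name, fn in runners.items()}
        for name, future in futures.items():
            try:
                outcomes[name] = ("ok", future.result())
            except Exception as exc:  # one failing job must not hide the others
                outcomes[name] = ("failed", exc)
    return outcomes


if __name__ == "__main__":
    # Made-up jobs standing in for flow.run calls.
    print(run_all_in_threads({"job_a": lambda: "done", "job_b": lambda: 1 / 0}))
```

Collecting exceptions per job keeps one failing run from aborting the rest. It does nothing to protect against tasks that share non-thread-safe state, which is the caveat raised above.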