And it took a while to come to me as a requirement, but I essentially have 2 steps.
Step 1 requires Docker image A to do some data pipeline stuff; step 2 needs my custom dbt Docker image B to do some transforms AFTER step 1. So these two dependent tasks constitute one flow, with each step on its own independent Docker image.
A flow runner is configured at the deployment level, but I don't see a way to configure it at the task or subflow level. Definitely a key requirement in the current state.
Please help!
This is a roadmap item, but not one that we expect to solve until later this year.
Zanie
04/18/2022, 3:40 PM
You can also use docker-py to run Docker containers from tasks using a Python API; you don't have to fall back to shell.
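A minimal sketch of that idea, assuming the Docker SDK for Python (docker-py) and a reachable Docker daemon. The image names, commands, and function names are hypothetical placeholders, not anything from this thread. Each function could be wrapped as a Prefect task so that step 2 only runs after step 1 succeeds.

```python
from typing import Any, Dict, Optional


def run_step(
    client: Any,
    image: str,
    command: str,
    environment: Optional[Dict[str, str]] = None,
) -> str:
    """Run one container to completion and return its logs as text."""
    # With detach left at its default (False), containers.run blocks until
    # the container exits, returns the logs as bytes, and raises
    # docker.errors.ContainerError on a non-zero exit code.
    output = client.containers.run(
        image,
        command,
        environment=environment or {},
        remove=True,  # clean up the container once it has exited
    )
    return output.decode("utf-8")


def run_pipeline(client: Any) -> tuple:
    """Run step 1 on image A, then step 2 on image B, in that order."""
    # Hypothetical image names and commands; substitute your own.
    extract_logs = run_step(client, "image-a:latest", "python run_pipeline.py")
    # Only reached if step 1 did not raise, so the dbt step runs AFTER it.
    transform_logs = run_step(client, "image-b:latest", "dbt run")
    return extract_logs, transform_logs


# Usage (requires the `docker` package and a running Docker daemon):
#   import docker
#   run_pipeline(docker.from_env())
```

Passing the client in as an argument keeps the two steps easy to test without a daemon, and lets a single client be shared across both containers.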
Alexander Butler
04/18/2022, 6:40 PM
@Zanie made my day. Sometimes it's the simple things, y'know. I wasn't thinking outside the box, or inside. Or wherever. Just in the Prefect box. docker-py is a perfect shim here.
Anna Geller
04/19/2022, 3:04 PM
It's actually a good analogy because Prefect is more about the arrows than the boxes - check this blog post
Bring your towel and join one of the fastest growing data communities. Welcome to our second-generation open source orchestration platform, a completely rethought approach to dataflow automation.