Can I use Docker storage for a flow with Ubuntu 20.04 as the base image in the Dockerfile, instead of a Prefect image?
03/02/2022, 6:46 AM
You sure can! As long as the proper dependencies and entrypoint are set, that is. Note, however, that installing Python on a “raw” Ubuntu image is not as straightforward as you might think (been there, done that). So why would you want to do that?
03/02/2022, 7:05 AM
I need to install Spark. When I use Ubuntu as the base image, registering the flow fails the health checks and the flow does not get registered.
03/02/2022, 8:30 AM
But you can use a Python image, install a JDK and pip install pyspark. It should install all its dependencies.
Maybe you can even start from Prefect’s base image for that?
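For illustration, here is a rough sketch of what that could look like with Prefect 1.x Docker storage. The image tag, the Debian package name and the `make_storage` helper are assumptions made for this example, not something taken from this thread, so adjust them to your setup:

```python
# Keyword arguments for Prefect 1.x Docker storage (a sketch, not a tested setup).
STORAGE_KWARGS = {
    # Assumed tag: any official Python image you are comfortable with.
    "base_image": "python:3.9",
    # pip packages to install in the image; pyspark still needs a Java runtime.
    "python_dependencies": ["pyspark"],
    # Assumed Debian package name for a headless Java runtime.
    "extra_dockerfile_commands": [
        "RUN apt-get update"
        " && apt-get install -y --no-install-recommends default-jre-headless"
        " && rm -rf /var/lib/apt/lists/*",
    ],
}


def make_storage():
    """Hypothetical helper: build the storage object (requires Prefect 1.x)."""
    from prefect.storage import Docker  # imported lazily so the sketch loads without Prefect

    return Docker(**STORAGE_KWARGS)
```

You would then set `flow.storage = make_storage()` before registering the flow. Starting from Prefect's own base image instead should only require changing `base_image`.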
03/02/2022, 11:34 AM
To add to Matthias's excellent answer: if you are running Spark just within a Docker image, you are not taking advantage of Spark, right? Where is your Spark cluster running? Usually, you should be able to just run a spark-submit command with e.g. ShellTask to submit the job to your cluster.
Also, we've already had this conversation about Spark and packaging it into a Docker image. Let me check if I can find it.
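To make the spark-submit idea concrete, here is a minimal sketch that builds the command string and hands it to a Prefect 1.x ShellTask. The master URL, the job path, its arguments and the helper names are all made up for this example:

```python
import shlex


def build_spark_submit(master, app, app_args=(), deploy_mode="client"):
    """Return a shell-safe spark-submit command string."""
    parts = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode, app, *app_args]
    return shlex.join(parts)  # quotes any argument that needs it (Python 3.8+)


def make_submit_task(command):
    """Hypothetical helper: wrap the command in a ShellTask (requires Prefect 1.x)."""
    from prefect.tasks.shell import ShellTask  # imported lazily so the sketch loads without Prefect

    return ShellTask(command=command)


# Assumed cluster address and job script, for illustration only.
command = build_spark_submit(
    "spark://spark-master:7077",
    "/opt/jobs/etl.py",
    ["--date", "2022-03-02"],
)
```

With this approach the flow's image only needs the Spark client tooling to reach the cluster, while the heavy lifting happens on the cluster itself.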