Hi I am using prefect to orchestrate my spark jobs...
# prefect-server
s
Hi I am using prefect to orchestrate my spark jobs. The spark jobs are submitted with
spark-submit
using prefect’s
ShellTask
. I have created a subclass of ShellTask to invoke the
spark-submit
.The spark jobs are running on k8s. There seems to be an issue especially with long running tasks where the spark jobs completes but the prefect task in unaware of the status and continues keep running. I have tried few suggestions including Added to the following configuration to job template
“<http://cluster-autoscaler.kubernetes.io/safe-to-evict|cluster-autoscaler.kubernetes.io/safe-to-evict>”: “false”
It looks like
ShellTask
has some issue in maintaining the status To overcome this I am thinking the following approach: - In map of jobs say
["j1", "j2", "j3"]
, submit all the jobs and not wait for execution completion (In my case since there is no dependency between tasks) - Separate task to monitor the status for all the tasks. This task runs in a loop say for max 1 hour before timeout. This task looks at status for each task and updates the overall status What is the best way to accomplish this looped task on prefect ?
a
Do you have strong reasons for using Spark instead of Dask or Ray? Not to influence you here, but Prefect 2.0 has seamless integration with Dask and Ray via task runners, which would make the entire process way easier to configure and manage and would give you much more visibility when something goes wrong in your distributed workflow computation.
Configuring this as you described with Prefect Server and Spark running on Kubernetes is actually really hard and I haven't seen anyone in the community yet doing that. I saw some users submitting jobs to Databricks or AWS EMR, which is possible because those have an API allowing you to submit Spark jobs without having direct access to the master node, but in your setup you would somehow need to deploy a Prefect agent on the Spark master node to submit the job via
ShellTask
(I think). Maybe someone from the community can chime in but I wouldn't go that route unless you really have to
s
We already have existing spark jobs and skillset and would want to utilize them. Regarding the spark on k8s, we have the cluster running and working with prefect. The only issue is if the job is long running then prefect seems to lose track of it
a
Yup, you would need to supply a master node's host to spark submit:
Copy code
# Running Spark application on Kubernetes cluster
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master <k8s://192.168.231.132:443> \
  --deploy-mode cluster \
  --executor-memory 5G \
  --executor-cores 8 \
  /spark-home/examples/jars/spark-examples_versionxx.jar 80
there is an option to submit via REST actually, but on K8s the above is preferred
I see - this is a different problem then. This topic explains it incl. some steps you may take to mitigate the issue
s
Thanks, will check this out
👍 1
635 Views