Ryan Sattler
08/27/2021, 1:40 AM
nicholas
Gaylord Cherencey
08/27/2021, 2:26 AM
Kevin Kho
`spark-submit` and that’s how you would connect to the Kubernetes Spark cluster?
If your Spark is already configured, wouldn’t just instantiating SparkSession inside a flow work?
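A minimal sketch of that idea, not a tested integration: a plain function obtains a session with `SparkSession.builder.getOrCreate()` and is then wrapped as a task. The names `count_rows`, `build_flow`, `get_session` and the flow name are hypothetical, and `build_flow` assumes the `Flow`/`task` API that Prefect shipped in 2021. Whether this actually reaches a Kubernetes Spark cluster depends on how Spark is configured where the flow runs.

```python
from typing import Callable, Optional


def count_rows(n_rows: int, get_session: Optional[Callable[[], object]] = None) -> int:
    """Obtain a SparkSession and run one small action on it."""
    if get_session is None:
        # Lazy import: pyspark is only needed when no session factory is supplied.
        from pyspark.sql import SparkSession

        # getOrCreate() reuses an active session or builds one from the
        # Spark configuration already present in the environment.
        get_session = SparkSession.builder.appName("prefect-spark-sketch").getOrCreate
    spark = get_session()
    # range(n) describes a DataFrame with n rows; count() is the action that runs it.
    return spark.range(n_rows).count()


def build_flow():
    """Wrap count_rows in a Prefect task (assumes the 2021-era Flow/task API)."""
    from prefect import Flow, task

    count_task = task(count_rows)
    with Flow("spark-session-sketch") as flow:  # hypothetical flow name
        count_task(1000)
    return flow
```

The `get_session` parameter is only there so the function can be exercised with a stand-in object where no Spark installation is available.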
And to what Nicholas said, we’d surely welcome PRs for this.
Gaylord Cherencey
08/27/2021, 3:46 AM
`spark-submit` command but I would say it might be possible to instantiate a SparkSession inside the Flow (will have to give it a try). I did not look, but is that how the Databricks one works (session)?
Kevin Kho
`databricks-connect` library that hijacks your Spark installation so `import pyspark` and creating the `SparkSession` compiles the DAG locally, then sends it to the configured cluster when there is an action.
Gaylord Cherencey
08/27/2021, 4:10 AM
`pyspark`
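To illustrate the "compiles the DAG locally, then sends it when there is an action" behaviour Kevin describes, here is a small sketch in plain PySpark. The function name `count_even_ids` is hypothetical, and the `spark` argument is assumed to be an already created `SparkSession`. Nothing here is specific to `databricks-connect`. It only shows which calls are lazy transformations and which one is the action.

```python
def count_even_ids(spark, n_rows: int) -> int:
    """Build a small lazy plan, then trigger it with a single action."""
    # Transformation: describes a DataFrame with an `id` column, runs nothing yet.
    df = spark.range(n_rows)
    # Transformation: adds a filter step to the plan, still nothing executed.
    evens = df.filter("id % 2 = 0")
    # Action: only now is the plan handed to the cluster and executed.
    return evens.count()
```

With a session in hand, usage would look like `count_even_ids(spark, 1000)`. Until `count()` is called, the preceding lines only add steps to the query plan.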