Thread
#prefect-community

    Qin XIA

    1 year ago
    Has anyone tried running a PySpark task (Spark cluster in local Docker) from a local agent in Docker without using a bash command?

    Andor Tóth

    1 year ago
    Only in Airflow
    but it should be no different in Prefect
    That's the gist of the Spark hook I'm using:
    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    spark_conf = SparkConf()
    spark_conf.setAll(self.conf.items())
    spark_session = SparkSession.builder.config(conf=spark_conf).getOrCreate()
    the prerequisite is to have a matching pyspark package installed on the PYTHONPATH used by the agent

    Qin XIA

    1 year ago
    I agree with you, and that's what I want to try. But don't we need spark-submit to run a PySpark script? (A Prefect flow would be run with the python or python3 command.)

    Andor Tóth

    1 year ago
    So you want to submit a Python script without using the CLI?

    Qin XIA

    1 year ago
    I have tried adding findspark to my code and it works.

    Andor Tóth

    1 year ago
    Could you share the details?

    Qin XIA

    1 year ago
    I created a Prefect flow (e.g. flow.py) that has some PySpark tasks, setting everything up as usual. Running it with python flow.py didn't work the first time because the PySpark path was not configured. So I just added a findspark call to initialize the PySpark path, and it works.
    I think if you configure your PySpark path in an environment variable, it should also work.
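    What findspark.init() does under the hood is roughly the following: it sets SPARK_HOME and puts the Spark distribution's bundled Python sources on sys.path so that a plain import pyspark works without spark-submit. A stdlib-only sketch of that mechanism (the helper name init_spark_paths and its details are my illustration, not findspark's actual API):

    ```python
    import glob
    import os
    import sys

    def init_spark_paths(spark_home):
        """Sketch of what findspark.init() effectively does:
        export SPARK_HOME and prepend Spark's bundled Python
        sources (pyspark and the py4j zip) to sys.path, so
        `import pyspark` succeeds in a plain `python flow.py` run."""
        os.environ["SPARK_HOME"] = spark_home
        python_dir = os.path.join(spark_home, "python")
        # py4j ships inside the Spark distribution as a versioned zip
        py4j_zips = glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip"))
        for path in [python_dir] + py4j_zips:
            if path not in sys.path:
                sys.path.insert(0, path)
    ```

    With the real library, the top of flow.py only needs import findspark; findspark.init() before the first import pyspark.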