Evan Curtin
05/06/2022, 6:12 PM
Anna Geller
Evan Curtin
05/06/2022, 6:24 PM
Kevin Kho
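#!/bin/bash
# Log whether this node is the driver.
echo "$DB_IS_DRIVER"
# Install every package before any Dask process starts.
/databricks/python/bin/pip install prefect dask distributed "fugue[sql,duckdb]" tune fs-s3fs optuna
if [[ "$DB_IS_DRIVER" = "TRUE" ]]; then
    # Driver node: run the Dask scheduler in the background.
    dask-scheduler &>/dev/null &
else
    # Worker node: connect to the scheduler on the driver.
    dask-worker "tcp://$DB_DRIVER_IP:8786" --nworkers 4 --nthreads 1 &>/dev/null &
fi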
But really, don't do this, lol. All packages need to be installed before Dask too. I think the order matters.
But with Databricks Connect you can just do:
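from prefect import task
from pyspark.sql import SparkSession

@task
def spark_thing():
    # With Databricks Connect configured, this session targets the remote cluster.
    spark = SparkSession.builder.getOrCreate()
    spark.createDataFrame(...)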
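For readers following along, the branching in the init script above can be sketched in Python as a dry run. This is only an illustration, not something posted in the thread: `build_commands`, `PACKAGES`, `SCHEDULER_PORT`, and the sample IP address are hypothetical names and values, and the sketch only builds the command lines rather than launching anything. It keeps the ordering point from the message: the install comes first, then the Dask process for the node's role.

```python
import shlex

# Package list copied from the init script in the chat.
PACKAGES = ["prefect", "dask", "distributed", "fugue[sql,duckdb]",
            "tune", "fs-s3fs", "optuna"]

# Port the workers connect to in the original script.
SCHEDULER_PORT = 8786


def build_commands(env):
    """Return the commands for this node, in the order they should run."""
    # Install packages first, so they exist before any Dask process starts.
    commands = [["/databricks/python/bin/pip", "install", *PACKAGES]]
    if env.get("DB_IS_DRIVER") == "TRUE":
        # Driver node: start the scheduler.
        commands.append(["dask-scheduler"])
    else:
        # Worker node: connect to the scheduler running on the driver.
        address = f"tcp://{env['DB_DRIVER_IP']}:{SCHEDULER_PORT}"
        commands.append(
            ["dask-worker", address, "--nworkers", "4", "--nthreads", "1"]
        )
    return commands


if __name__ == "__main__":
    # Made-up environment for a worker node; real values come from Databricks.
    sample_env = {"DB_IS_DRIVER": "FALSE", "DB_DRIVER_IP": "10.0.0.1"}
    for cmd in build_commands(sample_env):
        print(shlex.join(cmd))
```

Printing the commands instead of running them makes it easy to check what each node type would execute before putting anything in a real init script.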
Evan Curtin
05/06/2022, 8:09 PM
if [[ "$DB_IS_DRIVER" = "TRUE" ]]; then
    dask-scheduler &>/dev/null &
else
    dask-worker "tcp://$DB_DRIVER_IP:8786" --nworkers 4 --nthreads 1 &>/dev/null &
fi
So I could potentially run this, right?
Kevin Kho