# ask-community
Hi! I’m wondering if anybody with Databricks Task experience can help me with this. I would like to capture the results of a Spark job to inform the next steps in my flow. I can successfully submit Databricks jobs, but I didn’t see any way to get the results. Looking further, I see that DatabricksHook has a get_run_page_url() method that would get me close to what I need, but any attempt to use the hook gives me a “NoneType is not iterable…” exception. Here is my code:
@task
def get_url(run_id, hook):
    logger = prefect.context.get("logger")
    logger.info("RUN ID: %s", run_id)
    url = hook.get_run_page_url(run_id=run_id)
    return url

SubmitRun = DatabricksSubmitRun()

with Flow("test_databricks", storage=STORAGE, run_config=RUN_CONFIG) as flow:
    conn = PrefectSecret('DATABRICKS_CONNECTION_STRING')
    json = get_job_config()
    run_id = SubmitRun(json=json, databricks_conn_secret=conn)
    hook = SubmitRun.get_hook()
    url = get_url(run_id, hook)
Here is the exception:
ERROR:prefect.TaskRunner:Unexpected error: TypeError("argument of type 'NoneType' is not iterable")
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/prefect/engine/task_runner.py", line 863, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/prefect/utilities/executors.py", line 298, in run_task_with_timeout
    return task.run(*args, **kwargs)  # type: ignore
  File "/Users/rbastian/enverus/RAI/prefect/flows/test-databricks.py", line 32, in get_url
    url = hook.get_run_page_url(run_id=run_id)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/prefect/tasks/databricks/databricks_hook.py", line 248, in get_run_page_url
    response = self._do_api_call(GET_RUN_ENDPOINT, json)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/prefect/tasks/databricks/databricks_hook.py", line 148, in _do_api_call
    if "token" in self.databricks_conn:
Thanks in advance!
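
The last traceback frame shown stops at the check `"token" in self.databricks_conn`, which suggests that `databricks_conn` was still `None` when the hook made its API call. One possible reading is that the hook was created before the connection secret had been resolved, but that is an assumption, not something the traceback confirms. The sketch below reproduces only the membership check in plain Python. `FakeHook`, `uses_token`, `describe`, and the connection values are hypothetical stand-ins, not part of Prefect or the real DatabricksHook.

```python
class FakeHook:
    """Hypothetical stand-in for DatabricksHook; it only mimics the membership check."""

    def __init__(self, databricks_conn=None):
        # None models a hook built without a resolved connection.
        self.databricks_conn = databricks_conn

    def uses_token(self):
        # Same shape as the check in the last traceback frame.
        return "token" in self.databricks_conn


def describe(hook):
    """Report which auth branch the check selects, or the error it raises."""
    try:
        return "token auth" if hook.uses_token() else "other auth"
    except TypeError as exc:
        return f"TypeError: {exc}"


# Hook with no connection: the membership test on None raises TypeError.
unset = describe(FakeHook())

# Hook with a (made-up) connection dict: the check succeeds.
resolved = describe(FakeHook({"host": "example-host", "token": "example-token"}))

print(unset)
print(resolved)
```

If that reading is right, the hook would need the resolved connection at run time, for example by creating it inside a task that receives the secret value, not on the task instance while the flow is being defined.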