# prefect-contributors-archived
e
Anyone here from the prefect-databricks contributors? I do have an important question 🙂
n
hey @Edmondo Porcu - I've done a lot of work on the collections (not that one exactly), i can probably help. What's up?
e
there is a flow that we use, “job runs submit and wait”. It uses:
1. one task to submit the job on Databricks
2. one flow that polls the job until it’s done, creating tasks for polling

Some of our jobs are really long-running, and we don’t have any way to make the job id that is created by submitting the job available in the Prefect UI. One obvious solution would be to add a parameter to the current job and introduce an “interceptor”, however this would break the signature because it uses a kwargs argument
for example we were considering using artifacts and logging this value as an artifact. https://github.com/PrefectHQ/prefect-databricks/blob/main/prefect_databricks/flows.py#L251 I was considering making the change myself and opening a PR to prefect-databricks, but I don’t really know how to add an interceptor without breaking the signature
n
One obvious solution would be to add a parameter to the current job and introduce an “interceptor”, however this would break the signature because it uses a kwargs argument
it might just be me being naive about databricks but can you clarify what you mean by this? my gut reaction is that we could add a `handler: Optional[Callable]` to that flow and send the ID to it if it's passed, but I also wonder if that would be over-pinning to your exact use case. is that roughly what you meant by "interceptor" though?
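A minimal sketch of the optional-handler idea, in plain Python with no Prefect or Databricks imports. `fake_submit`, `fake_wait`, `submit_and_wait`, and the run ID `1234` are hypothetical stand-ins for illustration, not the prefect-databricks API:

```python
from typing import Any, Callable, Dict, Optional


def fake_submit(**job_kwargs: Any) -> int:
    """Hypothetical stand-in for the task that submits a Databricks job run."""
    return 1234


def fake_wait(run_id: int) -> Dict[str, Any]:
    """Hypothetical stand-in for the polling loop; returns a final state at once."""
    return {"run_id": run_id, "state": "TERMINATED"}


def submit_and_wait(
    handler: Optional[Callable[[int], None]] = None,
    **job_kwargs: Any,
) -> Dict[str, Any]:
    """Submit a job, report its ID through the optional handler, then wait."""
    run_id = fake_submit(**job_kwargs)
    if handler is not None:
        # The ID is reported before polling starts, not after the job finishes.
        handler(run_id)
    return fake_wait(run_id)


seen_ids = []
with_handler = submit_and_wait(handler=seen_ids.append, run_name="nightly")
without_handler = submit_and_wait(run_name="nightly")
```

Because `handler` defaults to `None`, a caller that only passes the existing keyword arguments gets the same result as before.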
e
it’s exactly right
Our users on Prefect expect the Databricks job id / URL to be available for troubleshooting and for visiting the Databricks URL. If a job lasts 10 hours, it only becomes available after 10 hours. I would love the handler parameter. Won’t that be a breaking change to the Python signature, however?
n
I don't believe an optional kwarg added to the flow would necessarily be a breaking change, since if you aren't aware of the change and do not provide a `handler`, then everything would be the same for you. feel free to open an `enhancement` issue requesting this (or even better a PR 🙂 ) and I can float the idea with our collection maintainers
e
Thanks @Nate, how does the release cycle for the tasks / flows work?
n
we don't have a cycle for the collections at this point, it's basically that when something new gets added that people want, we'll cut a release, like in this case. so I'll put a PR up to update the release notes (or you could do that quick before we merge this, no biggie), get that merged, and then I can cut a release
e
💯
thanks
n
hey @Edmondo Porcu - v0.2.1 is released!
e
@Nate one of the problems we face with subflows is that when we log artifacts, these are logged in the Databricks subflow (wait something) rather than in the parent flow
Can we retrieve the parent flow context in Prefect?
n
generally we suggest you pass what you need from parent contexts to sub contexts
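A rough sketch of that suggestion in plain Python, with no Prefect imports. `ARTIFACTS`, `record_artifact`, `wait_subflow`, `parent_flow`, and the job ID `1234` are hypothetical names for illustration; in Prefect itself the parent would read its own run ID from its run context, and that lookup is left out here:

```python
import uuid
from typing import Dict, List

# Hypothetical in-memory artifact store, keyed by flow run ID.
ARTIFACTS: Dict[str, List[str]] = {}


def record_artifact(flow_run_id: str, text: str) -> None:
    """Attach a piece of text to the given flow run ID."""
    ARTIFACTS.setdefault(flow_run_id, []).append(text)


def wait_subflow(parent_run_id: str, job_id: int) -> None:
    """Stand-in for the polling subflow.

    It receives the parent's run ID as an argument instead of trying to
    look it up from its own context.
    """
    record_artifact(parent_run_id, f"databricks job id: {job_id}")


def parent_flow() -> str:
    """Stand-in for the parent flow; returns its own (made-up) run ID."""
    parent_run_id = str(uuid.uuid4())
    wait_subflow(parent_run_id, job_id=1234)
    return parent_run_id


run_id = parent_flow()
```

The only point of the sketch is that the ID travels down as an ordinary parameter, so the subflow never needs access to the parent's context.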
e
yeah that makes sense, we are going to retrieve the current flow id
n
👍