# ask-community
Hi, quick question about task slugs. Context: I'm executing my flow and it fails. I have the slugs of the tasks that failed (from flow.slugs[&lt;task&gt;]), and based on those I'd like to retrieve each task so that I can investigate its dependencies using flow.upstream_tasks(). However, when I look at flow.tasks, the slug is null for every task, and flow.get_tasks(slug=X) doesn't work. I could do this by reverse-mapping flow.slugs, which does seem to be populated, but it feels like a hack, so I wanted to make sure I'm not missing a simpler way first. Is that expected behaviour, or am I missing something?
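The reverse-mapping workaround mentioned above could be sketched like this. It assumes (Prefect 1.x style) that flow.slugs is a dict mapping each Task object to its slug string; plain strings stand in for Task objects here so the sketch runs without Prefect installed, and get_task_by_slug is a hypothetical helper name:

```python
# Stand-in for flow.slugs: {task: slug}. In real Prefect 1.x code the
# keys would be Task objects, not strings.
flow_slugs = {"extract_task": "extract-1", "transform_task": "transform-1"}

# Invert the mapping once: {slug: task}
slug_to_task = {slug: task for task, slug in flow_slugs.items()}

def get_task_by_slug(slug):
    """Look up a task by its slug, raising a clear error if unknown."""
    try:
        return slug_to_task[slug]
    except KeyError:
        raise KeyError(f"No task with slug {slug!r}")

# The returned task could then be passed to flow.upstream_tasks(task)
failed_task = get_task_by_slug("transform-1")
```

The one-time inversion keeps each lookup O(1), which matters if you are resolving many failed slugs after a run.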
Hey @Krzysztof Nawara, I don’t know this immediately. I’ll look into it for you.
Are you trying to do this in a state handler? Or trying to do this in another script without the Flow definition?
I'm running Prefect in "local" mode, in my JupyterLab
So I have full access to the flow definition, which I later run to get the state
I've got some strange parts in my setup, but I'll try to create a minimal example and see if it also behaves like that
Oh so you don’t register and run it against Cloud or Server? Asking because we have a mechanism that lets you retrieve this information by querying the GraphQL API.
No, I'm doing this locally. It's a quasi-ML pipeline, so I want to have full access to task results in the notebook for analysis. Looks like it suffers from the same issue when running a basic pipeline (that fails): https://pastebin.com/bb7cNHe7
Ok, will ask the team about it
Small correction: in the previous example, for some reason the state ended up being NoneType; now, after re-running, it's Failed
So the task slugs are populated when the flow is serialized (for registration to the backend). When running locally, this won't be populated. The suggestion is to use the flow.slugs dictionary instead of the task.slug attribute.
Gotcha, thanks! It also means that Flow.get_tasks(slug=X) is going to be broken on local, which you may or may not consider to be a bug (depending on your point of view). But it was definitely unexpected (for me at least)
This is on the radar for sure (and there are some other differences between running with a backend). No promises on when it will change, though; we don't have a lot of users running purely locally. Would love to learn more about your use case. Also, just wanna be sure you're aware Cloud is free to get started with.
That's my goal: to make sure you're aware 🙂 As for the use case: for me, easy access to task results when using flow.run() is a huge benefit of local execution. It's big enough that I'm actually running with a backend, saving task results, copying them back to a local folder, and then running the same pipeline in local mode, where it just loads the saved task results. This lets me quickly retrieve results by tag, by slug, or by task name, which makes analysis that much easier. Almost like in Metaflow: https://docs.metaflow.org/metaflow/tagging If there is a better way to do it, I'd be happy to hear about it
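The retrieve-by-tag/slug/name pattern described above could be sketched like this. It assumes (Prefect 1.x style) that a local flow run yields a mapping from tasks to their states, with each task carrying name, slug, and tags; minimal stand-in classes and the results_by helper are hypothetical so the sketch runs without Prefect:

```python
class Task:
    """Stand-in for a Prefect Task: just name, slug, and tags."""
    def __init__(self, name, slug, tags=()):
        self.name, self.slug, self.tags = name, slug, set(tags)

class State:
    """Stand-in for a task State holding the loaded result."""
    def __init__(self, result):
        self.result = result

# Stand-in for the {task: state} mapping a local flow.run() produces
run_results = {
    Task("load", "load-1", tags={"io"}): State(result=[1, 2, 3]),
    Task("train", "train-1", tags={"ml"}): State(result="model"),
}

def results_by(attr, value, results):
    """Filter results by a task attribute: 'name', 'slug', or 'tag'."""
    if attr == "tag":
        return {t: s.result for t, s in results.items() if value in t.tags}
    return {t: s.result for t, s in results.items()
            if getattr(t, attr) == value}

ml_results = results_by("tag", "ml", run_results)
```

Keeping the lookups attribute-driven like this gives roughly the Metaflow-style tagging workflow mentioned above, without needing a backend for analysis.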
👍 1