# prefect-community
a
I'm on v2.0b7 and am trying to get a minimal code example to work with a Dask cluster. I can use DaskTaskRunner locally, but as soon as I add the TCP address to my test cluster, the system fails. I see a very lengthy error on the Dask worker that includes: "prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url" I am able to submit jobs successfully using Dask directly, just not through Prefect. Here is the code:
from prefect import flow, task
from prefect_dask.task_runners import DaskTaskRunner

@task
def say_hello(name):
    print(f"hello {name}")

@flow(task_runner=DaskTaskRunner(address='tcp://[redacted]:8786'))
def greetings():
    say_hello('test')

if __name__ == "__main__":
    greetings()
Any ideas?
z
It sounds like your Dask worker cannot contact the Prefect API
(Since it’s our HTTPStatusError, it’s coming from our client)
What’s your `PREFECT_API_URL` set to? (`prefect config view`) Is it contactable from the Dask cluster?
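One way to check whether a machine can reach the API is to attempt a TCP connection to the host and port in the URL. The sketch below uses only the standard library; `can_reach` is a hypothetical helper name, and the check only proves that the port accepts connections, not that a Prefect API is answering on it. Unlike `ping`, it takes the full URL rather than a bare hostname.

```python
import socket
from urllib.parse import urlsplit


def can_reach(api_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(api_url)
    if parts.hostname is None:
        raise ValueError(f"no hostname in {api_url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

To run it from a worker's point of view, calling `can_reach("[the url]")` in a Python session on that machine should be enough. With a connected `dask.distributed.Client`, `client.run(can_reach, "[the url]")` should run the same check on every worker and return one result per worker address.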
a
Thanks for the quick response! The worker machine indeed has no internet connection. Makes perfect sense, addressing now.
Hi again, I worked with IT on this and made sure that all worker machines have internet access. However, this did not solve the problem. I am still getting the same error. How would I confirm that the Dask cluster can contact the PREFECT_API_URL? I tried running "ping [the url]" on one of the dask-worker machines and got "Ping request could not find host [the url]. Please check the name and try again." However, I was able to view the dashboard from that same worker.
I also tried "Test-NetConnection" in PowerShell. Here was the result:
One more piece of information... the PrefectHTTPStatusError is for 'http://ephemeral-orion/api/task_runs/...' Should this instead be the same as the PREFECT_API_URL?
z
Yeah, that looks like `PREFECT_API_URL` isn’t set on your worker at all.
Is `PREFECT_API_URL` set on the flow run?
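The telltale sign in this thread is the hostname in the failing URL. A minimal sketch of that check, assuming `ephemeral-orion` is the hostname the client uses when no API URL is configured (as the error message here suggests); `uses_ephemeral_api` is a hypothetical helper, not part of Prefect:

```python
from urllib.parse import urlsplit

# Hostname seen in the error message in this thread.
EPHEMERAL_HOST = "ephemeral-orion"


def uses_ephemeral_api(error_url: str) -> bool:
    """Return True if the URL from an error message targets the ephemeral API."""
    return urlsplit(error_url).hostname == EPHEMERAL_HOST
```

If this returns True for the URL in a `PrefectHTTPStatusError`, the process that made the request was probably not given a `PREFECT_API_URL`.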
a
How would I set the `PREFECT_API_URL` for a flow run? Here's what I've tried:
• Go to submitter machine (the machine where I run the originally posted code)
• Launch the Anaconda Prompt and activate an environment ("dask-demo") that has Dask and Prefect installed
• Run `prefect orion start` in the Anaconda Prompt for the submitter machine
• Note the `PREFECT_API_URL` that is printed after the Prefect logo in the console
• Go to the machine that will be the dask-scheduler
• Launch the Anaconda Prompt and activate "dask-demo"
• Run `prefect config set PREFECT_API_URL=[the url]` where [the url] is the previously noted `PREFECT_API_URL`
• Run `dask-scheduler`, note the TCP address
• Go to the machine that will be the dask-worker
• Launch the Anaconda Prompt and activate "dask-demo"
• Run `prefect config set PREFECT_API_URL=[the url]`
• Run `dask-worker [the tcp address]`
• Go to the submitter machine and launch another instance of the Anaconda Prompt and activate "dask-demo"
• Run `python [full path to my test script]` to run the originally posted code
After following those steps I see "Crash detected!" and flow run "Finished in state Failed" in the prompt on the submitter machine. I also see `prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url 'http://ephemeral-orion/api/task_runs/.../set_state'`
z
Thanks for the detailed description! The API URL is passed from the flow run to the task runs. So you’ll want to set the `PREFECT_API_URL` on the submitter machine. We’ll automatically set it within each task run when they are run on the Dask worker. If this URL needs to be different for the flow run and task run then I think you’ll run into problems as I don’t think we support splitting this yet.
Since you’re running the flow on the submitter machine without setting an API URL, it’s using an ephemeral version of the API instead of talking to the one you ran with `prefect orion start`. You still see your runs show up because the ephemeral API is talking to the same database.
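As a rough sketch of "set the URL on the submitter": the flow script can be launched in a child process with `PREFECT_API_URL` in its environment, assuming Prefect picks the setting up from an environment variable of that name. `run_with_api_url` is a hypothetical helper, and the URL in the usage note is a placeholder for whatever `prefect orion start` printed.

```python
import os
import subprocess
import sys


def run_with_api_url(script_path: str, api_url: str) -> subprocess.CompletedProcess:
    """Run a Python script in a child process with PREFECT_API_URL set."""
    # Copy the current environment and add (or override) the API URL.
    env = {**os.environ, "PREFECT_API_URL": api_url}
    return subprocess.run(
        [sys.executable, script_path],
        env=env,
        capture_output=True,
        text=True,
        check=False,  # Let the caller inspect returncode and stderr.
    )
```

For example, `run_with_api_url("[full path to my test script]", "[the url]")` would run the flow script with the URL set only for that process. Running `prefect config set PREFECT_API_URL=[the url]` on the submitter before starting the script is the approach actually used in this thread.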
a
Okay amazing, it looks like it's working! And great to know that only the submitter machine needs the `PREFECT_API_URL`. One assumption I made was that the submitter machine would already know that URL path since that's the machine where the `prefect orion start` was running. However, I now understand that since Prefect Orion and the Python script were each running in separate conda prompt instances, that API URL was only known to the Prefect Orion conda prompt instance. I went ahead and removed the API URL info from all of the Dask machines and confirmed that indeed all I need to do is run `prefect config set PREFECT_API_URL=[the url]` and then run the Python script from that same conda prompt instance. Thank you again so much for the help!
z
Wonderful! cc @terrence we’ll need documentation on this pattern in the future.