https://prefect.io logo
#prefect-community
Title
# prefect-community
t

Tom Klein

07/10/2022, 9:54 AM
Hey again 🙋 We’re trying to test out the (ad-hoc) Dask cluster option for execution but we’re getting errors it and seems like we’re missing something (code in the thread)
1
so we’re getting this error at the moment:
Copy code
Unexpected error: RuntimeError('Missing dependency kubectl. Please install kubectl following the instructions for your OS. ')
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/prefect/engine/flow_runner.py", line 442, in get_flow_run_state
    with self.check_for_cancellation(), executor.start():
  File "/usr/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.9/site-packages/prefect/executors/dask.py", line 243, in start
    with self.cluster_class(**self.cluster_kwargs) as cluster:
  File "<string>", line 215, in <lambda>
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 503, in __init__
    super().__init__(**self.kwargs)
  File "/usr/lib/python3.9/site-packages/distributed/deploy/spec.py", line 264, in __init__
    self.sync(self._start)
  File "/usr/lib/python3.9/site-packages/distributed/utils.py", line 336, in sync
    return sync(
  File "/usr/lib/python3.9/site-packages/distributed/utils.py", line 403, in sync
    raise exc.with_traceback(tb)
  File "/usr/lib/python3.9/site-packages/distributed/utils.py", line 376, in f
    result = yield future
  File "/usr/lib/python3.9/site-packages/tornado/gen.py", line 769, in run
    value = future.result()
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 647, in _start
    self.forwarded_dashboard_port = await port_forward_dashboard(
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 110, in port_forward_dashboard
    port = await port_forward_service(service_name, namespace, 8787)
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 75, in port_forward_service
    check_dependency("kubectl")
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/common/utils.py", line 38, in check_dependency
    raise RuntimeError(
RuntimeError: Missing dependency kubectl. Please install kubectl following the instructions for your OS.
is having
kubernetes
as an extra for the prefect pip installation, not enough?
this is how the Flow is configured (sensitive info redacted):
Copy code
with Flow(
        FLOW_NAME,
        result=s3_result,
        executor=DaskExecutor(
            cluster_class=lambda: KubeCluster(make_pod_spec(image=prefect.context.image)),
            adapt_kwargs={"minimum": 2, "maximum": 2},
        ),
        run_config=KubernetesRun(
            image="xyz_my_image",
            memory_request="32Gi",
            env=my_env,
        ),
) as flow:
is it really necessary to add the :
Copy code
curl -LO <https://dl.k8s.io/release/v1.21.2/bin/linux/amd64/kubectl> &&\
        install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl &&\
lines to our docker image? i thought that Prefect comes with the necessary Dask/Kubernetes dependencies?
a

Anna Geller

07/10/2022, 10:29 AM
The place from which you run it needs access to the Kubernetes cluster to which you want to deploy a Dask cluster, that's why it seems plausible that kubectl must be installed Can you try installing that and see if it fixes the missing dependency error from dask_kubernetes?
t

Tom Klein

07/10/2022, 10:31 AM
ya, im trying it now (didn’t notice those lines at first) - but ya, in the beginning it was a permission issue (of starting a new service, for example, i suppose for the cluster scheduler) but that was sorted out by our DevOps and now we have this.. just not sure i understand - i thought Prefect comes with
dask-kubernetes
packaged along at least the
Kubernetes
extra (and that the latter would install kubectl if necessary)
a

Anna Geller

07/10/2022, 10:35 AM
No it doesn't. To be fair, George has already explained it quite well in the post 😜 "Adding in dask-kubernetes as a dependency will cause it to be installed using pip during the container build process. Kubectl CLI is a dependency of dask-kubernetes, this is installed by the RUN command in the extra_dockerfile_commands argument."
t

Tom Klein

07/10/2022, 10:38 AM
it doesn’t answer the question fully though - doesn’t (or shouldn’t) Prefect have this dependency built-in? i read in multiple places that Dask is considered a “first class citizen” in Prefect so i’m trying to figure out what this means in practice
and no, it didn’t work 😞 still same error
i verified that
kubectl
was successfully installed but seems like it’s missing config:
Copy code
Step 12/23 : RUN kubectl version                                                                                                                                                                                                 
 ---> Running in 6f3b72feadad                                                                                                                                                                                                    
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.                                                              
The connection to the server localhost:8080 was refused - did you specify the right host or port?                                                                                                                                
Client Version: <http://version.Info|version.Info>{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Plat
form:"linux/amd64"}                                                                                                                                                                                                              
Kustomize Version: v4.5.4                                                                                                                                                                                                        
[SYSTEM]                                                                                                                                                                                                                         
 Message             Failed to build image: honeybook/ds-lead-enrichment:2022                                                                                                                                                    
 Caused by           Container for step title: Building Docker Image, step type: build, operation: Building image.
this is what i don’t get - do we need to handle these low-level configs as part of working with prefect? i thought prefect takes care of the interaction with the kubernetes cluster for us (e.g. we didn’t have this issue when running Kubernetes tasks)
(btw, place from which we run is K8s itself, since the flow is running as a k8s job)
a

Anna Geller

07/10/2022, 11:55 AM
There are always trade-offs - we try to minimize the required dependencies in the core package so if you need Dask on Kubernetes, you install it yourself when needed - this helps to keep things lightweight which is generally good and more pluggable
In Prefect 2.0 this separation is even more explicit and clear - if you want to run the flow on Kubernetes, use Kubernetes flow runner. If you want to run tasks on Dask, you use Dask task runner and provide the Dask cluster configuration on that Dask task runner rather than on the config deciding how to deploy the flow itself
And for dependency management it's also more explicit: you install prefect-dask package for it and then given that Dask can be configured in tens of different ways, you can again customize it based on where this cluster is or should be deployed to
t

Tom Klein

07/10/2022, 12:00 PM
hmm, ok - but even with all dependencies installed, it still seems like there’s a lot of extra low-level config required to make it work — e.g. right now i’m trying to install
aws cli
because it supposedly gives the ability to fetch/create a config file for
kubectl
automatically - and now it’s failing because despite our Prefect deployments (e.g. the agent, the jobs) having a k8s servicerole with full AWS permissions, it still tells me:
Copy code
Removing intermediate container b69c69743ae3                                                                                                                                                                                     
 ---> 4c2d52aa2e3a                                                                                                                                                                                                               
Step 13/24 : RUN aws eks update-kubeconfig --region us-east-1 --name prefect                                                                                                                                                     
 ---> Running in deb16c5fa885                                                                                                                                                                                                    
Unable to locate credentials. You can configure credentials by running "aws configure".
i just hoped that we wouldn’t have to get to these kind of low-level configurations when using Prefect and that once we set it up it simply integrates with what we need (given that it has the permissions) i thought it’s mostly a matter of giving it an executor with an integer representing the number of workers and an image
when i ran kubernetes tasks (e.g.
RunNamespacedJob
) i didn’t have to do any of that stuff
a

Anna Geller

07/10/2022, 12:02 PM
Kubernetes and Cloud permissioning is an art of itself I recommend IAM roles for Service Accounts which are easy to set up with eksctl package and are recommended best practice to manage AWS permissions on AWS EKS -- we've had this conversation before :)
t

Tom Klein

07/10/2022, 12:02 PM
this is already what we have. we’re not missing permissions
a

Anna Geller

07/10/2022, 12:04 PM
In that case you would need to cross check with your admins - the IAM roles are not working if you're getting this permission error
😞 1
t

Tom Klein

07/10/2022, 12:05 PM
i think it’s because we’re talking about the Docker build phase here, not the actual k8s pod. it’s the CI/CD which apparently is missing the credentials, not a deployed pod (and even if it had the credentials, i dont think it makes much sense for our CI/CD to fetch configs from AWS, now that i think about it) is there an option to circumvent this by running the config script when the k8s job (flow) is created, rather than rely on the image? or in other words, is there an easy way to add arbitrary code to run (that isn’t a “task”) on an existing (docker-image based) KubernetesRun ? it can’t be a task because i guess it needs to run before the executor is instantiated?
i guess it’s something like
ENTRYPOINT
for docker?
a

Anna Geller

07/10/2022, 12:18 PM
for Docker build, you shouldn't need that because during build you don't run any code interacting with AWS, you only build it for local Docker run, you can attach your .aws folder to mount credentials --volume ~/.aws:/root/.aws or in Docker storage: volumes=["/Users/anna/.aws:/root/.aws"] check: • https://discourse.prefect.io/t/how-to-bind-mount-a-volume-to-a-flow-run-container-for-e[…]unting-a-volume-with-aws-credentials-to-a-docker-agent/595https://discourse.prefect.io/t/how-to-pass-aws-credentials-to-my-prefect-1-0-flows/667
CI/CD tools have special credentials handling for that - I can give you example for Github actions
t

Tom Klein

07/10/2022, 12:19 PM
Yes, i know - but i don’t think it makes sense to run this configure in the docker necessarily (basically, i don’t think it makes sense to communicate with the k8s cluster from the CI/CD tool) i’ll consult with our devops
👍 1
a

Anna Geller

07/10/2022, 12:19 PM
Copy code
- name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
t

Tom Klein

07/10/2022, 12:20 PM
the interaction with AWS is necessary to get the kubectl configuration programmatically, apparently, because it (the dask executor) cannot seemingly “detect” what k8s cluster it’s running on (unlike Prefect itself?) if it’s using an external tool like
kubectl
btw, the
prefect.context.image
is a legitimate reference? cause i don’t see it in the docs: https://docs.prefect.io/api/latest/utilities/context.html#context-2
@Anna Geller we managed to get it to create the dask cluster successfuly (i think, we see it in the list of kube pods) but - we’re now getting:
Copy code
value = future.result()
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 647, in _start
    self.forwarded_dashboard_port = await port_forward_dashboard(
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 110, in port_forward_dashboard
    port = await port_forward_service(service_name, namespace, 8787)
  File "/usr/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 94, in port_forward_service
    raise ConnectionError("kubectl port forward failed")
ConnectionError: kubectl port forward failed
i understand this is more of a dask-kubernetes issue than a prefect issue — but any chance you have any idea what’s happening?
a

Anna Geller

07/10/2022, 12:54 PM
you may dive deeper into KubeCluster configs to disable Dask dashboard, that may help https://kubernetes.dask.org/en/latest/kubecluster.html#dask_kubernetes.KubeCluster
worker spec looks promising
t

Tom Klein

07/10/2022, 12:59 PM
why do you suspect the
no-dahsboard
would help? the pod is created with everything — i suspect the issue might be more fundamental
this is from the pod’s logs:
o

Oliver Mannion

07/10/2022, 1:33 PM
I think this occurs when KubeCluster tries to locate the dask-root service. If it fails, it assumes it is running outside the cluster and falls back to port-forwarding using kubectl, which requires kubectl.
🙏 1
t

Tom Klein

07/10/2022, 1:34 PM
@Oliver Mannion thanks for chipping in - do you have any idea then how can we solve this? im not sure what the “dask-root service” is? We have kubectl installed there, so it being missing isn’t the issue at the moment
according to the dask docs it seemed that the scheduler runs locally by default and only the workers are remote (i don’t have any special problem with that for an ephemeral run like that? )
i guess i’m wrong:
perhaps it’s related to the fact that prefect doesn’t seem to fully understand the “service name”?
o

Oliver Mannion

07/10/2022, 1:41 PM
The dask root service should be created in your cluster. Do you see it in
kubectl get service
?
🙌 1
t

Tom Klein

07/10/2022, 1:41 PM
@Oliver Mannion r u referring to this:
(i can’t run
kubectl
myself, only our devops have permissions for that, this ^^ is what they sent me to show that the pod was created)
i have a feeling that something doesn’t properly wire the name of the service then…? also, who said that the service type is :
ClusterIP
?
a

Anna Geller

07/10/2022, 2:27 PM
@Tom Klein I'd suggest starting with a local Kubernetes cluster running on Docker Desktop - once you get that working there, you can perhaps move to a playground AWS EKS with admin permissions as a PoC. Once that's working, you can move to your DevOps-managed and restricted cluster this way it's much easier to help as you are troubleshooting things step by step
t

Tom Klein

07/10/2022, 3:02 PM
what you’re suggesting makes a lot of sense but we don’t have enough time for the overhead on going the full path (especially since i’m not a devops and we don’t have enough real devops resources for too much experimentation vs. solving things directly). solving things “brute-force” directly has so far proved to be the faster route 🙂 we already have Prefect running and things are working there pretty well, and it feels like we’re pretty close to being able to launch a Dask Cluster on the fly too — but : • there’s not enough logs to be able to understand what’s going on and why it’s failing • someone ran into the exact same issue (apparently?) and solved it with just permissions so i suspect it might be something similar in our case — though from what i understand port-forwarding is a fallback so not clear why it should even reach that point also, there’s nothing really special about our use-case, seems like a pretty classic use-case for anyone who doesn’t wish to connect to a static (“pre-existing”) dask-cluster --- so it’s kinda strange to me that it doesn’t just work out of the box (the same way the
LocalDaskExecutor
does, for example) it’s even described as “your” (prefect’s) “preferred” way of running: https://stories.dask.org/en/latest/prefect-workflows.html
Copy code
Our preferred deployment of Prefect Flows uses dask-kubernetes to spin up a short-lived Dask Cluster in Kubernetes.
so it’s not like i’m inventing the wheel here, but still feels a bit like it
in your official example you use something called
Dask Gateway
- is that the “best practice” of how to create clusters on the fly? https://docs.prefect.io/core/advanced_tutorials/dask-cluster.html although, it’s not clear from this doc when exactly is the cluster created if it’s just being run from outside of the flow scope?
a

Anna Geller

07/10/2022, 4:08 PM
we don't have generic best practices for that as it depends a lot on your use case, infra, etc.
I don't have any more ideas on how to help, sorry about that if you can't get that working directly on your cluster, my advice is still to take it more step by step starting with a local Kubernetes cluster running on Docker Desktop, then playground AWS EKS with admin permissions, then your DevOps-managed and restricted cluster
t

Tom Klein

07/10/2022, 4:13 PM
@Anna Geller i did try to run it on a local k8s cluster now (minikube, not docker desktop, not sure if it makes a difference?) seems like i got the same thing:
Copy code
[2022-07-10 19:08:20+0300] INFO - prefect.FlowRunner | Beginning Flow run for 'lead_enrich_partial'
[2022-07-10 19:08:20+0300] INFO - prefect.DaskExecutor | Creating a new Dask cluster with `flows.partial_enrich.<lambda>`...
distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO -   Scheduler at: <tcp://192.168.0.189:62569>
distributed.scheduler - INFO -   dashboard at:                     :8787
distributed.core - INFO - Event loop was unresponsive in Scheduler for 200.43s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
[2022-07-10 19:11:40+0300] ERROR - prefect.FlowRunner | Unexpected error: ConnectionError('kubectl port forward failed')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/flow_runner.py", line 442, in get_flow_run_state
    with self.check_for_cancellation(), executor.start():
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.9/site-packages/prefect/executors/dask.py", line 238, in start
    with self.cluster_class(**self.cluster_kwargs) as cluster:
  File "/Users/klay/dev/honeybook/hb-prefect/flows/partial_enrich.py", line 232, in <lambda>
    cluster_class=lambda: KubeCluster(make_pod_spec(image="<http://887300609994.dkr.ecr.us-east-1.amazonaws.com/hb-ds/lead-enrichment:2022|887300609994.dkr.ecr.us-east-1.amazonaws.com/hb-ds/lead-enrichment:2022>"),
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 503, in __init__
    super().__init__(**self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/distributed/deploy/spec.py", line 260, in __init__
    self.sync(self._start)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 310, in sync
    return sync(
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 364, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 349, in f
    result[0] = yield future
  File "/usr/local/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 647, in _start
    self.forwarded_dashboard_port = await port_forward_dashboard(
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 110, in port_forward_dashboard
    port = await port_forward_service(service_name, namespace, 8787)
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 94, in port_forward_service
    raise ConnectionError("kubectl port forward failed")
ConnectionError: kubectl port forward failed
[2022-07-10 19:11:40+0300] ERROR - prefect.lead_enrich_partial | Unexpected error occured in FlowRunner: ConnectionError('kubectl port forward failed')
ERROR:prefect.FlowRunner:Unexpected error: ConnectionError('kubectl port forward failed')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/flow_runner.py", line 442, in get_flow_run_state
    with self.check_for_cancellation(), executor.start():
  File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.9/site-packages/prefect/executors/dask.py", line 238, in start
    with self.cluster_class(**self.cluster_kwargs) as cluster:
  File "/Users/klay/dev/honeybook/hb-prefect/flows/partial_enrich.py", line 232, in <lambda>
    cluster_class=lambda: KubeCluster(make_pod_spec(image="<http://887300609994.dkr.ecr.us-east-1.amazonaws.com/hb-ds/lead-enrichment:2022|887300609994.dkr.ecr.us-east-1.amazonaws.com/hb-ds/lead-enrichment:2022>"),
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 503, in __init__
    super().__init__(**self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/distributed/deploy/spec.py", line 260, in __init__
    self.sync(self._start)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 310, in sync
    return sync(
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 364, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 349, in f
    result[0] = yield future
  File "/usr/local/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/classic/kubecluster.py", line 647, in _start
    self.forwarded_dashboard_port = await port_forward_dashboard(
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 110, in port_forward_dashboard
    port = await port_forward_service(service_name, namespace, 8787)
  File "/usr/local/lib/python3.9/site-packages/dask_kubernetes/common/networking.py", line 94, in port_forward_service
    raise ConnectionError("kubectl port forward failed")
ConnectionError: kubectl port forward failed
ERROR:prefect.lead_enrich_partial:Unexpected error occured in FlowRunner: ConnectionError('kubectl port forward failed')
distributed.scheduler - INFO - Scheduler closing...
distributed.scheduler - INFO - Scheduler closing all comms
ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x122830970>
ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x12284b310>
however, i don’t think the flow was actually being run on (my local) Kubernetes despite the Kubernetes run config?
a

Anna Geller

07/10/2022, 4:14 PM
can you share the flow you ran but which caused those issues on local K8s?
some minimal reproducible example might be a good start
Can you try this example? https://github.com/anna-geller/packaging-prefect-flows/tree/master/ephemeral_dask_cluster it works for me on a K8s cluster locally with Docker Desktop
t

Tom Klein

07/10/2022, 4:41 PM
@Anna Geller i know the difference between a local dask executor and a non-local one… it doesn’t help me fire up the non-local one though 🙂 you mean - to try and test your example on our existing EKS? do you have some doc with an example of how to launch something like that locally (i just setup minikube, configured the RBAC, etc. but i feel like im getting lost in an irrelevant rabbit hole)
can we package the
yaml
into the registered flow (stored on S3) ? or does it need to be available on the image? i’m guessing the latter? wait, you’re not even using the
yaml
in your flow though?
ok, the example worked on our EKS so that’s a good sign 🙂 it leads me to believe maybe we’re somehow missing a dependency on the dask image? i see you’re using a custom one:
Copy code
annageller/prefect-dask-k8s:latest
can you share the Dockerfile for it?
interestingly, even when the dask image was set to the (very basic, and default) :
<http://ghcr.io/dask/dask|ghcr.io/dask/dask>
it still failed, as long as the Prefect flow ran on our custom image (despite it having
kubectl
,
dask-kuberentes
,
dask
, etc.) so it seems like a missing dependency on the “parent” (prefect flow / k8s job) rather than the “child” (dask scheduler or worker)
here is a Dockerfile that reproduces the problem:
Copy code
FROM node:12-alpine

WORKDIR /usr/src/app

RUN apk update
RUN apk add bash curl py3-pip
RUN apk add --no-cache  build-base libffi-dev openssl-dev g++ python3-dev
RUN pip3 install --upgrade pip setuptools wheel
RUN pip3 install numpy pandas
RUN pip3 install snowflake-connector-python --no-use-pep517
RUN pip3 install prefect[github,aws,kubernetes,snowflake,dask] --use-deprecated=legacy-resolver
RUN pip3 install dask
RUN curl -L "<https://dl.k8s.io/release/$(curl> -L -s <https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl>" -o /usr/local/bin/kubectl
RUN chmod +x /usr/local/bin/kubectl
a

Anna Geller

07/10/2022, 5:20 PM
Thanks for sharing your approach, it may help others who stumble upon the same issue. If the example worked on your cluster, are you good now?
t

Tom Klein

07/10/2022, 5:23 PM
@Anna Geller well, thanks to your help - i narrowed down the problem but i’m still pretty far from being good 😄 it still doesn’t work with out custom image which is critical to be able to run on dask (since it has the specific NodeJS script we are trying to run in parallel)
it only works with your image, or - the default prefect image along with :
Copy code
env={"EXTRA_PIP_PACKAGES": "dask-kubernetes"}
but our custom image has the same thing, which makes it a phantom problem…. either some sort of version conflict/mismatch or an issue with the fact we use Alpine linux
and --- it even fails when the just the parent (i.e. prefect flow) has this custom image, even if the dask tasks are fired with vanilla images
a

Anna Geller

07/10/2022, 6:32 PM
Do you believe that Dask is the right solution then? Dask is more to parallelize Python. Given that based on what I remember your tasks are Kubernetes tasks that submit work to other pods, to trigger your node applications in parallel, LocalDaskExecutor and mapping over a list of those Kubernetes tasks (where each will run in a separate pod anyway) seems to be sufficient. But of course I don't know the details so I may be wrong here.
t

Tom Klein

07/10/2022, 6:39 PM
@Anna Geller i thought about that, and it does make sense --- but in our particular case it makes writing the flow much more cumbersome, cause it means we need to adjust the code in the NodeJS to be “prefect-aware” basically (or rather - support being called in the special use-case of receiving input externally and sending it externally) - it makes it harder to interact with, vs. when it’s being run in the same VM as the flow obviously, in an ideal world we’d just have a NodeJS server that executes this logic but that’s not exactly where we’re at and not the path we took (for various historical reasons) today this NodeJS script is being invoked by a k8s internal cron scheduler (we wanna migrate it to Prefect anyway) - pulls stuff from an SQS queue - and then runs on those this is the “Routine” case, in the non-routine cases, we have a very large batch of let’s say 50K, 500K inputs etc, that we wish to process ASAP (rather than a “drizzle” through SQS) so for that we (until today) manually ran the NodeJS script using a bash wrapper that wired in an input file and an output_file_path -- and then we literally ran those (in a loop on EC2) until done, and then we SCP’d all the CSVs to a local PC and from there pushed to S3 obviously - very cumbersome, very error-prone, no visibility, monitoring, etc. this is exactly why we’re trying to migrate it to prefect
a

Anna Geller

07/10/2022, 6:40 PM
And regarding your Dockerfile, I don't know what's your goal with this Dockerfile is, but for Prefect you need an image that is based on Prefect Docker image and has Prefect entrypoint that starts a flow run, you're currently overriding that which won't work
t

Tom Klein

07/10/2022, 6:40 PM
wdym “won’t work”? it does work - it runs the flow perfectly. it only fails when we try to run it with a dask task executor. If we use a
LocalDaskExecutor
for example it works fantastic (ally?)
this is exactly what i’m asking: what about the prefect image are we missing? which dependency? in the end it’s not magic, it’s just software - so something must be different --- and different in a very unique way to only make
DaskExecutor
not work correctly but everything else work fine (and even with that, it’s not that it fails on like a missing package, it runs — it just strangely fails on something, tries to fallback on
kubectl
, and fails that too — probably as a completely irrelevant side-effect that hides the true issue)
in your docs it says nothing about needing to rely on the Prefect docker image, in fact, it literally says that the only required dependency is
prefect
(and that it needs to be in the PATH) . that’s it
it even literally says we don’t have to rely on it
a

Anna Geller

07/10/2022, 6:46 PM
This thread is already 70-messages long with no resolution in sight, which suggests that something is not right - either your problem description is not precise enough or you're not clear about what's the actual question you want to ask. Slack is more for ad-hoc questions rather than 70-100 messages long discussions I'd suggest creating a GitHub issue with a summary of what is a problem you're trying to solve. Otherwise, it's not clear what is even a problem here and where we can consider this issue as resolved
👍 1
for now, I'm marking this thread as done and suggest creating a GitHub issue with a problem description if you still need help, you may share the link to the issue here Thanks a lot for understanding 🙌
t

Tom Klein

07/10/2022, 6:48 PM
np, ill create an issue - thanks
👍 1
4 Views