# ask-community
l
Hi guys, I was working on a Dask deployment on EKS back when `DaskKubernetesEnvironment` was supported, and that worked great. Now I'm using `run_configs` and `executors`, and I'm getting the botocore region error shown below. I'm not sure whether I'm missing an argument for `DaskExecutor`. Please check the code below:
```python
flow.storage = Docker(registry_url=ecr,
                      python_dependencies=python_dependencies,
                      files={f'{getcwd()}/src/dask_flow': '/modules/dask_flow'},
                      env_vars={"PYTHONPATH": "$PYTHONPATH:modules/"},
                      image_tag='latest')

flow.run_config = KubernetesRun()
flow.executor = DaskExecutor(
    cluster_class="dask_cloudprovider.aws.FargateCluster",
    cluster_kwargs={'n_workers': 5, 'region_name': aws_region})
```
The error displayed is:
```
NoRegionError: You must specify a region.
```
k
Hi @Leandro Mana, are you getting this when you register? or when you run the flow?
l
Hi Kevin, I'm getting that when the flow is running.
I even played with this:
```python
flow.run_config = KubernetesRun(env={
    'AWS_DEFAULT_REGION': aws_region,
    'AWS_SECRET_ACCESS_KEY': environ.get('AWS_SECRET_ACCESS_KEY'),
    'AWS_ACCESS_KEY_ID': environ.get('AWS_ACCESS_KEY_ID'),
    'AWS_SESSION_TOKEN': environ.get('AWS_SESSION_TOKEN')})
flow.executor = DaskExecutor(
    cluster_class="dask_cloudprovider.aws.FargateCluster",
    cluster_kwargs={'n_workers': 5})
```
and I see that it is getting the region and credentials from `KubernetesRun()`.
k
Are you using the ECS agent?
l
No, this is on EKS.
The idea is to set up flows in a serverless way, with EKS on Fargate.
Previously, with `DaskKubernetesEnvironment`, it worked perfectly,
but now, using `run_config` and `executor`, I can't find a way. Flows are registered properly, but they fail when deployed to EKS.
The version above, which sets all those env vars, also fails, in a different way. So, to simplify the question: how can I use `DaskExecutor` to run on EKS,
given that the `cluster_class` uses dask_cloudprovider and EKS is not mentioned there? https://cloudprovider.dask.org/en/latest/aws.html#fargate
k
Looking into this
l
Thanks for the help ; )
Also, not sure if this helps: the Prefect docs have `cluster_class="dask_cloudprovider.FargateCluster"`, but I had to install that module via pip, so in my code it is `cluster_class="dask_cloudprovider.aws.FargateCluster"`, with an `.aws` in the path. Without installing it via pip, the health check complained that there is no `dask_cloudprovider` module.
k
I read the code and it looks like the right approach to me. I'll probably have to get back to you tomorrow. I'll find someone who knows more than me.
l
No problem, thanks for taking the time to look into this ; )
@Kevin Kho Hi mate, did you have a chance to check further?
k
Yes I have! One sec I have a response for you
l
great ; )
k
The issue is that the `AWS_...` environment variables are only being set in the flow runner pod (the initial job process) and not in all the Dask workers, so any task/result that needs access to AWS will fail. You could fix this by also specifying the `env` variable in the `cluster_kwargs` to `DaskExecutor`, or you could set up EKS so that AWS IAM roles are automatically forwarded to pods (you need to add a service account to the pod for that to work). Lots of options. If the only goal is “scale Dask out on AWS” though, then using dask-cloudprovider with `FargateCluster` would be a simpler story, as Kubernetes adds additional complications.
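One way to check this diagnosis is to compare what the flow runner pod and a Dask worker actually see. The snippet below is a minimal sketch using only the standard library; `aws_env_report` and `AWS_VARS` are hypothetical names introduced here, not part of Prefect or dask-cloudprovider. It reports only whether each variable is set, so no secret values end up in logs.

```python
import os

# The AWS variables discussed in this thread.
AWS_VARS = (
    "AWS_DEFAULT_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def aws_env_report(environ=None):
    """Return {name: bool} saying whether each AWS variable is set and non-empty.

    Values are never returned, only presence, so the result is safe to log.
    """
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in AWS_VARS}
```

Calling `aws_env_report()` from inside a task and logging the result would show what the worker process sees. If `AWS_DEFAULT_REGION` comes back `False` there while it is set in the flow runner pod, that would be consistent with the explanation above.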
l
OK, I'll try the env setup on each pod. The weird thing is that this worked perfectly with `DaskKubernetesEnvironment`.
The reason for using EKS on Fargate is that it allows serverless pods.
If I do this with ECS on Fargate, will it be the same?
Also, I think that when I added the AWS_ env vars to the `DaskExecutor`, the error was something like not finding the `prefect` module, which is even weirder : ( I'll confirm this shortly.
How do I set the `env` variable in the `cluster_kwargs` passed to `DaskExecutor`? That class does not have an `env` keyword. Does it need to be a string?
k
`FargateCluster` inherits from `ECSCluster`, which has an `environment` argument that takes a dictionary. This is the way to pass env variables. So it would be
`DaskExecutor(cluster_kwargs={"environment": AWS_CREDS_AND_REGION})`, where `AWS_CREDS_AND_REGION` is a dict of the AWS credential and region variables.
`FargateCluster` inherits from `ECSCluster` in dask_cloudprovider, so I think it will be the same.
The missing `prefect` module may be a matter of passing the image to `KubernetesRun`.
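Spelled out a bit more, the `environment` suggestion could look like the sketch below. It assumes the credentials are available in the process that builds the executor and that `prefect.executors.DaskExecutor` is the import path for the Prefect version in use. The helper names `aws_environment`, `fargate_cluster_kwargs`, and `make_executor` are hypothetical and introduced only for illustration.

```python
import os


def aws_environment(region, environ=None):
    """Collect the AWS variables that the Dask scheduler and workers should see."""
    env = os.environ if environ is None else environ
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")
    out = {"AWS_DEFAULT_REGION": region}
    # Copy only the variables that are actually set, to avoid passing None values.
    out.update({name: env[name] for name in names if env.get(name)})
    return out


def fargate_cluster_kwargs(region, n_workers=5, environ=None):
    """Build cluster_kwargs for FargateCluster, including the `environment` dict."""
    return {
        "n_workers": n_workers,
        "region_name": region,
        "environment": aws_environment(region, environ),
    }


def make_executor(region, n_workers=5):
    """Create the DaskExecutor; Prefect is imported lazily (assumed import path)."""
    from prefect.executors import DaskExecutor

    return DaskExecutor(
        cluster_class="dask_cloudprovider.aws.FargateCluster",
        cluster_kwargs=fargate_cluster_kwargs(region, n_workers),
    )
```

With this, `flow.executor = make_executor(aws_region)` would replace the inline `DaskExecutor(...)` call. Passing credentials as plain environment variables is the simplest option, but the IAM role approach mentioned earlier avoids copying secrets into the cluster configuration.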
l
Great, I'll test all this and let you know how it works.