Prefect Community #prefect-server

Greg Desmarais

07/20/2020, 3:40 PM

Hey all - I've been struggling a bit, and have some decently long threads. I'll try a question that is hopefully to the point. I have a local bit of code that creates a flow for a Fargate environment. I can submit that flow to the Prefect server (on an EC2), and I have an agent that picks up that flow (after a manual run in the UI). The agent ends up hanging after `Completed flow run submission`. No Fargate cluster is ever created. What needs to happen on the agent (or Prefect server?) to see that the next steps are taken (e.g. cluster created, work started)?

Greg Desmarais

07/20/2020, 3:41 PM

For example, a snippet of client driver code:

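# Imports assume a Prefect 0.12-era layout; the paths differ in later releases.
from prefect.engine.executors import DaskExecutor
from prefect.environments import FargateTaskEnvironment

# cluster_kwargs, task_definition_kwargs, DEFAULT_REGION, labels, and flow
# are defined elsewhere in the driver code.
executor = DaskExecutor(cluster_class='dask_cloudprovider.FargateCluster',
                        cluster_kwargs=cluster_kwargs)
flow.environment = FargateTaskEnvironment(
    executor=executor,
    region_name=DEFAULT_REGION,
    **task_definition_kwargs
)
# flow.environment = LocalEnvironment(executor=executor, labels=labels)

flow_id = flow.register(labels=labels)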

Greg Desmarais

07/20/2020, 3:42 PM

The agent is running on EC2 (alongside the Prefect server), and the EC2 has a role assigned with full ECS permissions.

Kyle Moon-Wright

07/20/2020, 3:53 PM

Hey @Greg Desmarais, I'm not too seasoned with Fargate, but running the agent with `--verbose` should give you some info as the agent starts up, such as what labels it's running with and which endpoint it's querying.

Greg Desmarais

07/20/2020, 3:58 PM

I have it running with -v (assuming same as --verbose). I do see some extra info, and the tags seem right as the agent picks up the flow. It just never does anything with it.

Greg Desmarais

07/20/2020, 4:01 PM

But I very much appreciate the suggestion @Kyle Moon-Wright - I'm not getting much help on this, so am wandering in the dark.

Kyle Moon-Wright

07/20/2020, 4:18 PM

Yeah, of course. This may be an issue with your FargateTaskEnvironment configuration. Do the logs / state in the UI reveal anything (or maybe in Fargate itself)? The FargateAgent should be able to handle work with a DaskExecutor without that environment, so I wonder if you can use those configurations with your Agent instead.

Greg Desmarais

07/20/2020, 4:22 PM

That is an interesting suggestion. I'm actually a bit down that path right now. I'd like to be able to create my environment uniquely for each flow created, but I might have to have a static set associated with the agent.

Greg Desmarais

07/20/2020, 4:27 PM

I haven't found anything insightful in the UI logs - but I did notice in Fargate that it was trying to operate in the `default` cluster, even though my executor has a different `cluster_name_template`. I'm finding two things very frustrating:
1. The serialization process when pushing a flow to the Prefect server strips out a lot of parameters that are set on the environment/executor - seemingly arbitrarily.
2. The process of running the flow filters provided kwargs very aggressively, for example ignoring container definition arguments like memory or image name.
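The kwarg filtering described above can be pictured as an allow-list pass. The following is only a sketch of that general pattern, not Prefect's actual code: the key sets and the `split_kwargs` helper are hypothetical and chosen purely for illustration. It shows how an argument that is not on either list disappears without an error. Separately, ECS itself runs a task in the `default` cluster when no `cluster` argument reaches the run-task call, which would be consistent with the behavior seen here.

```python
# Hypothetical allow-lists: illustrative only, not Prefect's real lists.
TASK_DEFINITION_KEYS = {"cpu", "memory", "taskRoleArn", "executionRoleArn"}
RUN_TASK_KEYS = {"cluster", "networkConfiguration", "platformVersion"}


def split_kwargs(user_kwargs):
    """Sort kwargs into task-definition, run-task, and dropped groups."""
    definition, run, dropped = {}, {}, {}
    for key, value in user_kwargs.items():
        if key in TASK_DEFINITION_KEYS:
            definition[key] = value
        elif key in RUN_TASK_KEYS:
            run[key] = value
        else:
            # Anything not on an allow-list would silently vanish;
            # collecting it makes the loss visible.
            dropped[key] = value
    return definition, run, dropped


if __name__ == "__main__":
    definition, run, dropped = split_kwargs(
        {"cpu": "1024", "image": "my/image:latest", "cluster_name_template": "dask-{uuid}"}
    )
    print("task definition:", definition)
    print("run task:", run)
    print("dropped:", sorted(dropped))
```

Running a check like this over the kwargs before registering a flow is one way to see which settings would not survive, rather than discovering it from the ECS console afterwards.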

Greg Desmarais

07/20/2020, 4:27 PM

I know this isn't on you - I'm just wondering if I've gone down the right path with prefect or should I jump now to something like airflow.

Chris White

07/20/2020, 4:38 PM

Hi Greg - we generally don’t recommend people jump to Fargate deployments unless they’re intimately familiar with Fargate already; perhaps you would be better off starting with a Docker agent

Greg Desmarais

07/20/2020, 4:40 PM

I'm pretty familiar with Fargate - do you see a newbie mistake I'm making?

Chris White

07/20/2020, 4:40 PM

The Fargate agent is almost entirely community maintained (I have personally never used Fargate), so we’d welcome any github issues (or PRs!) for improvements

Greg Desmarais

07/20/2020, 4:41 PM

crap

Greg Desmarais

07/20/2020, 4:42 PM

With complete honesty - do you think I should bail on fargate altogether w/rt prefect? Until more maturity?

Greg Desmarais

07/20/2020, 4:43 PM

I'm asking as someone with a team of compbio scientists looking for a good remote dask and workflow solution.

Greg Desmarais

07/20/2020, 4:45 PM

It can be raw/rough, but I have to be able to make it work. Right now, I'm dealing with a lot of black boxes - things like many layers of tools for serialization and the like. If it came down to a fargate usage issue, I think I could handle that. I'm just getting hung up in all the steps between creating a flow and having it kick off.

Greg Desmarais

07/20/2020, 4:46 PM

If you had to choose a setup to run a flow with dask executors on a cluster, what would you choose? Would you even look at ECS with prefect, or is the k8s support so much better that you'd choose that?

Chris White

07/20/2020, 4:46 PM

I can’t say definitively - the errors you are describing are Fargate issues, not Prefect issues. But yea, if you aren’t able to debug them then I definitely recommend switching to another agent (the Kubernetes Agent is probably the most popular, and the Docker Agent is probably the simplest). Unless you deeply require something that Fargate provides you shouldn’t notice a difference - I would personally choose a K8s agent

Chris White

07/20/2020, 4:47 PM

ECS would work well with a Docker agent but might be harder to create large multi-machine dask clusters from

Greg Desmarais

07/20/2020, 4:47 PM

I absolutely require large, multi machine clusters.

Greg Desmarais

07/20/2020, 4:47 PM

I didn't choose the docker agent because of that...

Chris White

07/20/2020, 4:47 PM

Gotcha, then yea the K8s agent is probably the way to go

Greg Desmarais

07/20/2020, 4:48 PM

At any given time I may have a dozen of the biggest AWS machines running GPUs at once.

Greg Desmarais

07/20/2020, 4:48 PM

crap again. I don't know k8s, and that gives me another tech to learn.

Greg Desmarais

07/20/2020, 4:49 PM

I'm not griping about the prefect team work - I know I'm in a 0.x product, in a configuration that is not a first class citizen.

Chris White

07/20/2020, 4:51 PM

Well, I don’t think this is related to any sort of lack of support or being 0.x - a large portion of our clients use the fargate agent and we haven’t had to dip too deeply into fargate ourselves because things work fine. If I had to guess, your situation sounds like you haven’t created a network from your fargate environment back to the API that you are hosting, so the fargate task that the Agent creates can’t communicate back to your API. This is generally a situation where using Prefect Cloud is convenient, because you don’t need to worry about networking complexities
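One way to test the networking guess above is a plain TCP reachability check, run from inside the same VPC, subnet, and security group that the Fargate task uses. This is a minimal sketch using only the Python standard library; the `can_reach` helper is hypothetical, and the host and port in the example comment are assumptions (4200 is assumed to be the port where the self-hosted server's GraphQL API listens).

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example (hypothetical address of the EC2 instance running Prefect Server):
#   can_reach("10.0.0.12", 4200)
```

A `False` result from the task's network would point at security groups, routing, or the API address the task was given, rather than at the flow definition itself.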

Greg Desmarais

07/20/2020, 4:55 PM

🤞 Any chance someone has a working example of the usage of executors/environments for that? 🤞

Greg Desmarais

07/20/2020, 4:55 PM

But it does give me some hope - I'd rather be wrong and have it possible than the other way around.

Chris White

07/20/2020, 4:59 PM

Looks like there was another user who got Fargate + Server working: https://prefect-community.slack.com/archives/CL09KU1K7/p1594715268088300?thread_ts=1594715268.088300

Greg Desmarais

07/20/2020, 5:01 PM

Thank you for that - his situation is def. similar, and I've been trying to follow his pattern. Not sure where I'm screwing up.
