Thread
#prefect-community
    Emma Rizzi

    1 year ago
    Hello, I'm trying to use Prefect Server deployed on an AWS instance with an ECS agent running on an ECS cluster. I created a small hello-world task that prints 'hello' to the logger. It runs correctly when I create a Local agent on the same instance where the server is deployed, but I can't manage to get the cluster working... Here's what happens:
    • I created an ECS agent running on my instance alongside Prefect Server, configured with my cluster ARN
    • I register my flow and quick-start it in the UI; in the AWS console I can see the task being created, with its status going from Pending to Running to Stopped. The task definition carries the tag [INACTIVE], though, which seems weird 🤔
    • In the Prefect UI the task is stuck in the "Submitted" state and nothing is displayed in the logs
    I tried to follow this and this and added task_role_arn and execution_role_arn with lots of permissions, but that doesn't seem to be the problem. I also tried adding GCS as the storage option, with no success. I have no idea what's causing this issue, so here I am hoping someone knows 🙂
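    For anyone reproducing this setup: in Prefect 0.x the ECS agent described above is started from the instance with the CLI. The cluster ARN and label below are placeholders for illustration, not values from this thread — a rough sketch only:
    ```shell
    # Point the local CLI at Prefect Server rather than Prefect Cloud
    prefect backend server

    # Start an ECS agent against a specific cluster
    # (ARN and label are assumed placeholders)
    prefect agent ecs start \
        --cluster arn:aws:ecs:eu-west-1:123456789012:cluster/my-cluster \
        --label s3-flow-storage
    ```
    The agent's label must match the labels on the flow's run config, or the flow run will sit in Submitted forever.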
    Jenny

    1 year ago
    Hi Emma! What kind of storage are you using? (Background to my question in the thread here!)
    Oh sorry - just saw you mention GCS storage. Have you tried S3 Storage?
    Emma Rizzi

    1 year ago
    Hi! I first tried with no storage configuration, then with a GCS bucket. With the bucket storage I can see the files being created successfully. I'll try S3 storage then!
    So I just tried S3 storage and the flow run is still stuck on Submitted, even though I saw the task being created on my ECS cluster.
    Jenny

    1 year ago
    Let me check this one with the team for you Emma.
    Hi Emma - do you know if your ECS tasks can talk to your server API?
    Ben Muller

    1 year ago
    Hi @Emma Rizzi, I made a PoC for this exact stack (even following those exact tutorials!). I just made the repo public here. It is super rough; I haven't filled out the README or anything. It is built with the AWS CDK, so if you know your way around that you can deploy it with
    cdk deploy
    and once you register all the other things from the tutorial you should have a stack that is working.
    btw, for reference, here's a flow I registered that works too:
    import prefect
    from prefect.storage import S3, Docker
    from prefect.run_configs import ECSRun
    from prefect import task, Flow, Parameter
    
    
    RUN_CONFIG = ECSRun(
        run_task_kwargs={
            "cluster": "prefect-agent-cluster",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "assignPublicIp": "ENABLED",
                "subnets": ["subnet-0a8725c8a7e43f884", "subnet-0dc09bddb5e1274e0"],
                "securityGroups": []
            }
        }
        },
        labels=['s3-flow-storage'],
        task_role_arn="arn:aws:iam::***:role/ECSTaskS3ECRRole",
        execution_role_arn='arn:aws:iam::****:role/ECSFullyLoaded',
        # image='prefecthq/prefect:latest-python3.8',
        # memory=512,
        # cpu=256
        )
    
    # STORAGE = Docker(
    #     registry_url='****.dkr.ecr.ap-southeast-2.amazonaws.com/',
    #     image_name='prefect-tutorial',
    #     image_tag='latest',
    #     dockerfile='./Dockerfile'
    #     )
    
    STORAGE = S3(
        bucket="prefect-tutorial",
        stored_as_script=True,
        local_script_path='./ecs_flow.py'
        )
    
    
    @task
    def say_something(words):
        logger = prefect.context.get("logger")
        logger.info(f'I printed {words}')
    
    
    
    with Flow(
        "fargate-task-please", 
        storage=STORAGE,
        run_config=RUN_CONFIG
        ) as flow:
        
        message = Parameter('message', default='default')
    
        say_something(words=message)
    
    flow.register(project_name="aws-flow")
    Emma Rizzi

    1 year ago
    @Jenny I allowed all inbound and outbound connections on the instance hosting Prefect Server (not very secure, but just to test 🤫), so I guess the communication is OK, although I will double-check that the tasks are allowed to open connections. I'll look into it next week. Thanks a lot @Ben Muller! I'm almost sure the issue is caused by an incorrect configuration somewhere, but I still need to find where. Thanks for your responses!
    Hi again, I think I found the culprit, but I'm not 100% sure it's the only issue. I used the
    prefecthq/prefect:latest
    image for my flow, and I see that
    PREFECT__CLOUD__API
    is set to localhost in the environment variables in the AWS console. As I deployed Prefect Server on a different instance than the one running the tasks, I guess that breaks the communication. I tried to set it in the env parameter of ECSRun, but it seems to get overridden to localhost afterwards. Do you know if I have to build a custom image with the correct Prefect address, or do I have to change it somewhere else?
    Update! The issue is solved. I had to add
    [cloud]
    api = "http://XXX:4200"
    to the config file. I had to look at the source code to find where and how
    PREFECT__CLOUD__API
    is overridden. I think it's worth adding this to the docs, as it's required to run Prefect Server with agents other than Local, and it doesn't seem obvious to look at the cloud config when running the server 🙂
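    For reference, Prefect maps nested config keys to environment variables using double underscores, so the same override can be applied without editing the config file — an equivalent sketch of the fix above (XXX stands for the server address, as in the thread):
    ```shell
    # Equivalent to setting
    #   [cloud]
    #   api = "http://XXX:4200"
    # in ~/.prefect/config.toml: Prefect reads PREFECT__<SECTION>__<KEY>
    # environment variables, nested via double underscores.
    export PREFECT__CLOUD__API="http://XXX:4200"
    ```
    When the flow runs in ECS, this variable has to reach the task container (e.g. via the run config's env), since the default image points the API at localhost.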
    Jenny

    1 year ago
    Hi @Emma Rizzi - apologies for not replying sooner. This got buried in the thread. I'm glad you got it working but I'm surprised you had to change a cloud setting for a Prefect server deployment. Are you sure the backend where you run your agent is set to server?
    Emma Rizzi

    1 year ago
    No problem! I ran
    prefect backend server
    so I guess it is set properly. I find it very strange too, but I found this line that suggested using cloud.api to configure the backend address.
    I have another issue with the storage now, as it's getting hard to configure the TLS config for Docker storage since our registry isn't properly administered yet (another part of my team is handling this). A few questions:
    • Is there a way to use Docker storage without pushing the image to a repository?
    • Can't we authenticate to the Docker repository with a login/password instead of a TLS config?
    • Is there a way to specify the files to upload when using S3/GCS storage? I first tried these options but got many file-not-found errors, because only the flow file is uploaded. I switched to Docker storage because I saw it has the 'files' option.
    Jenny

    1 year ago
    Hi @Emma Rizzi - could you post your new question into the Server channel so that the rest of the team can see and respond to it?
    Emma Rizzi

    1 year ago
    I posted it this morning, and I'm updating it with the new questions from today's investigations. Thanks for your help 🙂
    Jenny

    1 year ago
    Thanks Emma. Checked with the team and that's expected behavior, but I agree we should update the docs. @Marvin open “Update ECS agent docs for setting custom backend”
    Marvin

    1 year ago
    Emma Rizzi

    1 year ago
    Thank you! I hope to contribute when I get it all working 🙂