Hi, this is kind of difficult to explain. I have a flow that runs a job on an EC2 instance. Most of the time it succeeds, but sometimes it fails. The job performs the following operations:
1- Download a jar from somewhere.
2- Download some script from github.
3- Execute the script with some arguments. The script downloads data from AWS S3, processes the downloaded files, and then uploads the results.
The errors I usually get look like this:
• Main class from jar could not be found
• The output directory does not exist, so it could not be synchronized to AWS S3
Do you have any idea why these kinds of issues happen from time to time?
BTW: I am using boto3 and the SSM agent to handle EC2 instance creation, execution, and termination.
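For illustration, here is a minimal sketch of how the three steps could be sent as one fail-fast shell script through SSM, so that a partial jar download or a missing output directory stops the job at the step that caused it instead of surfacing later. It assumes the `AWS-RunShellScript` document and a boto3 SSM client; `build_commands`, `run_and_wait`, the file names `job.jar` and `job.sh`, and all URLs and paths are hypothetical placeholders, not the actual setup from this thread.

```python
import shlex


def build_commands(jar_url, script_url, script_args, out_dir, s3_dest):
    """Build the shell lines for a single SSM invocation (hypothetical layout)."""
    q = shlex.quote
    return [
        "set -eu",                               # abort on the first failing step
        f"curl -fsSL -o job.jar {q(jar_url)}",   # -f turns HTTP errors into a non-zero exit
        "test -s job.jar",                       # fail early if the jar is empty
        f"curl -fsSL -o job.sh {q(script_url)}",
        f"mkdir -p {q(out_dir)}",                # make sure the output directory exists
        f"bash job.sh {shlex.join(script_args)}",
        f"aws s3 sync {q(out_dir)} {q(s3_dest)}",
    ]


def run_and_wait(ssm, instance_id, commands):
    """Send the commands and block until the invocation finishes.

    `ssm` is expected to be a boto3 SSM client, e.g. boto3.client("ssm").
    The waiter raises botocore's WaiterError if the command ends in a
    failed, cancelled, or timed-out state.
    """
    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": commands},
    )
    command_id = response["Command"]["CommandId"]
    ssm.get_waiter("command_executed").wait(
        CommandId=command_id, InstanceId=instance_id
    )
    return ssm.get_command_invocation(
        CommandId=command_id, InstanceId=instance_id
    )
```

Whether this removes the intermittent failures depends on what actually causes them. The sketch only makes each step's failure visible at the point where it happens, which should narrow down whether the download, the script, or the sync is at fault.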
Anna Geller
11/25/2022, 8:03 PM
why would you create and shut down instances? we have an ECSTask infra block that makes it much easier - if you want to give it a try, check https://github.com/anna-geller/dataflow-ops
Anna Geller
11/25/2022, 8:03 PM
you can either use GitHub or S3 block for code storage, both will work well with ECS
Anna Geller
11/25/2022, 8:04 PM
I deleted the same Q you asked in prefect-cloud
please don't cross-post, someone will get back to you as soon as we can