Luis Muniz

09/02/2020, 1:40 PM
Hi Prefect Community! I think I saw in one of my deep dives an example of a ResourceManager that manages an EMR cluster to submit spark jobs. I have been searching for a good 2 hours now and can always only find managing Dask clusters via a resource manager. Does anyone have a pointer? Very much appreciated. 🙂
josh

09/02/2020, 2:20 PM
Hi @Luis Muniz I don’t have much knowledge of EMR, but from a Prefect resource manager perspective I could imagine it would look something like this:
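from prefect import resource_manager

@resource_manager(name="spark")
class SparkCluster:
    def __init__(self, ...):
        # store whatever settings are needed to create the resource
        self.... = ...

    def setup(self):
        # create the resource; the return value is passed to cleanup
        return SparkClient(self...)

    def cleanup(self, client):
        # release the resource that setup returned
        client.close()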
And that SparkClient would be whichever library you use for managing EMR connections
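A slightly fuller sketch of the same setup/cleanup pattern for EMR, in case it helps. It assumes the client is boto3's EMR client (`boto3.client("emr")`), whose `run_job_flow` and `terminate_job_flows` calls start and stop a cluster. The class name `EMRCluster` and the `cluster_config` parameter are made up for illustration, and the Prefect wrapping is left as a comment so the class can be tried without Prefect or AWS credentials:

```python
from typing import Any, Dict


class EMRCluster:
    """Resource-manager-shaped class: __init__ stores config, setup creates, cleanup destroys."""

    def __init__(self, emr_client: Any, cluster_config: Dict[str, Any]):
        # Assumption: emr_client is boto3.client("emr") or any object with the same two methods.
        self.emr_client = emr_client
        # Assumption: cluster_config holds the keyword arguments for run_job_flow
        # (Name, ReleaseLabel, Instances, ...); the real values depend on your AWS setup.
        self.cluster_config = cluster_config

    def setup(self) -> str:
        # Start the cluster and return its id; this return value is what cleanup receives.
        response = self.emr_client.run_job_flow(**self.cluster_config)
        return response["JobFlowId"]

    def cleanup(self, cluster_id: str) -> None:
        # Terminate the cluster that setup started.
        self.emr_client.terminate_job_flows(JobFlowIds=[cluster_id])


# With Prefect installed, the class can be wrapped the same way as in the snippet above:
#   from prefect import resource_manager
#   EMRClusterManager = resource_manager(name="emr")(EMRCluster)
```

Passing the client in, rather than creating it inside `__init__`, is just a choice to make the class easy to exercise with a stand-in client; waiting for the cluster to become ready and submitting Spark steps are left out here.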
Ben Davison

09/02/2020, 2:27 PM

https://youtu.be/LT1qe8cCJEA
Is this the video you are thinking of?
:upvote: 1
Luis Muniz

09/02/2020, 3:03 PM
YES!
thanks @Ben Davison and @josh
I guess I'll have to come up with the EMR provisioning stuff on my own, but this is a good starting point.