# prefect-community
l
Hi Prefect Community! I think I saw, in one of my deep dives, an example of a ResourceManager that manages an EMR cluster to submit Spark jobs. I have been searching for a good 2 hours now and can only ever find examples of managing Dask clusters via a resource manager. Does anyone have a pointer? Very much appreciated. 🙂
j
Hi @Luis Muniz, I don’t have much knowledge of EMR, but from a Prefect resource manager perspective I could imagine it would look something like this:
from prefect import resource_manager

@resource_manager(name="spark")
class SparkCluster:
    def __init__(self, ...):
        self.... = ...

    def setup(self):
        return SparkClient(self...)

    def cleanup(self, client):
        client.close()
And that SparkClient would be whichever library you use for managing EMR connections.
b

https://youtu.be/LT1qe8cCJEA

Is this the video you are thinking of?
l
YES!
Thanks @Ben Davison and @josh!
I guess I'll have to come up with the EMR provisioning stuff on my own, but this is a good starting point.
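For the EMR provisioning part, here is a minimal sketch of how the setup/cleanup pair from the snippet above might map onto EMR. It is not taken from the video or from Prefect's docs. The class name EMRCluster and its constructor arguments are hypothetical; the only real API calls assumed are boto3's EMR client methods run_job_flow and terminate_job_flows. The client is passed in here mainly to keep the sketch easy to test; in a real flow you might build it inside setup instead.

```python
class EMRCluster:
    """Sketch of a setup/cleanup lifecycle for a transient EMR cluster.

    Hypothetical class: in Prefect 1.x it would be decorated with
    @resource_manager (from prefect import resource_manager).
    """

    def __init__(self, emr_client, cluster_config):
        # emr_client: assumed to behave like boto3.client("emr")
        # cluster_config: dict of keyword arguments for run_job_flow
        self.emr_client = emr_client
        self.cluster_config = cluster_config

    def setup(self):
        # Start the cluster and hand its ID to the tasks that need it
        response = self.emr_client.run_job_flow(**self.cluster_config)
        return response["JobFlowId"]

    def cleanup(self, cluster_id):
        # Receives whatever setup() returned; tear the cluster down
        self.emr_client.terminate_job_flows(JobFlowIds=[cluster_id])
```

Tasks running inside the resource manager's context would then receive the cluster ID and could submit Spark steps against it, and cleanup would terminate the cluster once they finish.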