Regarding the architecture,
this may help
I think Prefect 2.0 is flexible enough that all of the options you mentioned could work, depending on how you approach them.
We don't have any clear recommendations yet, but I understand why this would be valuable, and we will work on it in the near future.
Running your Prefect 2.0 agent on GKE is quite common, since it allows for a lot of flexibility.
But if you want a recommendation from a deep learning practitioner's perspective, I would go with a single but powerful VM with enough GPU and memory resources. It is simple, and you can easily SSH into the instance and inspect the results and files generated by your model training process. Kubernetes is great, but in all honesty it brings complexity that you, as a data scientist, shouldn't have to worry about.