Mike Vanbuskirk

    4 months ago
    Had a question about Prefect (1.0 in this case): it isn’t super clear from reading the documentation where different aspects of the system “live”. It seems like, assuming Prefect Cloud usage, you have the cloud and then some agents. However, the documentation jumps into showing CLI-esque commands and doesn’t provide much clarity around where those are run.
    Kevin Kho

    4 months ago
    Prefect Cloud does not host any compute, so it just schedules and monitors work that happens on your infrastructure. So if you use the local agent, Prefect Cloud will trigger a flow run, the agent will see it, and then execute it on the same machine.
    Mike Vanbuskirk

    4 months ago
    Sorry, got pulled into a meeting before I could respond, ty @Kevin Kho! So when it references commands like `$ prefect foo`, where does it assume that is being run from?
    `$ prefect backend cloud`: what are you defining a backend “for”?
    Kevin Kho

    4 months ago
    Prefect Cloud is our hosted version of Prefect. Prefect Server is if you host Prefect (the scheduler, api, database, etc) on your own infrastructure. The default is Cloud. Cloud has 20k free task runs every month which is more than enough to get started with
    Mike Vanbuskirk

    4 months ago
    where are you running that command from though?
    $ prefect backend cloud
    Kevin Kho

    4 months ago
    From your local terminal. It tells Prefect on that machine to use the Cloud API, but you shouldn’t even need to run it because Cloud is the default.
    Mike Vanbuskirk

    4 months ago
    ok so even with cloud provisioning, agents, etc… there is always a notion of needing a CLI interaction to instantiate some basic functionality?
    Kevin Kho

    4 months ago
    The question is a bit hard to answer because you can use the CLI, but you can also do it without. I think what would be most helpful is to go through this section of the tutorial, where you register and run a Flow against Prefect Cloud. You just need to reach the “Deploy a Flow” part.
    If that was the part that confused you, yes, just run `prefect backend cloud`, but it’s really the default so there’s no need, and then you can proceed. I think the only CLI commands that are completely unavoidable are `prefect auth login --key API_KEY` to authenticate and `prefect agent local start` to start the agent.
    But once those are done, you can register and run without using the CLI. Why is the focus of the question whether the CLI is needed? Do you have some kind of environment without access to a CLI, like Jupyter?
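For anyone scripting this, here is a minimal sketch of the two unavoidable commands (`prefect auth login --key API_KEY` and `prefect agent local start`) expressed as argument lists that a start-up script could hand to a process runner. The helper name `bootstrap_commands` is hypothetical and not part of Prefect; only the two command lines come from the discussion.

```python
from typing import List


def bootstrap_commands(api_key: str, agent_type: str = "local") -> List[List[str]]:
    """Return argv lists for the two CLI steps: authenticate, then start an agent.

    `bootstrap_commands` is a hypothetical helper for this sketch, not a Prefect API.
    """
    if not api_key:
        raise ValueError("an API key is required to authenticate against Prefect Cloud")
    return [
        # Step 1: authenticate this machine against Prefect Cloud.
        ["prefect", "auth", "login", "--key", api_key],
        # Step 2: start the agent. This process keeps running and polling,
        # so a start-up script would normally launch it last.
        ["prefect", "agent", agent_type, "start"],
    ]
```

A provisioning script could pass each list to `subprocess.run(argv, check=True)`. Building the lists separately keeps the key out of shell strings and makes the sequence easy to check without Prefect installed.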
    Mike Vanbuskirk

    4 months ago
    We’ve got a data team testing out Prefect for some analytical work, and I’m (a DevOps engineer) trying to make some sense of the documentation about how this might be automated, put in Infrastructure-as-Code, etc.
    I mean, we can’t be the only ones who want to deploy meaningful infra using TF or similar right?
    Kevin Kho

    4 months ago
    Actually, we have a TF recipe for the agent on AWS. Would that help?
    Mike Vanbuskirk

    4 months ago
    lol that’s the part I need the least help with 😆
    it just isn’t very clear from the docs about how much the CLI is actually needed
    or interactive (user) interaction at all
    like starting an agent… is that only done once? within the context of flows?
    we’d prefer as with most infrastructure we use that all provisioning, configuration, start-up etc… is handled by automation
    this all seems like very much an afterthought in the documentation
    Kevin Kho

    4 months ago
    Ah ok. When you outline it that way, your responsibility is likely spinning up the agent and infrastructure, and the data team’s responsibility is writing the Flow and its logic. Once they register a Flow, Prefect Cloud schedules it, and the running agents pick up the flow runs and execute them. So the workflow is something like this: the data team writes a flow and tests it with `flow.run()`. When ready, they register the Flow against Cloud. To do so, they need to be authenticated (`prefect auth login`). The registration can use the Python `flow.register()` call or the CLI with `prefect register …`. It’s up to them at that point, and the outcome is the same for the most part. On the DevOps side, you need to provision the agents to run these on. Let me show you the ECS Agent template. It authenticates using an environment variable for the API token, and the `containerDefinitions` takes a command to start an agent. So spinning up the agent can be done in one command. Does that help?
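To make the ECS Agent template description concrete, here is a rough sketch of what one entry in `containerDefinitions` might look like, built as a plain Python dict. It is not the actual template. The function name, the container name, and the image value are placeholders for illustration, and the variable name `PREFECT__CLOUD__API_KEY` is an assumption based on Prefect 1.0's double-underscore config convention, so check it against the template itself.

```python
import json
from typing import Any, Dict


def agent_container_definition(api_key: str, image: str) -> Dict[str, Any]:
    """Build one ECS container definition that starts a Prefect ECS agent.

    Hypothetical helper for this sketch; field names follow the ECS
    container definition format (name, image, essential, command, environment).
    """
    return {
        "name": "prefect-agent",  # placeholder container name
        "image": image,           # an image with Prefect 1.x installed
        "essential": True,
        # The single command that starts the agent.
        "command": ["prefect", "agent", "ecs", "start"],
        # Assumed variable name; the agent authenticates with this key.
        "environment": [{"name": "PREFECT__CLOUD__API_KEY", "value": api_key}],
    }


def task_definition_json(api_key: str, image: str) -> str:
    """Serialize a minimal task definition fragment for inspection or templating."""
    fragment = {"containerDefinitions": [agent_container_definition(api_key, image)]}
    return json.dumps(fragment, indent=2)
```

In a real deployment the key would more likely come from a secrets store than a plain environment value; the plain value is used here only to mirror the description above.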
    Mike Vanbuskirk

    4 months ago
    ok, so the creation of the ECS task via AWS API call functions as the `prefect agent start`, correct?
    Kevin Kho

    4 months ago
    ECS Task meaning to contain the Flow or the agent?
    Mike Vanbuskirk

    4 months ago
    agent
    starting an ECS task with that template will instantiate the agent and register it with the Prefect cloud correct?
    Kevin Kho

    4 months ago
    Yes, exactly. You would run `aws ecs create-service`, which makes an ECS Service with `prefect agent ecs start` as the command of the container, which creates a container for the agent.
    Mike Vanbuskirk

    4 months ago
    perfect!
    Kevin Kho

    4 months ago
    In case it helps you, Terraform templates are here
    Which basically just runs the agent start command and passes the API KEY like this (not that I know Terraform)
    The agent then pings Prefect Cloud every 10 seconds. If it finds a flow to run, it loads the metadata needed and then executes it. So the execution depends on agent type:
    • Local Agent - run Flow on same machine
    • Docker Agent - run Flow in container on same machine
    • Kubernetes Agent - run Flow in k8s pod
    • ECS Agent - run Flow as ECS Task
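The poll-then-execute behavior described above can be pictured with a toy model. This is not Prefect's agent code: the in-memory queue stands in for Prefect Cloud, and `run_agent` and `EXECUTION_TARGET` are hypothetical names. It only illustrates that one long-running loop checks for work on an interval and that where the flow runs depends on the agent type.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List

# Where each agent type runs a flow, per the list above.
EXECUTION_TARGET: Dict[str, str] = {
    "local": "same machine",
    "docker": "container on same machine",
    "kubernetes": "k8s pod",
    "ecs": "ECS task",
}


def run_agent(
    agent_type: str,
    queue: Deque[str],
    max_polls: int,
    interval_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll a stand-in queue `max_polls` times and record where each run would execute."""
    target = EXECUTION_TARGET[agent_type]
    launched: List[str] = []
    for _ in range(max_polls):
        # Stand-in for asking Prefect Cloud whether a flow run is waiting.
        if queue:
            flow_run = queue.popleft()
            launched.append(f"{flow_run} -> {target}")
        # Wait before the next poll (10 seconds in the description above).
        sleep(interval_seconds)
    return launched
```

For a quick dry run, pass `sleep=lambda s: None` to skip the waiting, for example `run_agent("ecs", deque(["run-1"]), max_polls=2, sleep=lambda s: None)`. A real agent loops indefinitely, which is why it is started once and left running.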
    Mike Vanbuskirk

    4 months ago
    ohhhk
    that makes more sense, ty
    Kevin Kho

    4 months ago
    Of course!
    Mike Vanbuskirk

    4 months ago
    I could see there being more homegrown automation around CI/CD if you wanted to handle flows and flow code totally hands off
    that would probably require provisioning a separate compute workload of some kind to handle the “CLI” part of it
    but that’s beyond the scope of what we just discussed I think
    Kevin Kho

    4 months ago
    So as an example, if it’s the registration portion, that would be covered in CI/CD, like GitHub Actions, where you can run CLI commands on events (a merge to master, for example). And then you can run `prefect register`. See this for an idea.
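As a rough illustration of the CI/CD registration step, here is a minimal sketch of a script a pipeline job could call after a merge. The helper names, the project name, and the `flows/` path are hypothetical. The `--project` and `-p` options reflect my understanding of the Prefect 1.0 `prefect register` command and should be checked against `prefect register --help` for the installed version.

```python
import subprocess
from typing import Callable, List


def register_command(project: str, path: str) -> List[str]:
    """Build the argv for registering every flow found under `path` into `project`."""
    return ["prefect", "register", "--project", project, "-p", path]


def register_flows(
    project: str,
    path: str,
    runner: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Run the registration command and fail the CI job if it exits non-zero.

    `runner` is injectable so the function can be exercised without Prefect installed.
    """
    argv = register_command(project, path)
    runner(argv, check=True)
    return argv
```

The CI job would still need to authenticate first (`prefect auth login`) or supply the API key through its environment, since registration goes to Prefect Cloud.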
    Mike Vanbuskirk

    4 months ago
    🎉
    Kevin Kho

    4 months ago
    For automated provisioning of infra…you probably know more than me but one Kubernetes agent is enough to just submit jobs to a cluster and then rely on auto-scaling. So a lot of users are on k8s
    Mike Vanbuskirk

    4 months ago
    thank you, this is awesome!