<@ULVA73B9P> I have a prefect server running on a ...
# ask-marvin
c
@Marvin I have a prefect server running on a separate machine connected to my local network. I've already changed `PREFECT_SERVER_API_HOST` on the separate machine to its LAN address so that I can connect to the server through its UI on my computer. Now what other settings do I need to change for the prefect server on my separate machine, and the prefect client on my computer, so that when I run `prefect --no-prompt deploy --all` client side it will take the deployment code from my computer referenced in the yaml, send it to the work pool on my separate machine which is referenced as a subprocess on the same machine as the prefect server, and instantiate the deployment?
m
To set up your Prefect client and server for deploying flows to a work pool on your separate machine, you will need to follow these general steps:
1. Configure the Prefect server: Ensure that the server is set up with the correct `PREFECT_SERVER_API_HOST` to be accessible over the network.
2. Prepare your client-side Prefect environment: Make sure to set the `PREFECT_SERVER_API_HOST` on your client to point to the LAN address of the Prefect server on the separate machine.
3. Create a work pool on the server: You can create a work pool through the UI or using the Prefect CLI. Ensure the work pool is associated with the correct infrastructure where you intend to run your deployments. (See the Work pools documentation.)
4. Use Prefect deployments: Prefect uses the concept of deployments, which combine your flow's code with a set of configuration settings. Deployments are defined in YAML and can be created through the Prefect CLI.
5. Build your deployment: Before deploying, build the deployment definition file `deployment.yaml` by using the Prefect CLI's `prefect deployment build` command from the folder containing your flow script and any dependencies of the script.
6. Deploy using the CLI: After building the deployment definition file, you can create the deployment on the API using `prefect deployment apply deployment.yaml`. Define the `work_pool_name` in your deployment YAML or on the command line to link to the work pool on your separate machine. (See the Deployments guide.)
7. Start a worker to process flow runs: Start a worker on your separate machine that subscribes to the work pool. To do this, you can run `prefect worker start -p POOL_NAME` on your separate machine, where `POOL_NAME` is the name of the work pool you have created. Workers are lightweight polling services that pick up flow runs from the work pool.
8. Create your deployments: Finally, you can register the deployments with `prefect --no-prompt deploy --all`, provided the deployment specification correctly references your subprocess-based work pool.
Here is a basic example of what deployment code may look like:
```python
import time
from prefect import flow, task
from pydantic import BaseModel, Field

class Data(BaseModel):
    key: str
    value: dict = Field(default_factory=dict)

@task
def fetch_data_for_key(key: str) -> Data:
    """A task that simulates fetching data from some source"""
    return Data(key=key, value={"data": "some data"})

@task
def process_data(data: Data) -> Data:
    """A task that simulates processing data"""
    print(f"Processing {data!r} (I swear)")
    time.sleep(3)
    return data

@flow
def work_i_need_to_do_sometimes(keys: list[str]):
    """Work you might need to do every so often"""
    raw_data_futures = fetch_data_for_key.map(keys)
    process_data.map(raw_data_futures)

if __name__ == "__main__":
    work_i_need_to_do_sometimes.serve(
        name="sporadic-deployment",
        parameters={"keys": ["foo", "bar", "baz"]},
    )
```
View Python Source Example. For additional details, you may also want to refer to the block-based deployments guide: Block-Based Deployments.
c
@Marvin These are the other settings that specify an address besides PREFECT_SERVER_API_HOST, which didn't seem to work for the prefect client on my local machine:
PREFECT_API_URL='None' (from defaults)
PREFECT_RUNNER_SERVER_HOST='localhost' (from defaults)
PREFECT_UI_API_URL='None' (from defaults)
PREFECT_UI_URL='None' (from defaults)
PREFECT_WORKER_WEBSERVER_HOST='0.0.0.0' (from defaults)
Which of these should I change on my own computer, and on the separate machine, so that prefect on my computer is fully integrated with the prefect server and environment on my separate machine (it has LAN 10.88.13.29)?
m
To integrate your local Prefect client with the Prefect server and environment on your separate machine with LAN address 10.88.13.29, you'll need to adjust the following environment variable on your local machine:
1. PREFECT_API_URL: This should be set to point to the API of your Prefect Server instance, including the API port (4200 by default). For example:
```shell
export PREFECT_API_URL='http://10.88.13.29:4200/api'
```
(See the Settings Guide.)
The other variables you mentioned are predominantly for server and UI configuration purposes, and typically do not need to be altered on your client machine for it to integrate with the server. However, you should verify the following on your server machine:
1. PREFECT_RUNNER_SERVER_HOST: Default is 'localhost', which is fine if your flows will run on the same machine as the server. No need to change this unless you plan to have runners elsewhere.
2. PREFECT_UI_API_URL: If the UI is being served on a different machine or port than the API, you would set this to point to the API URL.
3. PREFECT_UI_URL: Set this to the externally accessible URL of the server's UI if it's different from defaults.
4. PREFECT_WORKER_WEBSERVER_HOST: The default '0.0.0.0' allows the worker's web server to be accessible on any network interface, which should be fine for your setup.
Ensure that you've also configured your network to allow connections between your local machine and the separate machine on the required ports, typically TCP port 4200 for the API by default. If you're utilizing Docker, Kubernetes or other container orchestration tools, make sure that the network and port configurations align with these settings.
After setting these environment variables, restart any Prefect processes so that they recognize the changes. This should set up your local client to deploy to, and interact with, the Prefect server and its environment on your separate machine.
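For anyone following along, here is a minimal Python sketch of how the client-side value could be put together and sanity-checked before running any `prefect` command. It assumes the server listens on the default API port 4200 mentioned above and that it answers on a `/health` route under the API URL; the helper names `build_api_url` and `api_is_reachable` are made up for this illustration and are not part of Prefect.

```python
from urllib.error import URLError
from urllib.request import urlopen


def build_api_url(host: str, port: int = 4200) -> str:
    """Return a value for PREFECT_API_URL for a server reachable at `host`.

    Assumption: the server uses plain HTTP and serves its API under /api.
    """
    return f"http://{host}:{port}/api"


def api_is_reachable(api_url: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers HTTP 200 on <api_url>/health.

    Assumption: the server exposes a health route at that path.
    Any connection problem (refused, timed out, no route) counts as False.
    """
    try:
        with urlopen(f"{api_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        return False
```

Calling `build_api_url("10.88.13.29")` gives the string to export as `PREFECT_API_URL` on the client, and `api_is_reachable(...)` on that string is a quick way to tell a firewall or bind-address problem apart from a Prefect configuration problem.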
c
@Marvin So that means I can't locally reference my flow code on the client, and have it deploy on the remote server? I'll either have to move the flow code to the server first and reference it locally there or specify a remote location in the yaml?
m
It appears that there might be some confusion around how flow code is managed in Prefect deployments. Essentially, for a flow to be executed in a remote environment, the flow code needs to be accessible to the execution environment. You have a couple of options for how to handle this:
1. Local Storage: This is usually for experimentation with a local subprocess work pool. For local development, you could deploy a flow from the directory containing your flow code using `prefect deploy`. However, this approach is mainly for deploying to the same environment where the code resides and is not suitable for remote deployments.
2. Remote Storage: For a flow to be executed remotely, the flow code should be stored in a place that's accessible by the remote infrastructure. This could mean pushing your code to a remote git repository, a cloud storage bucket (such as S3, GCS, Azure Blob Storage), or a Docker image registry if you're containerizing your flows.
The flow storage concept in Prefect is separate from the Prefect server or Prefect Cloud service, so the execution environment must have access to the flow code by some means. For remote deployments, it typically means referencing a remote code location within your deployment YAML file or specifying how the code should be retrieved when you create the deployment.
For example, if your flow code is in a public GitHub repository, you might specify the repository URL and path to your code within the `prefect.yaml` file. If your code is in a private repository, you'd also include credentials that allow the remote execution environment to fetch the code.
Ultimately, the actual code for a flow isn't stored within Prefect's database or passed around by the Prefect server. It is the responsibility of the user to ensure that the execution environment can access the flow code, which may involve setting up additional storage mechanisms or infrastructure configurations. For more details on where to store your flow code, please refer to the Storage Options Guide.
c
@Marvin Ok then what about push and pull in the prefect.yaml file? Is that only for handling remote storage from the local machine? Can I specify a push to the remote location on my separate machine from my computer, and then for the pull reference the local location on the separate machine after the push moved the project there?
m
The `push` and `pull` parts of the `prefect.yaml` file are indeed related to handling the flow code’s storage and retrieval when dealing with remote environments.
1. The push step: This is used in scenarios where your code isn’t stored persistently or is outside version control. The push step will generally involve specifying how and where to push your flow code, such as building a Docker image containing your flow and pushing it to a Docker registry, or uploading your flow code to a cloud storage service like AWS S3, GCS, or Azure Blob Storage.
2. The pull step: This would be present in your deployment configuration, telling Prefect how to retrieve the flow code when it's time to execute the flow. The pull step is used to define the method for getting the flow code into the execution environment, whether that’s pulling a Docker image or fetching the code from cloud storage.
Regarding your specific case, you can indeed specify a push in your `prefect.yaml` to move your code to a remote location from your local machine. After that, for the deployment on the separate remote machine, you would specify a pull referencing the local location where the push operation previously transferred the flow code. Here's an example scenario:
- Push: You define a push step in your `prefect.yaml` to upload your flow code to cloud storage or a Docker registry from your local machine.
- Pull: In the deployment configuration on the remote machine, you reference the cloud storage path or Docker image in the pull step so that when a flow run is about to start, the execution environment knows where to fetch the necessary flow code from.
It’s essential, however, to understand that Prefect does not automatically push code changes to git or any other platform; any code changes must be pushed manually or via a CI/CD pipeline. Additionally, if you are using CI/CD to build Docker images outside of Prefect, you might not need explicit push or pull actions in the deployment configuration, as the image can be referenced directly by the deployment.
On a practical note, the `prefect.yaml` sequence of push/pull actions must align with your chosen infrastructure and the way your code is managed (i.e., via git, cloud storage, Docker images, etc.). For examples on using the push and pull steps in `prefect.yaml` and further explanation of storage options, visit the Storage Options Guide and Prefect Deployment Guide.
n
on the client side you shouldn't need to set the PREFECT_SERVER_API_HOST, only PREFECT_API_URL
c
If I set PREFECT_API_URL on the client side would that let me run code that is on the client machine without a remote reference or would I still need to move the code over to the server? Right now I just push the code to a git repo on my server and run it there from the UI.
n
the prefect server shouldn't need the source code. when you run `prefect deploy` you're storing a reference to where the source code lives in the Deployment (stored on the server), so that when a flow run from that Deployment is scheduled later, you know what the entrypoint of that flow is relative to whatever your `pull` step clones down
c
My work pool process is also on the server, so does that mean the pull will have to reference a path that links remote access to my work computer, or I'd have to move my code to the server so the pull can reference a local file storage location?
Maybe this is obvious but just making sure
ok yeah so the deployment would be on my server, could I keep it on my computer and move it over only to run on the server?
n
sorry, when you say server what do you mean? generally I mean the process that happens when you run `prefect server start`
c
Oops yeah, I have a separate machine which will run my server and run my worker processes.
And I code on a work computer.
n
👍 that makes sense. where do you generally keep your source code?
ahh i see, git repo on your server
c
So I'm wondering if I don't need to make a repo on my separate machine, and just send the deployment from my client computer to the separate machine to run
n
do you happen to have a remote repo? e.g. private github repo. if you did, you could push your code, create the Deployment on your prefect server in CI, and then have a `git_clone` pull step (which is a super common pattern)
also sorry, gotta run! will check back in a bit
c
I see, yeah I don't and I'd probably want to keep it private. But I understand now that the prefect.yaml has to reference the source, remote or not; I can't instantiate the environment/deployment first and then send it through the client to the prefect server. thx
n
it sounds like the simplest thing to do in your case is:
• every deployment defines its `entrypoint` in prefect.yaml relative to your git repo on your machine
• you set a global (top of file, not on each deployment) `set_working_directory` `pull` step in your prefect.yaml so that (for all deployments) when your worker picks up a flow run, it first sets the working directory to your git repo, so the working directory + that deployment’s entrypoint points at the right file
@Marvin do u have a set_working_directory pull step example
m
I'm afraid the code provided does not include an example of the `set_working_directory` pull step. It seems that there might have been a misunderstanding or lack of available specific code examples. However, I can help you create a hypothetical example based on the Prefect 2.0 concepts. Here's how you might typically use a `set_working_directory` pull step in your `prefect.yaml`:
```yaml
version: 1

# Global `pull` step for all deployments to set the working directory
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /path/to/your/git/repo

deployments:
  - name: example-deployment
    # entrypoint is "<file path relative to the working directory>:<flow function name>";
    # the function name `example_flow` is a placeholder
    entrypoint: flows/example_flow.py:example_flow
    # ... (other deployment settings)
```
Please note that the specific step type and parameters for the `set_working_directory` action depend on the actual implementation in your Prefect 2.x version, so check the step types Prefect provides for changing the working directory. The above snippet is a conceptual guide and should be adapted based on the actual documentation and capabilities of your Prefect version. Since the concept of `set_working_directory` is not directly documented with an example, I would suggest referring to the Prefect documentation or community forums for more concrete usage patterns, or consulting directly with Prefect support, where you might get the latest and most accurate guidance.
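To make the "working directory + entrypoint" idea from earlier in the thread concrete, here is a rough Python sketch of what happens conceptually when a worker starts a flow run: it changes into the configured directory, then resolves the entrypoint relative to it. This is not Prefect's actual loader; `load_entrypoint` is a hypothetical helper and the `path/to/file.py:function_name` form is the only entrypoint shape it handles.

```python
import importlib.util
import os
from pathlib import Path
from typing import Callable


def load_entrypoint(working_directory: str, entrypoint: str) -> Callable:
    """Change into `working_directory`, then load `path/to/file.py:function_name`.

    Hypothetical stand-in for a pull step followed by loading the flow.
    """
    relative_path, _, function_name = entrypoint.rpartition(":")
    if not relative_path or not function_name:
        raise ValueError(
            f"expected 'path/to/file.py:function_name', got {entrypoint!r}"
        )

    # What the pull step achieves in this sketch: the repo becomes the cwd.
    os.chdir(working_directory)

    # The entrypoint's file path is now resolved relative to that directory.
    source_file = Path(relative_path).resolve()
    spec = importlib.util.spec_from_file_location(source_file.stem, source_file)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {source_file}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)
```

The takeaway is that the entrypoint only has to be correct relative to the directory on the machine where the worker runs, which is why the code has to exist there (or be fetched there) before the flow run starts.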
c
Gotcha
This is how I had it, prefect-local.yaml is the old one for my desktop