Tsvetelin
10/21/2024, 10:34 AM
Hi all, thank you for being so generous with your time and work. I'm migrating from Prefect 2 to Prefect 3 and I need to switch to the new deployment mechanism. Unfortunately, it doesn't work as I expect. The old deployment was:
```python
Deployment.build_from_flow(
    name="...",
    flow=my_flow,
    parameters={..},
    storage=RemoteFileSystem.load('minio'),
    path='...',
    version=os.getenv("GIT_COMMIT_SHA"),
    infrastructure=DockerContainer(
        auto_remove=True,
        image='prefect-default-worker:2.20.9',
        image_pull_policy='NEVER',
        network_mode='bridge',
    ),
).apply()
```
Where 'minio' is a `RemoteFileSystem` block containing the Minio connection data, and prefect-default-worker:2.20.9 is an image based on prefecthq/prefect:2.20.9-python3.10 with all the flow's dependencies installed. I have another container, prefect-agent:2.20.9, with
entrypoint: ["prefect", "agent", "start", "-q", "default"]
Reading the migration guide, I gather that I need to: 1) replace `infrastructure` with `work_pool_name`, 2) replace `storage` with a call to `flow.from_source`, and 3) replace
entrypoint: ["prefect", "agent", "start", "-q", "default"]
with
entrypoint: ["prefect", "worker", "start", "-p", "default"]
The new deployment script looks like this:
```python
my_flow.from_source(
    source=RemoteFileSystem.load("minio"),
    entrypoint='path to the file with @flow definition'
).deploy(
    name="same",
    work_pool_name='default',
    parameters={same},
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    job_variables=dict(image_pull_policy="NEVER", network_mode="bridge", auto_remove=True)
)
```
Before the migration, the script uploaded the flow's Python file (and the whole folder) to Minio and then created/updated the deployment.
Now it doesn't seem to do the upload to Minio. The script throws an error: FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/l9/q51p6spx7sb6j3gqdq5q7jhr0000gn/T/tmpi8gx9s0t/remote-file-system-minio/…'
If I manually put the files in Minio, it seems to load the flow. I can see that a ScriptError is raised while loading the flow's Python file. I'll share the error after I clarify this issue.
So I wonder whether I can keep the previous behaviour, or whether I should upload the flow's definition to Minio manually.
Marvin
10/21/2024, 10:36 AM
It sounds like you're transitioning from `Deployment.build_from_flow` to using `flow.deploy`. The transition involves some changes in how you specify certain configurations. Here are the key points and guidance on how you can update your code:
1. Deployment Method: Replace `Deployment.build_from_flow` with `flow.deploy`.
2. Infrastructure: The `infrastructure` parameter is now replaced with `work_pool_name`. If your infrastructure block was using `.publish_as_work_pool`, ensure you use the name of that created work pool.
3. Storage: Replace `storage` with `flow.from_source`. This function allows loading your flow from a remote storage location and making it deployable. You can pass your existing `RemoteFileSystem` block to the `source` argument (see the sketch after this list).
4. Prefect CLI: You can also use the `prefect deploy` command interactively if you prefer the CLI over a Python script for deployment creation.
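For example, here is a minimal sketch of points 1-3; the entrypoint path, deployment name, and pool name are illustrative, and with a pre-built image you would also pass build=False/push=False as in your current script:
```python
from prefect import flow
from prefect.filesystems import RemoteFileSystem

if __name__ == "__main__":
    flow.from_source(
        source=RemoteFileSystem.load("minio"),  # your existing Minio storage block
        entrypoint="flows/my_flow.py:my_flow",  # illustrative path:function inside the block's basepath
    ).deploy(
        name="my-deployment",                   # illustrative deployment name
        work_pool_name="default",               # pool served by `prefect worker start -p default`
    )
```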
For a detailed guide, kindly refer to the official documentation on upgrading to Prefect 3 here and deployment approaches in Prefect 3 here.
Additionally, the latest release notes for Prefect 3 include several bug fixes and improvements. It's always a good idea to make sure you're using the latest version.
I hope this helps with updating your deployment setup for Prefect 3! If you have any more questions, feel free to ask.
Marvin
10/22/2024, 6:17 AM
2. Install Dependencies: Ensure that the `fsspec` and `s3fs` libraries are installed, as they are required for working with Minio via the S3 protocol.
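As a quick sanity check that these libraries can reach your Minio instance, you could run a sketch like the following; the endpoint and credentials are placeholders, not values from your setup:
```python
import s3fs

# Placeholder credentials and endpoint; substitute your Minio details.
fs = s3fs.S3FileSystem(
    key="YOUR_ACCESS_KEY",
    secret="YOUR_SECRET_KEY",
    client_kwargs={"endpoint_url": "https://minio.example.com"},
)

# Listing the bucket should succeed if connectivity and credentials are correct.
print(fs.ls("your-bucket-name"))
```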
3. Python Script to Upload to Minio: You can use the `boto3` library to interact with Minio and upload your flow file to a bucket. Here's a basic example of uploading a file to Minio:
```python
import boto3
from botocore.client import Config

# Connect to Minio (S3-compatible endpoint)
s3_client = boto3.client(
    's3',
    endpoint_url='https://minio.example.com',
    aws_access_key_id='YOUR_ACCESS_KEY',
    aws_secret_access_key='YOUR_SECRET_KEY',
    config=Config(signature_version='s3v4'),
)

# Upload a file
s3_client.upload_file(
    'path/to/your/flow.py',
    'your-bucket-name',
    'your-folder/flow.py',
)
```
4. Specifying Minio Storage in Prefect: When creating a Prefect deployment, specify your flow's location in Minio. You can use `fsspec` to point Prefect to the S3-compatible Minio storage:
```python
from prefect import flow

if __name__ == "__main__":
    flow.from_source(
        source="s3://your-bucket-name/your-folder",
        entrypoint="flow.py:my_flow",
    ).deploy(
        name="deployment-from-minio",
        work_pool_name="my-docker-pool",
    )
```
Ensure you have the necessary network permissions and configurations in place, and that your environment can access Minio securely.
For more details, you can check the official documentation on flow code storage options, as it provides insights into storing flow code on different cloud storages like AWS S3, which Minio emulates.
Tsvetelin
10/22/2024, 6:44 AM
```python
Deployment.build_from_flow(
    name="...",
    flow=my_flow,
    parameters={..},
    storage=RemoteFileSystem.load('minio'),
    path='...',
    version=os.getenv("GIT_COMMIT_SHA"),
    infrastructure=DockerContainer(
        auto_remove=True,
        image='prefect-default-worker:2.20.9',
        image_pull_policy='NEVER',
        network_mode='bridge',
    ),
).apply()
```
This code was uploading to Minio without an external library. How can I achieve that with Prefect 3?
Marvin
10/22/2024, 6:45 AM
1. Storage Block: You can continue using `RemoteFileSystem` as in Prefect 2. This involves creating a storage block configuration for Minio using Prefect's UI or through code.
2. Create a Minio Storage Block:
Although a dedicated Minio storage block isn't provided, you can typically use the S3-compatible options Prefect offers, as Minio mimics the S3 API. You can create such a block that includes your Minio credentials and endpoint details.
3. Using the Flow Deployment Method:
With the storage block set, you can modify your deployment Python script:
```python
from prefect import flow
from prefect.filesystems import RemoteFileSystem

# Create a Minio storage block if it doesn't exist
minio_storage_block = RemoteFileSystem(
    basepath='s3://your-bucket-name/your-folder',
    settings={
        'key': 'YOUR_ACCESS_KEY',
        'secret': 'YOUR_SECRET_KEY',
        'client_kwargs': {'endpoint_url': 'https://minio.example.com'},
    },
)
minio_storage_block.save(name='minio-storage', overwrite=True)

# Use the storage block as the deployment's code source
if __name__ == "__main__":
    flow.from_source(
        source=minio_storage_block,
        entrypoint='flow.py:my_flow',
    ).deploy(
        name='my-deployment',
        work_pool_name='default-pool',
    )
```
4. Deploy the Flow:
Run your script that registers the deployment with Prefect, utilizing the storage block configuration.
5. Manage Blocks in Prefect UI:
You can also manage these blocks directly within the Prefect UI, which allows for visual handling of credentials and connection settings.
For more details on blocks in Prefect and possibly utilizing S3-compatible storage, check the Prefect Blocks Documentation for the latest guidelines and configurations. This approach allows you to maintain your workflow of uploading to Minio without needing third-party packages explicitly.
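Note that `from_source` pulls your code when the deployment is created and run; it does not upload files for you. If you want to reproduce Prefect 2's automatic upload step without a third-party client, one option is to push the flow folder with the block's own `put_directory` method before deploying. This is a sketch under assumptions: the block name, paths, and pool name are illustrative:
```python
from prefect import flow
from prefect.filesystems import RemoteFileSystem

if __name__ == "__main__":
    minio = RemoteFileSystem.load("minio-storage")

    # Mirror the Prefect 2 build_from_flow upload: copy the local flow folder
    # into the block's basepath (s3fs handles the transfer under the hood).
    minio.put_directory(local_path=".", to_path=".")

    # Then register the deployment, pointing at the code just uploaded.
    flow.from_source(
        source=minio,
        entrypoint="flow.py:my_flow",
    ).deploy(
        name="my-deployment",
        work_pool_name="default-pool",
    )
```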