Oliver Tedder
12/05/2024, 3:00 PM
mywheel that contains the code I want to run. The wheel has a module mymod in it that I want to run.
The wheel is published to artifactory/jfrog.
I have a prefect worker on a computer that has pip set up so that it is able to pip install the package from the artifactory.
• For example, I can do pip install mywheel in a terminal on that computer and it successfully retrieves and installs the wheel
• I can then run the code with python -m mywheel.mymod
How can I make a prefect deployment that will:
• run on that worker
• pull the wheel from artifactory and run the mymod module
Marvin
12/05/2024, 3:01 PM
mymod module after ensuring the wheel is installed.
2. Write the Logic to Install and Run the Module:
In your flow definition, include logic to run the pip install command against Artifactory. Here's a minimal example to get you started:
```python
from prefect import flow
import subprocess
import sys


@flow
def run_my_wheel():
    # Install the wheel from Artifactory using the worker's existing pip configuration
    subprocess.run([sys.executable, "-m", "pip", "install", "mywheel"], check=True)
    # Run the module; check=True fails the flow run if the module exits non-zero
    subprocess.run([sys.executable, "-m", "mywheel.mymod"], check=True)
```
3. Deploy the Flow:
Create a deployment that targets your specific worker by setting its configuration appropriately in the deployment script or UI. Specify the work pool that the deployment should use. This allows the worker to pull the flow and execute it.
4. Set up Work Pools and Workers:
Make sure the worker polling your work pool runs on the machine where pip install works, as you have described.
5. Run the Deployment:
Use the Prefect CLI or Prefect UI to start your deployment. Once started, it will trigger the worker, which will then execute the flow, installing the wheel and running the module.
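As a rough sketch of the "ensure the wheel is installed" part of the steps above, the flow could check for the package before calling pip. The helper names `is_installed` and `ensure_installed` are made up for illustration, and this assumes the distribution name matches what you pass to pip:

```python
import subprocess
import sys
from importlib import metadata


def is_installed(distribution: str) -> bool:
    """Return True if the named distribution is installed for this interpreter."""
    try:
        metadata.version(distribution)
        return True
    except metadata.PackageNotFoundError:
        return False


def ensure_installed(distribution: str) -> bool:
    """Install the distribution only when it is missing; return True if pip ran."""
    if is_installed(distribution):
        return False
    # Use the interpreter running the flow so pip targets the same environment
    subprocess.run([sys.executable, "-m", "pip", "install", distribution], check=True)
    return True
```

Note the trade-off: skipping the install when the package is already present means a newer wheel published to Artifactory would not be picked up, so always running pip install may be the better choice if you publish often.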
Here are a couple of useful links to Prefect's documentation that might help you with additional setup details:
- Prefect 2.x Work Pools: Work Pools Documentation
- Prefect 3.x Installation and Setup: Installation Documentation
If you find this setup a bit mundane, do not worry; it's a great way to automate your deployments, so it's worth the effort! If you need any further assistance or a debugging sidekick, I'm here to help.
Oliver Tedder
12/05/2024, 3:10 PM
mywheel.mymod includes prefect flows and tasks, will these be correctly run if I run the process externally like that?
Oliver Tedder
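12/05/2024, 4:21 PM
I am deploying using a gitlab repo instead. I am running this python deployment script
```python
from prefect import flow
from prefect.runner.storage import GitRepository

gitlab_repo = GitRepository(
    url="https://gitlab.com/mycompany/myrepo.git",
    branch="mybranch",
    include_submodules=True,
)
flow.from_source(
    source=gitlab_repo,
    entrypoint="myrepo/myfun.py:myfunc",
).deploy(
    name="mydeployment",
    work_pool_name="myworkpool",
)
```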
However I get this error:
(venv) D:\Git\myrepo>python prefect_deploy.py
Traceback (most recent call last):
File "D:\Git\myrepo\prefect_deploy.py", line 20, in <module>
flow.from_source(
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 399, in coroutine_wrapper
return run_coro_as_sync(ctx_call())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 243, in run_coro_as_sync
return call.result()
^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\_internal\concurrency\calls.py", line 312, in result
return self.future.result(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\_internal\concurrency\calls.py", line 182, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\concurrent\futures\_base.py", line 401, in __get_result
raise self._exception
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\_internal\concurrency\calls.py", line 383, in _run_async
result = await coro
^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 225, in coroutine_wrapper
return await task
^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 389, in ctx_call
result = await async_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\flows.py", line 1035, in from_source
with tempfile.TemporaryDirectory() as tmpdir:
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\tempfile.py", line 943, in __exit__
self.cleanup()
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\tempfile.py", line 947, in cleanup
self._rmtree(self.name, ignore_errors=self._ignore_cleanup_errors)
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\tempfile.py", line 929, in _rmtree
_shutil.rmtree(name, onerror=onerror)
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 787, in rmtree
return _rmtree_unsafe(path, onerror)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 629, in _rmtree_unsafe
_rmtree_unsafe(fullname, onerror)
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 629, in _rmtree_unsafe
_rmtree_unsafe(fullname, onerror)
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 629, in _rmtree_unsafe
_rmtree_unsafe(fullname, onerror)
[Previous line repeated 2 more times]
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 638, in _rmtree_unsafe
onerror(os.rmdir, path, sys.exc_info())
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 636, in _rmtree_unsafe
os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\OLIVER~1.TED\\AppData\\Local\\Temp\\tmp8ocl5mq4\\mybranch\\.git\\logs\\refs\\heads'
What is going on here?
Marvin
12/05/2024, 4:21 PM
I am deploying using a gitlab repo instead. I am running this python deployment script
gitlab_repo = GitRepository(
    url="https://gitlab.com/mycompany/myrepo.git",
    branch="mybranch",
    include_submodules=True
)
flow.from_source(
    source=gitlab_repo,
    entrypoint="myrepo/myfun.py:myfunc",
).deploy(
    name="mydeployment",
    work_pool_name="myworkpool",
)
However i get this error:
```
(venv) D:\Git\myrepo>python prefect_deploy.py
Traceback (most recent call last):
File "D:\Git\myrepo\prefect_deploy.py", line 20, in <module>
flow.from_source(
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 399, in coroutine_wrapper
return run_coro_as_sync(ctx_call())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\utilities\asyncutils.py", line 243, in run_coro_as_sync
return call.result()
^^^^^^^^^^^^^
File "D:\Git\myrepo\venv\Lib\site-packages\prefect\_internal\concurrency\
```
Marvin
12/05/2024, 4:22 PM
Oliver Tedder
12/05/2024, 4:23 PM
```python
gitlab_repo = GitRepository(
    url="https://gitlab.com/mycompany/myrepo.git",
    branch="mybranch",
    include_submodules=True,
)
flow.from_source(
    source=gitlab_repo,
    entrypoint="myrepo/myfun.py:myfunc",
).deploy(
    name="mydeployment",
    work_pool_name="myworkpool",
)
```
However I get this error:
...
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 638, in _rmtree_unsafe
onerror(os.rmdir, path, sys.exc_info())
File "C:\Users\Oliver.Tedder\.pyenv\pyenv-win\versions\3.11.9\Lib\shutil.py", line 636, in _rmtree_unsafe
os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\OLIVER~1.TED\\AppData\\Local\\Temp\\tmp8ocl5mq4\\mybranch\\.git\\logs\\refs\\heads'
What is going on here?
Marvin
12/05/2024, 4:23 PM
Oliver Tedder
12/05/2024, 4:25 PM
Oliver Tedder
12/05/2024, 4:39 PM
Marvin
12/05/2024, 4:39 PM
TMPDIR Environment Variable:
Python's tempfile module checks the TMPDIR, TEMP, and TMP environment variables, in that order, to decide where to store temporary files. You can set TMPDIR to a different path where you have full permissions.
- On Windows:
You can set it temporarily within your script or terminal session:
```python
import os
import tempfile

# Choose a custom temporary directory path (placeholder; use a real path)
custom_temp_dir = r"C:\path_to_custom_temp"
os.makedirs(custom_temp_dir, exist_ok=True)  # the directory must exist
os.environ["TMPDIR"] = custom_temp_dir
tempfile.tempdir = custom_temp_dir  # also set it for the tempfile module directly
```
- Alternatively, set it globally via the System Properties:
1. Search for "Environment Variables" in the Windows Start menu.
2. In the "System Properties" window, click on "Environment Variables...".
3. Create a new variable named TMPDIR and set it to your desired path.
2. Git Configuration for Temporary Directory:
You might also consider configuring Git to handle temporary files differently, although changing TMPDIR will usually cover this.
Remember to ensure that the directory you're setting as the new temporary directory has appropriate write permissions for your working user. This change should redirect temporary file handling to the new path you've specified, potentially resolving issues related to file locks or permission errors.
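To check the write-permission point above, here is a small sketch that points Python's tempfile module at a directory and confirms a file can actually be created there. The function name `use_custom_tempdir` is hypothetical, and this only affects the current Python process:

```python
import os
import tempfile
from pathlib import Path


def use_custom_tempdir(path: str) -> str:
    """Point tempfile at `path`, verify it is writable, and return the active temp dir."""
    Path(path).mkdir(parents=True, exist_ok=True)
    os.environ["TMPDIR"] = path  # inherited by child processes started afterwards
    tempfile.tempdir = path      # used by tempfile in this process
    # Write probe: raises an OSError subclass if the directory is not writable
    with tempfile.NamedTemporaryFile(dir=path) as probe:
        probe.write(b"ok")
    return tempfile.gettempdir()
```

You would call this at the top of the deployment script, before anything creates a temporary directory.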
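If things still go awry, let me know, and we can continue delving into this mystery together!
Oliver Tedder
12/05/2024, 4:41 PM
Marvin
12/05/2024, 4:42 PM
```python
import shutil

# Placeholder path; use the temporary directory you configured earlier
custom_temp_dir = r"C:\path_to_custom_temp"


def cleanup_temp_directory(path):
    """Delete the directory tree at path, reporting instead of raising on failure."""
    try:
        shutil.rmtree(path)
    except OSError as e:
        print(f"Failed to delete directory {path}. Reason: {e}")


# Usage
cleanup_temp_directory(custom_temp_dir)
```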
2. Investigate Underlying Git Behavior:
Ensure that there's no process hanging onto files in that directory after the deployment step is completed. Tools like Sysinternals' Process Explorer can help identify any processes holding file locks.
3. Check Prefect Configuration:
Double-check Prefect’s configurations or any custom scripts you employ to see if there's supposed to be a step that cleans up these directories that isn't executing correctly due to either missed configuration or error handling.
4. Error Handling Improvements:
Add error handling around sections that could result in this OSError to potentially log more detailed information about what's preventing the deletion.
5. Scheduled Task for Cleanup:
As a last resort, you might consider setting up a scheduled cleanup task using a script or task scheduler to periodically clear these directories until a more permanent solution is found.
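For the error-handling and cleanup ideas above, one possible sketch is a removal helper that clears the read-only attribute and retries. It assumes the deletion fails because of read-only files or a short-lived lock inside the cloned .git folder, which the traceback does not confirm. The names `force_rmtree` and `_clear_readonly` are made up for illustration:

```python
import os
import shutil
import stat
import sys
import time


def _clear_readonly(func, path, _exc):
    """rmtree error handler: make the path writable, then retry the failed call."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def force_rmtree(path, retries=3, delay=0.5):
    """Try to delete a directory tree, logging each failure; return True on success."""
    # Python 3.12 renamed the handler argument from onerror to onexc
    key = "onexc" if sys.version_info >= (3, 12) else "onerror"
    for attempt in range(1, retries + 1):
        try:
            shutil.rmtree(path, **{key: _clear_readonly})
            return True
        except OSError as exc:
            print(f"Attempt {attempt} failed to delete {path}: {exc}")
            time.sleep(delay)  # give other processes a moment to release locks
    return False
```

If every attempt fails, the printed messages at least name the exact file or folder that could not be removed, which is a useful starting point for Process Explorer.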
If the issue persists, another step might be reaching out for support on Prefect's forums or considering creating a GitHub issue to highlight and resolve this behavior.
Feel free to reach out again if you have more questions or need help on another aspect of this issue!
Oliver Tedder
12/05/2024, 5:34 PM
prefect.exceptions.PrefectHTTPStatusError: Client error '422 Unprocessable Entity' for url 'http://192.168.10.8:4200/api/flows/'
Response: {'exception_message': 'Invalid request received.', 'exception_detail': [{'type': 'extra_forbidden', 'loc': ['body', 'labels'], 'msg': 'Extra inputs are not permitted', 'input': {}}], 'request_body': {'name': 'run-prefect', 'tags': [], 'labels': {}}}
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422
What does that mean?
Marvin
12/05/2024, 5:34 PM
labels in the request body as causing the issue.
Here's a breakdown of what might be going wrong and how you could address it:
1. Labels and Inputs:
The error suggests that there is an unexpected "labels" attribute in your deployment configuration. In Prefect, labels may have been deprecated, replaced by tags or other methods, depending on the version you’re using. Since it indicates "extra inputs," Prefect is rejecting the inclusion of the labels field.
2. Check Deployment Configuration:
Review your deployment script or settings to make sure you're not including fields, like "labels," that are no longer supported due to version updates or changes. Prefect's API might now expect tags or other configuration options instead of labels.
3. Modify or Update Code:
Attempt to remove or modify any line that involves labels. Ensure your deployment configuration uses elements compatible with Prefect's accepted parameters:
```python
# Continues your deployment script: keep 'tags' if needed and pass no 'labels' argument
flow.from_source(
    source=gitlab_repo,
    entrypoint="myrepo/myfun.py:myfunc",
).deploy(
    name="mydeployment",
    work_pool_name="myworkpool",
    tags=[],
)
```
4. Documentation and Release Notes:
Check the Prefect documentation and latest release notes for any changes in deployment configuration that might have caused this problem.
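To make the breakdown above concrete, here is a small sketch that reads the response body from your error and lists which fields the server rejected as extra inputs. The dictionary is copied from the error message; the function name `rejected_fields` is made up for illustration:

```python
# Response body copied from the 422 error message
response = {
    "exception_message": "Invalid request received.",
    "exception_detail": [
        {
            "type": "extra_forbidden",
            "loc": ["body", "labels"],
            "msg": "Extra inputs are not permitted",
            "input": {},
        }
    ],
    "request_body": {"name": "run-prefect", "tags": [], "labels": {}},
}


def rejected_fields(body: dict) -> list:
    """Return the dotted location of every field rejected as an extra input."""
    return [
        ".".join(str(part) for part in detail["loc"])
        for detail in body.get("exception_detail", [])
        if detail.get("type") == "extra_forbidden"
    ]


print(rejected_fields(response))  # ['body.labels']
```

The `request_body` entry shows what was actually sent to the server, so it is worth comparing that against what your script passes.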
Solving this error involves aligning the deployment parameters with Prefect’s current expectations for allowed and disallowed inputs. Should you need more information about modifications in Prefect's configuration fields, consulting the documentation or release notes is an excellent next step.
If you have questions or need more help, do let me know!
Oliver Tedder
12/05/2024, 5:37 PM
```python
gitlab_repo = GitRepository(
    url="https://gitlab.com/mycompany/myrepo.git",
    branch="mybranch",
    include_submodules=True,
)
flow.from_source(
    source=gitlab_repo,
    entrypoint="myrepo/myfun.py:myfunc",
).deploy(
    name="mydeployment",
    work_pool_name="myworkpool",
    tags=[],
)
```
No mention of labels. I added the tags=[] param just in case, but still the same error.
Marvin
12/05/2024, 5:38 PM
labels, yet you're encountering an error related to labels, it might be one of the following:
1. Implicit Configuration:
There might be an implicit configuration or default settings somewhere in your environment, Prefect server configuration, or within the API client that's adding labels or expecting certain formats.
2. Prefect Version Mismatch:
Ensure that the version of your Prefect library is compatible with the server you are trying to communicate with. The request body in the error contains a labels field even though your script never sets one, which suggests the client library adds it itself; a server older than the client may not accept that field.
3. Server-Side Configuration:
Check if there are any server-side configurations or webhooks that might interact with the deployment payload, inserting unnecessary fields.
4. Environment Variables:
Sometimes environment variables can set default values that might not be visible in your code.
5. Custom Modifications:
If there are any custom scripts or middleware between your deployment script and Prefect API that could modify the request body, check and ensure they are configured properly.
Here's a plan to diagnose and potentially resolve the issue:
- Verify Versions: Ensure both the Prefect library and server are up-to-date and compatible.
- Check for Implicit Defaults: Check if there are implicit default configurations either on the client-side or server-side that automatically include labels.
- Server Logs and Configuration: Look into Prefect server logs for any insights into why this error is occurring. If you control the server, see if there are any configurations that might add or expect certain fields.
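For the "Verify Versions" step, a minimal sketch of the comparison is below. It assumes you have both version strings in hand, for example from `prefect.__version__` in the environment running the script and from `prefect version` on the machine running the server. The helper names and the version numbers in the usage lines are illustrative only, not taken from your setup:

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a dotted version like '3.1.0rc1' into a tuple of ints, here (3, 1, 0)."""
    numbers = []
    for piece in version.strip().split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def client_is_newer(client_version: str, server_version: str) -> bool:
    """Return True when the client library is newer than the server."""
    return parse_version(client_version) > parse_version(server_version)


# Illustrative values only
print(client_is_newer("3.1.5", "3.0.3"))
print(client_is_newer("3.0.3", "3.0.3"))
```

If the client turns out to be newer, aligning the two versions (upgrading the server or pinning the client) is the thing to try first.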
If none of these steps work, you might consider escalating the issue on Prefect’s forums, where more technical insights specific to their system can be offered. Or, contact the support team if you're using a managed Prefect service for guidance tailored to your setup. Keep up the resilience; debugging is part art, part science!
Oliver Tedder
12/06/2024, 11:42 AM
Marvin
12/06/2024, 11:42 AM