Mini Khosla
05/03/2024, 9:51 AM
Marvin
05/03/2024, 9:52 AM
`--storage-block` flag by using the `prefect deployment build` command, a default `LocalFileSystem` block is used. This will use your present working directory as its basepath. An agent running on the same machine will be able to successfully run your deployment. You may verify the default block's settings by inspecting the `deployment.yaml` file that is created after the build command.
2. Local flow code storage for experimentation: When deploying a flow with local storage using the interactive CLI experience, you can opt for a `Local Subprocess` work pool. Run `prefect deploy` from the root directory containing your flow code, and follow the prompts to specify details such as your flow code entrypoint and deployment name.
3. Switching to production storage: When moving to production, you might prefer using infrastructure-specific configurations for your code storage, such as git or cloud storage options depending on your requirements.
Please note that local file system storage is not recommended for running deployments on other machines as it assumes that an agent will have access to the same local path as the one from which the deployment was built. For distributed or cloud environments, you should use other storage options that are accessible from your execution environment, like remote file systems or cloud storage blocks.
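To make the caveat above concrete, here is a minimal sketch in plain Python (it does not use the Prefect API) of a pre-flight check that tests whether a flow file is reachable under a given basepath on the current machine. The helper name `flow_file_reachable` and the file name `my_flow.py` are hypothetical and only for illustration.

```python
import tempfile
from pathlib import Path


def flow_file_reachable(basepath: str, flow_file: str) -> bool:
    """Return True if flow_file exists under basepath on this machine."""
    return (Path(basepath) / flow_file).is_file()


# Simulate the machine where the deployment was built: the file is present.
with tempfile.TemporaryDirectory() as build_dir:
    (Path(build_dir) / "my_flow.py").write_text("# flow code\n")
    print(flow_file_reachable(build_dir, "my_flow.py"))

# Once the directory is gone (as it would be on a different machine),
# the same basepath no longer leads to the flow code.
print(flow_file_reachable(build_dir, "my_flow.py"))
```

If a check like this fails on the machine that executes the flow, that is a sign the deployment needs storage reachable from that machine rather than a local path.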
For more details on configuring storage for deployment, you may check out the following resources from the Prefect documentation:
- General Storage Concepts
- Guide on Where to Store Your Flow Code
Remember to substitute any outdated terminology like "agents" with the updated "workers" throughout your implementation, as agents are deprecated in favor of workers.
Mini Khosla
05/03/2024, 1:18 PM
Marvin
05/03/2024, 1:18 PM
```python
from prefect import flow


@flow(log_prints=True)
def my_workflow():
    print("I can call any python code here, including Prefect tasks or flows")


if __name__ == "__main__":
    my_workflow()
    # If you were to serve or deploy, use an adjusted path like "../my-flow.py"
```
When you're ready to create a deployment for a flow file located one directory above your current directory, pass the relative path to the flow file as part of the entrypoint in the CLI command, similar to the example below:
```bash
prefect deploy ../my-flow.py:my_workflow -n <DEPLOYMENT-NAME>
```
Replace `<DEPLOYMENT-NAME>` with the name of your deployment and `../my-flow.py:my_workflow` with your entrypoint: the relative path to your flow file, which here is one level above the current directory, followed by a colon and the name of the flow function.
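As a rough illustration of how a relative path such as `../my-flow.py` is interpreted, the sketch below resolves it against a working directory using only the standard library. It assumes the `path/to/file.py:function_name` entrypoint form; the helper `resolve_entrypoint` and the directory `/home/user/project/deploy` are hypothetical.

```python
import posixpath
from typing import Tuple


def resolve_entrypoint(cwd: str, entrypoint: str) -> Tuple[str, str]:
    """Split 'file.py:function' and resolve the file part against cwd."""
    file_part, _, func_name = entrypoint.partition(":")
    # normpath collapses the ".." segment without touching the filesystem
    resolved = posixpath.normpath(posixpath.join(cwd, file_part))
    return resolved, func_name


# Hypothetical working directory, one level below the flow file
path, func = resolve_entrypoint("/home/user/project/deploy", "../my-flow.py:my_workflow")
print(path)  # /home/user/project/my-flow.py
print(func)  # my_workflow
```

The point is only that the path is relative to the directory the command is run from, so running the same command from a different directory refers to a different file.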
Please note that this example is indicative of how to use a relative path; specific CLI commands may vary based on Prefect version and the exact environment setup. For exact details on the CLI commands and options, you should always refer to the Prefect documentation and the official Prefect GitHub repository.
Mini Khosla
05/03/2024, 1:31 PM
Mini Khosla
05/03/2024, 1:31 PM
Marvin
05/03/2024, 1:31 PM
The `refresh_cache` option for a task is used to force the task to update the data associated with its cache key, effectively ignoring any existing cached results. When `refresh_cache` is set to `True` for a task, it will always perform its computation and refresh the cache, regardless of whether there are cached results available under the same cache key.
An example use case for `refresh_cache` would be when a task is generally expensive and you'd usually want to use cached results, but occasionally you need to ensure the task runs again to fetch the most up-to-date data. By setting `refresh_cache=True`, you can force the task to rerun and update its cache.
Here's an example of how you might use `refresh_cache`:
```python
import random

from prefect import task


def static_cache_key(context, parameters):
    # Return a constant cache key so every run maps to the same cache entry
    return "static cache key"


@task(cache_key_fn=static_cache_key, refresh_cache=True)
def caching_task():
    # This code will always run and update the cache, instead of using cached data
    return random.random()
```
In the example, `caching_task` is decorated with `@task` and `refresh_cache=True`, which means it will ignore any cached results and will perform its computations every time it is run.
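To see the described behavior without running Prefect, here is a toy model of the lookup logic in plain Python. It is a sketch of the semantics described above, not Prefect's actual implementation; `run_with_cache`, the dictionary cache, and the counter-based `expensive` function are all hypothetical stand-ins.

```python
import itertools
from typing import Any, Callable, Dict


def run_with_cache(
    cache: Dict[str, Any],
    cache_key: str,
    fn: Callable[[], Any],
    refresh_cache: bool = False,
) -> Any:
    """Toy model of a cached call; not Prefect's implementation."""
    if not refresh_cache and cache_key in cache:
        # Cache hit: skip the computation and reuse the stored result
        return cache[cache_key]
    # Cache miss or forced refresh: compute and overwrite the entry
    result = fn()
    cache[cache_key] = result
    return result


cache: Dict[str, Any] = {}
counter = itertools.count(1)


def expensive() -> int:
    # Stand-in for an expensive computation; returns 1, 2, 3, ...
    return next(counter)


print(run_with_cache(cache, "static cache key", expensive))  # 1
print(run_with_cache(cache, "static cache key", expensive))  # 1 (cached)
print(run_with_cache(cache, "static cache key", expensive, refresh_cache=True))  # 2
print(run_with_cache(cache, "static cache key", expensive))  # 2 (refreshed value)
```

Note that the refreshed value is what later non-refreshing calls see, which is why the option suits occasional forced updates of an otherwise cached result.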
You can learn more about task caching and the `refresh_cache` option from the Prefect documentation: Task Caching and Refreshing the Cache.