@Marvin to create artifacts, is it needed to ...
# ask-marvin
p
@Marvin to create artifacts, do I need an artifact backend like Artifactory?
m
thought for 28.7 seconds
No, you don't need an external artifact backend like Artifactory to use Prefect artifacts. Prefect artifacts are automatically stored in your Prefect server (which you're running via Helm chart) or Prefect Cloud.

Prefect artifacts are designed to be human-readable visualizations and metadata that help you track and debug your flows. They support various formats including:
- Tables
- Progress indicators
- Images
- Markdown
- Links

Here's a simple example of how to create an artifact:
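```python
from prefect import flow
from prefect.artifacts import create_markdown_artifact


@flow
def example_flow():
    # Publish a short Markdown summary; it is stored by the Prefect API
    # and shown in the UI under the key "summary".
    create_markdown_artifact(
        key="summary",
        markdown="## Flow Results\nProcessed 1000 records successfully",
    )


if __name__ == "__main__":
    example_flow()
```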
Since you're using a self-hosted Prefect server, the artifacts will be stored directly in your server's database. You can view them in the Prefect UI after they're created.

The artifacts system is particularly useful for:
- Visualizing flow outputs
- Tracking progress over time
- Debugging by seeing when and where results were written
- Creating links to external storage locations (like S3 buckets)

You can find more details about artifacts in the official documentation.
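Tables are another of the formats listed earlier. As a rough sketch, `create_table_artifact` accepts a list of dicts, so the rows can be built with a plain helper first. The helper names, the `record-summary` key, and the sample counts are assumptions made up for illustration; they are not part of your setup.

```python
from typing import Any


def build_summary_rows(counts: dict[str, int]) -> list[dict[str, Any]]:
    """Turn a {status: count} mapping into table rows (a list of dicts)."""
    total = sum(counts.values())
    return [
        {
            "status": status,
            "count": n,
            # Share of all records, or "n/a" when every count is zero.
            "share": f"{n / total:.0%}" if total else "n/a",
        }
        for status, n in sorted(counts.items())
    ]


def publish_summary(counts: dict[str, int]) -> None:
    """Publish the rows as a table artifact; intended to be called from a flow or task."""
    # Imported here so the pure helper above stays usable without Prefect installed.
    from prefect.artifacts import create_table_artifact

    create_table_artifact(
        key="record-summary",  # hypothetical key: lowercase letters, numbers, dashes
        table=build_summary_rows(counts),
        description="Records processed, grouped by status",
    )
```

Calling `publish_summary({"ok": 990, "failed": 10})` from inside a flow should produce a two-row table in the UI. Keeping the row-building logic separate from the Prefect call makes it easy to unit test without a server.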
p
@Marvin so artifacts are not suitable for storing binary outputs from flows? Like pdf files
m
thought for 71.2 seconds
No, artifacts are not suitable for storing binary outputs like PDF files. Artifacts are designed for human-readable metadata and visualizations that can be displayed directly in the Prefect UI.

For binary files like PDFs, you should:
1. Store the files in an external storage system (like AWS S3, GCS, or any other file storage system)
2. Optionally create a link artifact that points to where the file is stored

Here's an example approach:
```python
from prefect import flow
from prefect.artifacts import create_link_artifact
import boto3  # or your preferred storage client


def generate_pdf() -> bytes:
    """Placeholder: replace with your PDF generation logic."""
    raise NotImplementedError


@flow
def generate_pdf_report():
    # Generate the PDF as bytes
    pdf_content = generate_pdf()

    # Store the PDF in your storage system
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="your-bucket",
        Key="reports/report.pdf",
        Body=pdf_content,
    )

    # Create a link artifact to the stored PDF.
    # Artifact keys may only contain lowercase letters, numbers, and dashes.
    create_link_artifact(
        key="pdf-report",
        link="https://your-bucket.s3.amazonaws.com/reports/report.pdf",
        description="Generated PDF Report",
    )
```
For proper handling of binary outputs, you should:
1. Use appropriate storage systems (S3, GCS, Azure Blob Storage, etc.)
2. Consider implementing proper versioning and cleanup policies
3. Use artifacts to create links to these stored files, making them easily accessible from the Prefect UI
4. Consider using Prefect's result storage functionality for managing where outputs are stored

This separation of concerns (storing binary data in appropriate storage systems while using artifacts for metadata and links) is the recommended approach for handling binary outputs in your flows.
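One possible way to approach the versioning point is to derive the object key from the date and a hash of the file content, so each run writes a distinct object instead of overwriting `reports/report.pdf`. This is only a sketch: the function names, the key layout, and the 12-character digest length are assumptions chosen for illustration, not a Prefect or S3 convention.

```python
import hashlib
from datetime import date
from pathlib import PurePosixPath


def versioned_object_key(prefix: str, filename: str, content: bytes, day: date) -> str:
    """Build a storage key grouped by date and made unique by a content hash."""
    digest = hashlib.sha256(content).hexdigest()[:12]  # short content fingerprint
    path = PurePosixPath(filename)
    return f"{prefix}/{day:%Y/%m/%d}/{path.stem}-{digest}{path.suffix}"


def s3_url(bucket: str, key: str) -> str:
    """Virtual-hosted-style URL, in the same form as the link in the example above."""
    return f"https://{bucket}.s3.amazonaws.com/{key}"
```

In the flow, `key = versioned_object_key("reports", "report.pdf", pdf_content, date.today())` could then be passed as `Key=key` to `put_object`, and `s3_url("your-bucket", key)` as the `link` of the link artifact. A date-based prefix also tends to make cleanup rules simpler to write, since old objects share a common path.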