Hi. I'm trialing prefect, looks good so far and hy...
# ask-community
c
Hi. I'm trialing Prefect. It looks good so far, and the hybrid model in particular is almost a great fit for us. But I'm concerned about data leakage via logs and exceptions, as listed here: https://docs.prefect.io/orchestration/faq/dataflow.html. Is there any code out there that would prevent these from going to the cloud, but still allow them to be accessed locally?
j
Hi @ct - for results you can set checkpointing to false. However, since parameters are entered via the API or UI, I'm not sure how you'd prevent them from going to the cloud. You could instead set up your flows to take a path to the data (or similar), so that you don't need to enter anything sensitive as a parameter. If you have any more questions about security, I'm happy to connect you to our sales team, who are great at answering this type of question!
You can also add your own logging filter if you want to control which logs get shared:
```python
import logging

import prefect
from prefect import task, Flow


class AwesomeFilter(logging.Filter):
    def filter(self, rec):
        # Drop any record whose raw message contains "tes".
        # You may need to filter based on `getMessage()` if
        # you can't find the information in the pre-formatted msg field.
        if 'tes' in rec.msg:
            return 0
        return 1


@task
def abc():
    logger.info("teting")        # gets shown
    logger.info("testing")       # gets hidden
    return 1


@task
def bcd():
    logger.info("teting")        # gets shown
    logger.info("testing")       # gets hidden
    return 1


with Flow("test") as flow:
    # A `with` block does not create a new scope, so `logger` is a
    # module-level name that the tasks above can use.
    logger = prefect.context.get("logger")
    logger.addFilter(AwesomeFilter())
    abc()
    bcd()

flow.run()
```
c
If this is helpful, this is how we approach the exceptions problem. We wrap the Prefect `task` decorator with our own decorator to catch exceptions before they bubble up to Prefect Cloud. They are still logged locally on the agent and can be searched by error type later. Something like this:
```python
import traceback
from functools import partial, wraps

from prefect import task


def custom_task(func=None, **task_init_kwargs):
    # Support both `@custom_task` and `@custom_task(...)` with task options
    if func is None:
        return partial(custom_task, **task_init_kwargs)

    @wraps(func)  # also copies func.__name__ onto safe_func
    def safe_func(**kwargs):
        try:
            return func(**kwargs)
        except Exception as e:
            # The full traceback is only printed locally
            print(f"Full Traceback: {traceback.format_exc()}")
            # `from None` is necessary to not log the original stack trace
            raise RuntimeError(type(e)) from None

    return task(safe_func, **task_init_kwargs)
```
Then just decorate your functions with @custom_task
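For anyone wondering why the `from None` matters, here is a rough sketch using only the standard library (no Prefect involved). The names `redact_errors`, `parse_account`, `formatted_error`, and the account number are made up for illustration. It compares what ends up in the formatted traceback with and without suppressing the exception context.

```python
import traceback
from functools import wraps


def redact_errors(func, suppress_context=True):
    """Hypothetical helper: re-raise any error as a RuntimeError naming only its type."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if suppress_context:
                # `from None` hides the original exception from the traceback
                raise RuntimeError(type(exc).__name__) from None
            # Without it, Python chains the original exception implicitly
            raise RuntimeError(type(exc).__name__)
    return wrapper


def parse_account(value):
    # Simulates a task whose error message leaks sensitive input
    raise ValueError(f"cannot parse account number {value}")


def formatted_error(func, value):
    """Return the traceback text a log handler would typically emit."""
    try:
        func(value)
    except RuntimeError as err:
        return "".join(
            traceback.format_exception(type(err), err, err.__traceback__)
        )
    return ""


ACCOUNT = "4111-0000-1234"
hidden = formatted_error(redact_errors(parse_account), ACCOUNT)
chained = formatted_error(
    redact_errors(parse_account, suppress_context=False), ACCOUNT
)
```

Here `hidden` ends with `RuntimeError: ValueError` and never mentions the account number, while `chained` still carries the original `ValueError` message, including the sensitive value.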
c
That's really helpful, thanks both. So if we apply a filter to the logs as Jenny's suggesting, does it still log locally, just not get sent to the cloud?
j
I'd go with Chris's suggestion as it gives you local agent logs too!
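On the local-versus-cloud question, a minimal standard-library sketch of the general `logging` behaviour may help. A filter added to a logger drops the record before any handler sees it, while a filter added to a single handler only affects that handler. `ListHandler` and `SubstringFilter` are hypothetical stand-ins, and this assumes the handler that ships logs to the cloud behaves like an ordinary `logging.Handler`; it does not use Prefect's own handler classes.

```python
import logging


class SubstringFilter(logging.Filter):
    """Drop any record whose formatted message contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle

    def filter(self, record):
        return self.needle not in record.getMessage()


class ListHandler(logging.Handler):
    """Hypothetical stand-in for a real handler; keeps messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def run_demo(name, filter_on_logger):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    local, remote = ListHandler(), ListHandler()
    logger.addHandler(local)
    logger.addHandler(remote)

    flt = SubstringFilter("secret")
    if filter_on_logger:
        logger.addFilter(flt)   # record is dropped for every handler
    else:
        remote.addFilter(flt)   # record is dropped for this handler only

    logger.info("run started")
    logger.info("secret token loaded")
    return local.messages, remote.messages


logger_level = run_demo("demo.logger_level", filter_on_logger=True)
handler_level = run_demo("demo.handler_level", filter_on_logger=False)
```

With the filter on the logger, neither handler receives the "secret" message. With the filter on the `remote` handler only, `local` still records both messages.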