# ask-community
g
Hello! I was trying to migrate my legacy data pipeline into a Prefect flow by refactoring the main() of each job into a task. This runs fine, but the problem is that logs from functions outside of each main() method are not logged using prefect.context and hence do not appear in the Prefect UI. Is there a way for Prefect to capture all logs without explicitly changing the logger in each of the subfunctions to use the one from prefect.context? I know that making each job into a subprocess and then redirecting its stdout into Prefect's logger will work, but I would like to avoid using subprocesses in general. Besides, this method also seems very hackish haha.
Each task calls subfunctions from other files, as shown below:
@task
def process_data():
    ...
    while start_date < end_date:
        processor.run()
    ...
The logs for run() will not be captured by Prefect.
a
g
Thanks for the reply! Throughout my projects, in every file, there is a logger initialised this way:
import logging
logger = logging.getLogger(__name__)
Can you advise me on how I can get those logs to display on prefect server?
k
Hey @Goh Rui Zhi, my understanding is that doing it that way preserves the logger hierarchy of the logging module. So if you have loggers named like module.submodule and module.submodule2, attaching module to the extra loggers, like in the link Amanda showed, will pick up all of them.
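To illustrate the hierarchy point, here is a minimal sketch using only the standard logging module. The ListHandler class is a hypothetical stand-in for whatever handler Prefect attaches to each extra logger, and my_module and other_module are placeholder names. It only shows that a handler on a parent logger also receives records from its dotted children.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for the handler attached to an extra logger."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        # Store the logger name and message so we can inspect what arrived.
        self.records.append(f"{record.name}: {record.getMessage()}")


handler = ListHandler()

# Attach the handler once, to the top-level package logger only.
parent = logging.getLogger("my_module")
parent.setLevel(logging.INFO)
parent.addHandler(handler)

# Loggers created via logging.getLogger(__name__) in each file become
# children of "my_module" and propagate their records up to it.
logging.getLogger("my_module.submodule").info("from submodule")
logging.getLogger("my_module.submodule2").info("from submodule2")

# A logger outside the "my_module" hierarchy never reaches the handler.
logging.getLogger("other_module").warning("not captured")

captured = handler.records
print(captured)
```

Under these assumptions, listing only the top-level package name should be enough as long as every file uses logging.getLogger(__name__) inside that package.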
g
I see, I will try that in a bit. Thank you both :)
Hmmm, somehow my local Prefect configs are not sent over to my prefect-server when I do flow.register(). I tried directly setting the environment variables in the prefect-server UI to include my extra loggers and it worked perfectly. What am I missing here?
k
Could you show me how you set the environment variables? Maybe the syntax was just a bit off?
g
I simply did
export PREFECT__LOGGING__EXTRA_LOGGERS="['my_module']"
The prefect server is running on GKE.
k
The environment variables from your local machine won’t get carried over. It needs to be set where the Flow is running. Could you try doing it through the RunConfig with
KubernetesRun(env={"PREFECT__LOGGING__EXTRA_LOGGERS": "['my_module']"})
g
That's exactly what I needed! Thanks again @Kevin Kho :)