
Bradley Hurley

05/03/2022, 8:34 PM
Hi Prefect Experts - I'm looking for a little guidance on using the ExecuteNotebook task.
I have a really basic test flow that is something like:
from prefect.tasks.jupyter.jupyter import ExecuteNotebook
execute_jupyter_notebook_task = ExecuteNotebook(output_format="json", log_output=True, path="s3://bucket/dir/dir/dir/hello_world.ipynb")
execute_jupyter_notebook_task()

Kevin Kho

05/03/2022, 8:35 PM
What is the question? Are you trying to use it outside a Flow?

Bradley Hurley

05/03/2022, 8:35 PM
I'm using it inside a flow, but I just wanted to really strip it down.
I can't seem to get any logs, and it doesn't actually seem to be running.

Kevin Kho

05/03/2022, 8:36 PM
Ah I see. Are you sure you can pass an s3 path?

Bradley Hurley

05/03/2022, 8:36 PM
The docs say:
path (string, optional): path to fetch the notebook from. Can be a cloud storage path. Can also be provided post-initialization by calling this task instance
My notebook only contains this:
import logging
import boto3

logger = logging.getLogger("papermill")
print("Hello World!")
logger.info("Hello World!")

client = boto3.client("s3")
client.put_object(Bucket="Bucket", Key="path/output.txt", Body="Success")

Kevin Kho

05/03/2022, 8:38 PM
I don’t have any better suggestion except to try the papermill code under the hood and see if it works here
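
For reference, a minimal sketch of what "trying the papermill code" directly could look like. The helper name run_notebook_directly and the output path are hypothetical, and this is not necessarily what ExecuteNotebook does internally. It assumes papermill is installed with its S3 extra (papermill[s3]) so that s3:// paths can be read.

```python
import logging


def run_notebook_directly(input_path: str, output_path: str):
    """Execute a notebook with papermill alone, bypassing Prefect.

    Hypothetical helper for debugging; not part of Prefect or papermill.
    """
    # Imported lazily so this module still loads where papermill is missing.
    import papermill as pm

    # With log_output=True, cell output goes to the "papermill" logger,
    # which prints nothing unless logging has a handler configured.
    logging.basicConfig(level=logging.INFO)
    return pm.execute_notebook(input_path, output_path, log_output=True)


# Example call (paths taken from the thread; output name is an assumption):
# run_notebook_directly("s3://bucket/dir/dir/dir/hello_world.ipynb", "output.ipynb")
```

If this runs and logs "Hello World!" but the Prefect task does not, the problem is more likely in how the task or its logging is configured than in papermill's S3 handling.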

Bradley Hurley

05/03/2022, 8:39 PM
Thanks! I searched the slack here and the internet and wasn't able to find much.

Kevin Kho

05/03/2022, 8:41 PM
Not a ton of people use this but I wrote the tests for this before and I think it works on local notebooks for sure

Bradley Hurley

05/03/2022, 8:43 PM
I did see your name on https://github.com/PrefectHQ/prefect/issues/4265 and the linked/merged PR

Kevin Kho

05/03/2022, 8:44 PM
Yes…that was my first or second week here. Let’s not put that much faith in me either 😅. Would not be surprised if a papermill upgrade caused issues, but I never tested on S3

Bradley Hurley

05/03/2022, 8:46 PM
I will report back with my findings. It might take a day.

Kevin Kho

05/03/2022, 8:47 PM
Thank you!

Bradley Hurley

05/04/2022, 5:05 PM
The follow-up on this was multi-pronged.
• We are using sparkmagic, and the first thing that happens when a cell is executed is that sparkmagic attempts to create a SparkContext. The task was reporting success even though the logs retrieved from the Kubernetes job pod showed the actual error message.
• I overlooked the instructions in the PR and didn't enable the papermill logger via Prefect.
• Fixing the SparkContext issues and enabling the logger via the env variable PREFECT__LOGGING__EXTRA_LOGGERS corrected my issues.
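
A minimal sketch of the logger fix described above, assuming Prefect 1.x, where PREFECT__LOGGING__EXTRA_LOGGERS takes a list written as a string such as "['papermill']". The helper extra_loggers_env is hypothetical and only builds the environment mapping.

```python
import os


def extra_loggers_env(logger_names):
    """Build the env mapping that asks Prefect 1.x to capture extra loggers.

    Hypothetical helper; the value is a list literal such as "['papermill']".
    """
    return {"PREFECT__LOGGING__EXTRA_LOGGERS": str(list(logger_names))}


env = extra_loggers_env(["papermill"])

# Option 1: set it for the current process before prefect is imported.
os.environ.update(env)

# Option 2 (assumption: the flow runs as a Kubernetes job): pass the same
# mapping to the run config, e.g. KubernetesRun(env=env), so the job pod
# sees the variable.
```

The variable has to be visible in the environment where the flow actually executes (here, the Kubernetes job pod), not only on the machine that registers the flow.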