Prefect Community #ask-community

Hi Prefect Experts, I'm looking for a little guidance on using:

from prefect.tasks.jupyter.jupyter import ExecuteNotebook
execute_jupyter_notebook_task = ExecuteNotebook(output_format="json", log_output=True, path="s3://bucket/dir/dir/dir/hello_world.ipynb")
execute_jupyter_notebook_task()

Kevin Kho

05/03/2022, 8:35 PM

What is the question? Are you trying to use it outside a Flow?

Bradley Hurley

05/03/2022, 8:35 PM

I'm using it inside a flow, but I just wanted to really strip it down.

Bradley Hurley

05/03/2022, 8:35 PM

I can't seem to get any logs and it doesn't actually seem to be running.

Kevin Kho

05/03/2022, 8:36 PM

Ah I see. Are you sure you can pass an s3 path?

Bradley Hurley

05/03/2022, 8:36 PM

The docs say:

path (string, optional): path to fetch the notebook from. Can be a cloud storage path. Can also be provided post-initialization by calling this task instance

Bradley Hurley

05/03/2022, 8:37 PM

My notebook only contains this:

import logging
import boto3

logger = logging.getLogger("papermill")
print("Hello World!")
logger.info("Hello World!")

client = boto3.client("s3")
client.put_object(Bucket="Bucket", Key="path/output.txt", Body="Success")

Kevin Kho

05/03/2022, 8:38 PM

I don’t have any better suggestion except to try the papermill code under the hood and see if it works here
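
For anyone following along, running papermill directly might look roughly like the sketch below. It is a minimal example, not the exact code inside the Prefect task: the helper name `run_notebook_directly` and the local `output.ipynb` path are made up for illustration, and reading from `s3://` assumes papermill was installed with its S3 extra.

```python
import logging


def run_notebook_directly(input_path, output_path="output.ipynb"):
    """Execute a notebook with papermill, bypassing the Prefect task (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded even where papermill is not installed.
    import papermill as pm

    # Show INFO-level records, including those from the "papermill" logger.
    logging.basicConfig(level=logging.INFO)

    # log_output=True asks papermill to write cell output to its logger.
    return pm.execute_notebook(input_path, output_path, log_output=True)


# Example call (not executed here; the path is the placeholder from this thread):
# run_notebook_directly("s3://bucket/dir/dir/dir/hello_world.ipynb")
```

If this fails or stays silent outside Prefect too, the problem is more likely in papermill or the notebook kernel than in the task wrapper.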

Bradley Hurley

05/03/2022, 8:39 PM

Thanks! I searched the slack here and the internet and wasn't able to find much.

Kevin Kho

05/03/2022, 8:41 PM

Not a ton of people use this but I wrote the tests for this before and I think it works on local notebooks for sure

Bradley Hurley

05/03/2022, 8:43 PM

I did see your name on https://github.com/PrefectHQ/prefect/issues/4265 and the linked/merged PR

Kevin Kho

05/03/2022, 8:44 PM

Yes…that was my first or second week here. Let’s not put that much faith in me either 😅. Would not be surprised if a papermill upgrade caused issues, but I never tested on S3

Bradley Hurley

05/03/2022, 8:46 PM

I will report back with my findings. It might take a day.

👍 1

Kevin Kho

05/03/2022, 8:47 PM

Thank you!

Bradley Hurley

05/04/2022, 5:05 PM

The followup on this was multi-pronged.
• We are using sparkmagic, and the first thing that happens when a cell is executed is that sparkmagic attempts to create a SparkContext. The task was reporting success even though the logs retrieved from the Kubernetes job pod showed the actual error message.
• I overlooked the instructions in the PR and didn't enable the papermill logger via Prefect.
• Fixing the SparkContext issues and enabling the logger via the env variable PREFECT__LOGGING__EXTRA_LOGGERS corrected my issues.
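
A rough sketch of the logger setting, for reference. It assumes Prefect 1.x, where the variable holds a list of logger names written as a string, and it only sets and parses the value locally. In a real deployment the variable has to be present in the environment where the flow actually runs (here, the Kubernetes job pod), and how it gets there is deployment-specific.

```python
import ast
import os

# Assumption: Prefect 1.x loads its configuration when `prefect` is imported,
# so the variable must be set in the flow's runtime environment beforehand.
os.environ["PREFECT__LOGGING__EXTRA_LOGGERS"] = "['papermill']"

# The value is a Python-style list literal; parse it to check it is well-formed.
extra_loggers = ast.literal_eval(os.environ["PREFECT__LOGGING__EXTRA_LOGGERS"])
print(extra_loggers)
```

The shell equivalent would be `export PREFECT__LOGGING__EXTRA_LOGGERS="['papermill']"`.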
