What does it mean when on invoking a system call from within Prefect Community #ask-community

Jay Sundaram

04/08/2021, 3:08 PM

What does it mean when, on invoking a system call from within a task, a <Parameter> is used(?) instead of the expected string?

--destination_bucket_name <Parameter: destination_bucket_name>

when I was expecting:

--destination_bucket_name myorg-S3-bucket

Dylan

04/08/2021, 3:11 PM

Hi @Jay Sundaram! Can you share a short reproducible example?

Dylan

04/08/2021, 3:13 PM

Execution within tasks is deferred to runtime

Dylan

04/08/2021, 3:13 PM

So, if you’re passing a Parameter as an argument to a Task, that parameter isn’t populated until runtime, either

Dylan

04/08/2021, 3:14 PM

When the script is loaded (a moment we sometimes call “build” or “definition” time) none of the code inside your tasks is run

Dylan

04/08/2021, 3:14 PM

This time is used for Prefect to figure out which tasks depend on each other and to define a dependency graph

Dylan

04/08/2021, 3:14 PM

Then, when you call flow.run() locally or an agent starts a flow run, parameters are defined and the code inside your tasks is executed

Dylan

04/08/2021, 3:15 PM

This sounds like you may be printing something at script definition instead of run time
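
The build-time versus run-time distinction described above can be sketched without Prefect at all. The following is a minimal illustration, assuming a toy placeholder class (ToyParameter and build_cmd are hypothetical names, not Prefect's API) whose repr mimics the output shown in the question. Formatting the placeholder into an f-string at definition time embeds the placeholder's repr, while formatting inside a function that is called later embeds the real value.

```python
# Toy stand-in for a flow parameter; NOT Prefect's real Parameter class.
class ToyParameter:
    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def __repr__(self):
        return f"<Parameter: {self.name}>"


def build_cmd(bucket):
    # Runs only when called with a concrete value, like the body of a task.
    return f"myexecutable --destination_bucket_name {bucket}"


param = ToyParameter("destination_bucket_name", default="myorg-S3-bucket")

# Definition time: the f-string formats the placeholder object itself.
built_too_early = f"myexecutable --destination_bucket_name {param}"

# Run time: the placeholder has been resolved to a real string first.
built_at_runtime = build_cmd(param.default)

print(built_too_early)
print(built_at_runtime)
```

In this sketch the first string contains the placeholder's repr and the second contains the bucket name, which is the same symptom and the same cure discussed in this thread: build the command inside something that executes at run time.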

Jay Sundaram

04/08/2021, 3:17 PM

def _execute_cmd(cmd):
    p = subprocess.Popen(cmd, shell=True)
    ...

@task()
def convert(destination_bucket_name):
    cmd = f"myexecutable --destination_bucket_name {destination_bucket_name}"
    _execute_cmd(cmd)


with Flow('myflow') as flow:
    destination_bucket_name = Parameter("destination_bucket_name", DEFAULT_DESTINATION_BUCKET_NAME)
    convert(destination_bucket_name)

Something like this.

Kevin Kho

04/08/2021, 3:18 PM

Hey @Jay Sundaram, it might help you to use our ShellTask to run that subprocess call: https://docs.prefect.io/api/latest/tasks/shell.html

upvote 1

Dylan

04/08/2021, 3:18 PM

That was going to be my suggestion 😄

Kevin Kho

04/08/2021, 3:19 PM

This should defer everything to run time and help you use the Parameter

Jay Sundaram

04/08/2021, 3:26 PM

How do I ensure downstream tasks wait for the completion of the shell task?

Dylan

04/08/2021, 3:26 PM

There are two ways to accomplish this

Dylan

04/08/2021, 3:27 PM

If a task returns something, passing that result to the next task will create the dependency automatically

Dylan

04/08/2021, 3:27 PM

You can also explicitly set dependencies between tasks

Dylan

04/08/2021, 3:28 PM

https://docs.prefect.io/api/latest/core/task.html#task-2

Jay Sundaram

04/08/2021, 3:28 PM

the shell task will execute a command-line executable, so nothing is really returned

Dylan

04/08/2021, 3:28 PM

Check out set_upstream

Dylan

04/08/2021, 3:28 PM

methods

Dylan

04/08/2021, 3:28 PM

In that doc
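
The two kinds of dependency mentioned above (passing a result along, or declaring an upstream explicitly) both end up as edges in a dependency graph. Here is a rough sketch of that idea using Python's standard-library graphlib (Python 3.9+) rather than Prefect itself. The node names are borrowed from this thread for illustration, and the dictionary is an assumed toy model, not Prefect's internal representation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Toy dependency graph: each key maps to the set of nodes it must wait for.
# This is an illustrative model only, not how Prefect stores its flows.
graph = {
    "create_cmd": {"destination_bucket_name"},  # data edge: consumes the parameter value
    "shell_task": {"create_cmd"},               # data edge: consumes the command string
    "something": {"shell_task"},                # state-only edge: no data is passed
}

# A valid execution order must respect every edge, whether or not data flows along it.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The point of the sketch is that an edge with no data attached still constrains ordering, which is why a task that returns nothing can still be waited on.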

Jay Sundaram

04/08/2021, 3:32 PM

So something like this it looks like. Perhaps?

@task()
def something():
    print("something")

my_shell_task = ShellTask()

with Flow('myflow') as flow:
    destination_bucket_name = Parameter("destination_bucket_name", DEFAULT_DESTINATION_BUCKET_NAME)
    cmd = f"myexecutable --destination_bucket_name {destination_bucket_name}"
    my_shell_task(command=cmd)
    something(upstream_tasks=[my_shell_task])

Kevin Kho

04/08/2021, 3:34 PM

I don’t think this will work because the Parameter is not defined when you define cmd. This will work if you wrap cmd as a task. Example below:

Kevin Kho

04/08/2021, 3:35 PM

@task
def create_cmd(destination_bucket_name):
    return f"myexecutable --destination_bucket_name {destination_bucket_name}"

Kevin Kho

04/08/2021, 3:36 PM

then do cmd = create_cmd(destination_bucket_name). But yes to the upstream task dependency being set there

Kevin Kho

04/08/2021, 3:37 PM

Actually sorry. The way to do upstream tasks is like this:

something()
something.set_upstream(my_shell_task)

Jay Sundaram

04/08/2021, 3:40 PM

and if something returns an object:

something()
myobject = something.set_upstream(my_shell_task)

is this correct?

Kevin Kho

04/08/2021, 3:42 PM

i think you want myobject to get something from something() so i think this should be

myobject = something()
myobject.set_upstream(my_shell_task)

Kevin Kho

04/08/2021, 3:44 PM

I think this is a good snippet to illustrate. Check the code block

Jay Sundaram

04/08/2021, 3:49 PM

NameError: name 'myobject' is not defined

🤔

Jay Sundaram

04/08/2021, 3:50 PM

also:

TypeError: run() missing required argument: 'command'

Kevin Kho

04/08/2021, 3:50 PM

Let me make an example for you

Kevin Kho

04/08/2021, 3:55 PM

from prefect import Flow, Parameter, task
from prefect.tasks.shell import ShellTask

shell_task = ShellTask(
    helper_script="",
    shell="bash",
    log_stderr=True,
    return_all=True,
    stream_output=True,
)

@task
def get_command(param):
    return f"echo '{param}'"

@task
def other_task():
    return 2

with Flow(name="Test") as flow:
    param = Parameter('param')
    cmd = get_command(param)
    test = shell_task(command=cmd)

    something = other_task()
    something.set_upstream(test)

flow.run(parameters=dict(param='a'))

Kevin Kho

04/08/2021, 3:56 PM

I think you weren’t passing the command to the shell task during runtime

Jay Sundaram

04/08/2021, 4:03 PM

The mistake was I had set the .set_upstream() parameter to shell_task instead of shell_task's return context(?). What is "test" in this case?

Kevin Kho

04/08/2021, 4:05 PM

Something to hold the result of shell_task but let me find someone who can give a better answer

Kevin Kho

04/08/2021, 4:06 PM

If you’re not setting an upstream to the shell task, you don’t need test in the code snippet

Jay Sundaram

04/08/2021, 4:07 PM

Hmm. Don't understand that statement. I want to make sure that downstream tasks wait for the shell command to finish execution.

Kevin Kho

04/08/2021, 4:10 PM

Sorry was unclear. Think of test as something to hold the state of the shell_task. We are setting test as an upstream dependency, meaning we want to see the STATE of shell_task be SUCCESS before other_task runs. If we don’t have any dependency requirements, we don’t need to store the STATE of shell_task into the test variable because we don’t need to track it. Does this help?

Jay Sundaram

04/08/2021, 4:12 PM

hmm- so, what do you pass as parameter to something.set_upstream() to ensure that something waits for shell_task to complete?

Jay Sundaram

04/08/2021, 4:14 PM

also, this:

logger.info(f"command is: '{cmd}'")

is yielding this in the UI log:

command is: '<Task: create_cmd>'

Kevin Kho

04/08/2021, 4:17 PM

You want the logger to happen inside a task. Doc. This won’t work because Task hasn’t run yet.

👍 1

Kevin Kho

04/08/2021, 4:18 PM

On your other question, I’ll get someone else to respond

Dylan

04/08/2021, 4:20 PM

Hey @Jay Sundaram! It would really help us out if you can show us longer code snippets when you’re referencing specific log outputs, like here: https://prefect-community.slack.com/archives/CL09KU1K7/p1617898495246400?thread_ts=1617894495.235500&cid=CL09KU1K7

Dylan

04/08/2021, 4:21 PM

Also, we have dedicated professional services options if you need some help getting up and running with Prefect at your organization. Email us at sales@prefect.io for more details

Dylan

04/08/2021, 4:22 PM

regarding https://prefect-community.slack.com/archives/CL09KU1K7/p1617898358246100?thread_ts=1617894495.235500&cid=CL09KU1K7

Dylan

04/08/2021, 4:22 PM

Have you tried running Kevin’s example? https://prefect-community.slack.com/archives/CL09KU1K7/p1617897308244200?thread_ts=1617894495.235500&cid=CL09KU1K7

Jay Sundaram

04/08/2021, 4:23 PM

yup at the time he posted it. thanks

Jay Sundaram

04/08/2021, 4:23 PM

yup i've pinged a team member about the support services

👍 1

Dylan

04/08/2021, 4:24 PM

did passing test from the example ensure that the shell task ran before the other_task?

Jay Sundaram

04/08/2021, 4:28 PM

yeah. the recommendation was to remove test, so i was just asking: if we remove assigning the invocation of shell_task() to test, what would we pass to set_upstream()? seems like the assignment test = shell_task() should be preserved in the code so that there is something to pass to set_upstream()

Kevin Kho

04/08/2021, 4:31 PM

test = shell_task() holds both the STATE and the RESULT. Setting upstream dependencies like that uses the STATE, but not the RESULT, so we still need test to hold the STATE

Kevin Kho

04/08/2021, 4:31 PM

Even if it doesn’t return anything (Result is None)

👍 1
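
The STATE versus RESULT point can be made concrete with a small toy model. This is only a sketch under stated assumptions: ToyState and run_task are hypothetical helpers written for this illustration and are not Prefect classes or functions. It shows a downstream step being gated on whether its upstream succeeded, regardless of the upstream result being None.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToyState:
    # Hypothetical stand-in for a task's outcome; not Prefect's State class.
    successful: bool
    result: Any = None


def run_task(fn: Callable[[], Any], upstream: Optional[ToyState] = None) -> ToyState:
    # Do not run the task unless the upstream (if any) finished successfully.
    if upstream is not None and not upstream.successful:
        return ToyState(successful=False)
    try:
        return ToyState(successful=True, result=fn())
    except Exception:
        return ToyState(successful=False)


def shell_like() -> None:
    # Stands in for a shell command: does its work and returns nothing.
    return None


def failing_shell() -> None:
    raise RuntimeError("command exited with a non-zero status")


def other_task() -> int:
    return 2


ok = run_task(shell_like)
downstream_after_ok = run_task(other_task, upstream=ok)

bad = run_task(failing_shell)
downstream_after_bad = run_task(other_task, upstream=bad)

print(ok.result, downstream_after_ok.result)
print(downstream_after_bad.successful)
```

In this model the downstream step runs after the successful upstream even though that upstream's result is None, and it does not run after the failed one. Only the state is consulted, which mirrors the explanation given above for why test is still needed.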
