    Alexis Lucido

    7 months ago
    Hi all! I am running a ShellTask to kill zombie processes generated by one of my other tasks. The flow was functioning properly until 2 weeks ago, but I cannot figure out why it stopped working. Here is the associated flow:
    kill_geckodriver_task = ShellTask(
        log_stderr=True, return_all=True, stream_output=True)
    with Flow('kill_geckodriver', schedule=schedules_prefect[
            'kill_geckodriver']) as kill_geckodriver:
        kill_geckodriver_task(command='source {}'.format(os.path.join(
            os.environ.get('BASH_SCRIPTS_FOLDER'), 'kill_geckodriver.sh')))
    The bash script is below:
    pkill geckodriver
    pkill firefox
    I can run the flow when the bash script only echoes a string, so the bug is not due to the flow or the path passed with an environment variable. I guess the problem lies in the sudo rights needed to run the "pkill" command. I have been trying to replace the current script with the following lines (replacing <password> with the user password), but with no success so far:

    export HISTIGNORE='*sudo -S*' # to be added in production to avoid logging passwords

    echo "<password>" | sudo -S pkill geckodriver
    echo "<password>" | sudo -S pkill firefox
    Unfortunately, the flow still raises an error, and I cannot figure out why. I have been trying to log it with the "log_stderr=True, return_all=True, stream_output=True" kwargs of the ShellTask, but the only logs I have are attached as a screenshot. Any thoughts? The problem is probably password-related, but I cannot seem to find an appropriate solution. Thanks a lot in advance!
    Kevin Kho

    7 months ago
    Hi @Alexis Lucido, I am not sure but looking at the flow, the
    source {}.format
    to make the command is a bit concerning because this gets evaluated during Flow registration, not during runtime. I don’t know if that’s what you intended, but it would be better if you had a task that returned this instead to defer the execution to runtime. You are right that exit code 1 sounds like a permission issue. Your attempts to get more logs also look right. I don’t know why you aren’t getting more. There is a way to get the whole traceback to show in the logs. I can give an example snippet, but in this case I don’t think it’ll generate anything:
    from functools import partial, wraps
    import traceback

    from prefect import task

    def custom_task(func=None, **task_init_kwargs):
        if func is None:
            return partial(custom_task, **task_init_kwargs)
    
        @wraps(func)
        def safe_func(**kwargs):
            try:
                return func(**kwargs)
            except Exception as e:
                print(f"Full Traceback: {traceback.format_exc()}")
                raise RuntimeError(type(e)) from None  # from None is necessary to not log the stacktrace
    
        safe_func.__name__ = func.__name__
        return task(safe_func, **task_init_kwargs)
    
    @custom_task
    def abc(x):
        return x
    You could maybe modify the ShellTask like this to add more logging.
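    For the registration-time point, here is a minimal sketch (not your exact code; I’m assuming the same BASH_SCRIPTS_FOLDER variable and script name) of building the command in a task so it is only evaluated at flow-run time:
    import os

    from prefect import Flow, task
    from prefect.tasks.shell import ShellTask

    @task
    def build_kill_command():
        # Read the environment variable on the agent, at run time, not at registration
        folder = os.environ.get('BASH_SCRIPTS_FOLDER')
        return 'source {}'.format(os.path.join(folder, 'kill_geckodriver.sh'))

    kill_geckodriver_task = ShellTask(
        log_stderr=True, return_all=True, stream_output=True)

    with Flow('kill_geckodriver') as kill_geckodriver:
        kill_geckodriver_task(command=build_kill_command())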
    Alexis Lucido

    7 months ago
    You're right, maybe the environment variable is not accessed at run time (though it is available at registration time). I'll pass that as a task first thing in the morning and check the result. If this still fails, I'll look into how to get more logs using your snippet, though I have yet to understand why the logging did not work in the first place. Thanks a lot!
    Ok so there was indeed an error when trying to access the path of the bash file: the environment variable was not read at run time, so os.path.join tried to join a None and a string. That raised an error directly in the flow, hence no logs for the bash execution, as the script did not get that far. Now I can try to execute the bash file with the attached flow, but I got another error ("Syntax error near unexpected token..."). I also attached a screenshot of the bug. Here is the content of my .sh file, which I have slightly modified:
    #!/bin/bash

    pkill geckodriver
    pkill firefox
    Any idea about this? Thanks a lot!
    PS: when my script only contains a single line such as "pkill geckodriver", I get the same error.
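    For reference, a small sketch (hypothetical task name, not the code we run) of failing fast when the variable is missing, instead of letting os.path.join blow up on None:
    import os

    from prefect import task

    @task
    def geckodriver_script_path():
        folder = os.environ.get('BASH_SCRIPTS_FOLDER')
        if folder is None:
            # Raise an explicit error instead of a TypeError from os.path.join(None, ...)
            raise ValueError('BASH_SCRIPTS_FOLDER is not set in the flow-run environment')
        return os.path.join(folder, 'kill_geckodriver.sh')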
    Kevin Kho

    7 months ago
    The latest screenshot still has the
    workflows_compute()…
    . This is computed during build time, not run time, so you might not have it during the Flow run. You need it in a task to defer execution.
    Alexis Lucido

    7 months ago
    Yep, workflows_compute.path_geckodriver() is a task. Sorry if this was not clear to begin with.
    Kevin Kho

    7 months ago
    Can I see the definition of that? You added the task decorator to a method of a class?
    I thought this didn't work.
    Alexis Lucido

    7 months ago
    Sure. This task is now only a wrapper around a function that looks up the environment variable for the bash scripts folder, and the task is contained in another module, workflows_compute.py. I attached the function as well, which is what was previously used directly in the flow. We then have the Flow in maintenance.py -> the Task in workflows_compute.py -> the function in another misc module. I wanted to separate my flows from my tasks. And, having switched from Airflow to Prefect, I realized the importance of having interfaces between my workflows and my basic functions, which should work no matter the workflow manager.
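    Roughly, a sketch of that layering (module and function names are illustrative, pieced together from the description above, not the exact code):
    # misc module, workflow-manager agnostic (illustrative name: process_utils.py)
    import os

    def kill_geckodriver_script_path():
        return os.path.join(os.environ['BASH_SCRIPTS_FOLDER'], 'kill_geckodriver.sh')

    # workflows_compute.py -- thin Prefect task wrapper around the helper
    from prefect import task
    # from process_utils import kill_geckodriver_script_path

    @task
    def path_geckodriver():
        return 'source {}'.format(kill_geckodriver_script_path())

    # maintenance.py -- the flow itself
    from prefect import Flow
    from prefect.tasks.shell import ShellTask
    # from workflows_compute import path_geckodriver

    kill_geckodriver_task = ShellTask(
        log_stderr=True, return_all=True, stream_output=True)

    with Flow('kill_geckodriver') as kill_geckodriver:
        kill_geckodriver_task(command=path_geckodriver())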
    Kevin Kho

    7 months ago
    Ohh I see.
    workflows_compute
    is a module. I understand. Will take a look again at the code.
    The
    source {}.format
    is still a bit concerning in the Flow because I think that is evaluated during build time.
    Alexis Lucido

    7 months ago
    Ooooh maybe that's the reason I get this weird unexpected token "newline" error... Lemme check with the full path, with no {}.format.
    Ok so the .format expression was a problem indeed. I replaced it with the full path. Not as clean, but not that big of a deal either. I have a new error when executing the code. The bash script only contains pkill geckodriver, and here the flow fails. So I replace the line with: echo "<my_password>" | sudo -S pkill geckodriver. And it fails again. I can run the bash script manually, without going through Prefect. Any idea?
    Kevin Kho

    7 months ago
    You can use the format in the Flow if it's in a task, to defer execution. Could you show me the code and the traceback? Wondering what the actual error is.
    Alexis Lucido

    6 months ago
    Hey Kevin, sorry for my late answer. I have been dealing with other matters. I have reduced the code to its simplest expression (not sourcing an external script), and this is the flow:
    kill_geckodriver_task = ShellTask(
        log_stderr=True, return_all=True, stream_output=True)
    with Flow('kill_geckodriver', schedule=schedules_prefect[
            'kill_geckodriver']) as kill_geckodriver:
        kill_geckodriver_task(command='pkill geckodriver; pkill firefox')
    Here is the traceback:
    Looking up flow metadata... Done
    Creating run for flow 'kill_geckodriver'... Done
    └── Name: gabby-skua
    └── UUID: 54bd621e-a8c0-409f-a8f6-f6d19c34890b
    └── Labels: ['agentless-run-13ef9b7f']
    └── Parameters: {}
    └── Context: {}
    └── URL: <http://localhost:8080/default/flow-run/54bd621e-a8c0-409f-a8f6-f6d19c34890b>
    Executing flow run...
    └── 11:55:03 | INFO    | Creating subprocess to execute flow run...
    └── 11:55:03 | INFO    | Beginning Flow run for 'kill_geckodriver'
    └── 11:55:03 | INFO    | Task 'ShellTask': Starting task run...
    └── 11:55:03 | ERROR   | Command failed with exit code 1
    └── 11:55:03 | INFO    | FAIL signal raised: FAIL('Command failed with exit code 1')
    └── 11:55:04 | INFO    | Task 'ShellTask': Finished task run for task with final state: 'Failed'
    └── 11:55:04 | INFO    | Flow run FAILED: some reference tasks failed.
    Flow run failed!
    These commands raise an exception only when run through Prefect. I can run dummy stuff like "echo 1" with a ShellTask, though.
    Kevin Kho

    6 months ago
    It looks like exit code 1 with pkill means no process matched, based on this?
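    A quick way to check that locally (just a sketch with subprocess; pkill documents 0 = at least one process matched, 1 = no match, 2 = syntax error, 3 = fatal error):
    import subprocess

    # pkill exits with 1 when nothing matches the pattern, which ShellTask reports as a failure
    result = subprocess.run(['pkill', 'some-process-that-is-not-running'])
    print(result.returncode)  # expect 1 here; 0 if a process was actually killed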
    Alexis Lucido

    6 months ago
    YES! That was it: no process was found, so an exit code of 1 was returned, while we wanted to treat that case as a success. Here is our new workflow. First, the flow:
    kill_process_task = ShellTask(
        log_stderr=True, return_all=True, stream_output=True)
    with Flow('kill_geckodriver', schedule=schedules_prefect[
            'kill_geckodriver']) as kill_geckodriver:
        kill_process_task(command='source bash/kill_geckodriver.sh')
        kill_process_task(command='source bash/kill_firefox.sh')
    Then, the scripts (I only show one of the two, but they are identical except for the process killed):
    #!/bin/bash
    
    pkill geckodriver
    pkillexitstatus=$?
    
    if [ "$pkillexitstatus" -eq "0" ]; then
        echo "One or more processes matched the criteria and have been killed. Operation successful."
        return 0
    elif [ "$pkillexitstatus" -eq "1" ]; then
        echo "No processes matched. Operation successful."
        return 0
    elif [ "$pkillexitstatus" -eq "2" ]; then
        echo "Syntax error in the command line. Failure."
        return 1
    elif [ "$pkillexitstatus" -eq "3" ]; then
        echo "Fatal error. Failure."
        return 1
    else
        echo "UNEXPECTED. Failure."
        return 1
    fi
    In the end, that was a Linux issue... We had trouble with geckodriver processes not being correctly killed by our Python tasks, so we added this maintenance flow, and we'd rather keep it as a safeguard. Thanks a lot, Kevin!
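    An alternative (just a sketch, not what we run) would be to skip the wrapper scripts and handle the exit code directly in a Python task:
    import subprocess

    from prefect import task

    @task
    def pkill_ignore_no_match(pattern):
        # pkill exit codes: 0 = matched, 1 = no match (fine for us), 2/3 = real errors
        result = subprocess.run(['pkill', pattern])
        if result.returncode > 1:
            raise RuntimeError('pkill {} failed with exit code {}'.format(
                pattern, result.returncode))
        return result.returncode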
    Kevin Kho

    6 months ago
    Nice!