Akash Rai

    1 year ago
    Hey guys. I'm pretty new to ETL, so forgive my inexperience with the topic. I need orchestration for one of my projects, with the following use cases:
    - The tasks are shell scripts.
    - Each shell script passes a file path to its descendant in the DAG so the flow run can proceed.
    - If task A fails, instead of restarting the whole run, I want to pass the file path to task A dynamically through the UI or CLI and continue execution, reusing the previous outputs of A's ancestors.
    - I'd like a sub-DAG that logically groups my tasks. This would also help me group the tasks in the UI.
    One question I have regarding states: are state changes atomic operations? Any help is highly appreciated.
    Chris White

    1 year ago
    Hi Akash, this use case sounds great! A few notes:
    • You can subclass Prefect's ShellTask to alter its return value; that might save you some time.
    • In terms of failures and restarts, as long as you have a Result configured on your Flow, Prefect will take care of rehydrating the data between your tasks without any extra effort on your end.
    • Prefect does not currently have a mechanism for grouping tasks within a Flow in the UI, but it is something that I expect we will implement.
    I don't think I understand your question about state changes. Could you elaborate?
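To make the restart idea above concrete, here is a minimal, library-free sketch of what "rehydrating" persisted outputs could look like. It is not Prefect's implementation and does not use Prefect's API; `run_shell`, `run_pipeline`, the JSON store, and the `{prev}` placeholder are all hypothetical names chosen for illustration. It assumes a POSIX `sh` is available and that each script prints the file path it produced as the last line of stdout.

```python
import json
import subprocess
from pathlib import Path


def run_shell(command: str) -> str:
    """Run a shell command and return the last line of its stdout (assumed to be a file path)."""
    done = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    lines = done.stdout.strip().splitlines()
    return lines[-1] if lines else ""


def run_pipeline(steps, store: Path, overrides=None):
    """Run steps in order, reusing any output already persisted in `store`.

    steps: list of (name, command_template); "{prev}" is replaced by the upstream output.
    overrides: optional {name: path} supplied by hand when restarting.
    Returns (all known outputs, names of the steps that actually ran this time).
    """
    saved = json.loads(store.read_text()) if store.exists() else {}
    saved.update(overrides or {})
    executed = []
    prev = ""
    for name, template in steps:
        if name not in saved:
            # Only steps without a persisted output are executed.
            saved[name] = run_shell(template.format(prev=prev))
            executed.append(name)
            # Persist after every success so a later failure keeps earlier outputs.
            store.write_text(json.dumps(saved))
        prev = saved[name]
    return saved, executed
```

If a step raises `subprocess.CalledProcessError`, the outputs of its ancestors are already in the store, so calling `run_pipeline` again with a corrected command (or with an `overrides` entry for a path supplied by hand) skips the ancestors and resumes at the failed step.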
    Akash Rai

    1 year ago
    Hey Chris. Thanks for the suggestions. In terms of state, I was referring to the classical dirty read problem. But that won't affect my use case, I believe.
    Chris White

    1 year ago
    Gotcha. Prefect Cloud does offer a feature that we call “version locking” that essentially prevents the dirty read problem in the case where a task might accidentally run twice or incorrectly run simultaneously.
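For readers unfamiliar with the idea, the sketch below shows one generic way a version check can reject a state change that is based on a stale read. It is an illustration of optimistic concurrency only, not a description of how Prefect Cloud implements version locking; the `VersionedState` class and its methods are hypothetical.

```python
import threading


class VersionedState:
    """Holds a task state plus a version number that must match on every write."""

    def __init__(self, state: str = "Pending"):
        self._lock = threading.Lock()
        self.state = state
        self.version = 0

    def read(self):
        """Return the current (state, version) pair."""
        with self._lock:
            return self.state, self.version

    def set_state(self, new_state: str, expected_version: int) -> bool:
        """Apply the change only if nobody has written since `expected_version`."""
        with self._lock:
            if expected_version != self.version:
                return False  # stale read: reject the write
            self.state = new_state
            self.version += 1
            return True
```

If two runners both read version 0 and both try to move the task to Running, only the first `set_state` call succeeds; the second is rejected because the version has already advanced, so the task is not started twice.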