Jessica Smith

    3 months ago
    Curious if there is a best practice on something I'm struggling with. I have a task that creates a SQLAlchemy engine from a parameter, and this is passed on to the next set of tasks. However, you can't pickle a SQLAlchemy engine, which means I have checkpointing turned off for the GetEngine task. This means that if one of the queries fails and retries, it will not have the input it needs. Do I need to just create the engine inside the task? Or is there another way people have used?
    Is there a way I can have the GetEngine task marked to be re-run before the query task retries?
    Anna Geller

    3 months ago
    I understand what you mean - it's considered best practice to define the SQLAlchemy engine within a task, but you can also pass the connection string as a parameter value or as a Secret to downstream tasks
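The underlying issue can be reproduced without Prefect or SQLAlchemy at all. A minimal sketch, using a stdlib sqlite3 connection as a stand-in for a SQLAlchemy engine (the stand-in is an assumption for illustration; an actual engine fails to pickle for the same reason):

```python
import pickle
import sqlite3

# A live database handle (stand-in for a SQLAlchemy engine) holds
# sockets/file descriptors and cannot be pickled, so a task result
# like this cannot be checkpointed.
conn = sqlite3.connect(":memory:")
try:
    pickle.dumps(conn)
    picklable = True
except (TypeError, pickle.PicklingError):
    picklable = False
print(picklable)  # False

# A connection string is a plain str, which pickles cleanly, so it is
# safe to checkpoint and hand to downstream tasks.
conn_string = "sqlite:///:memory:"
restored = pickle.loads(pickle.dumps(conn_string))
print(restored == conn_string)  # True
```

This is why passing the connection string (or a Secret) downstream works where passing the engine does not: a retried task can rebuild the engine from the checkpointed string.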
    Kevin Kho

    3 months ago
    You can’t retry an upstream task based on the result/state of a downstream task in 1.0
    Jessica Smith

    3 months ago
    Ah, I could replace the GetEngine task with a GetConnectionString task that would do the same thing and be picklable
    Thanks for the idea, I'll go with that!
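The pattern being described can be sketched as plain functions, leaving out the Prefect task decorators and again using stdlib sqlite3 as a stand-in for SQLAlchemy (the function names GetConnectionString/run_query here are illustrative, not from any library):

```python
import sqlite3

def get_connection_string(db_path: str) -> str:
    # Replaces the old GetEngine task: returns only a plain string,
    # which pickles cleanly, so checkpointing this result is safe and
    # a retried downstream task still has the input it needs.
    return db_path

def run_query(conn_string: str, sql: str):
    # The engine/connection is built *inside* the task, so a retry can
    # recreate it from the checkpointed string instead of needing an
    # unpicklable object from upstream.
    conn = sqlite3.connect(conn_string)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

conn_string = get_connection_string(":memory:")
rows = run_query(conn_string, "SELECT 1 + 1")
print(rows)  # [(2,)]
```

The design trade-off: each task run pays the cost of reopening a connection, but every task's inputs are now checkpointable, so retries work without re-running the upstream task.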
    Stéphan Taljaard

    2 months ago
    Hey @Jessica Smith. I also had a similar use case in the past. I wanted to do Prefect .map() task runs on a list of len > 100, each using the SQLAlchemy engine. That can be flaky, though: if the engine breaks, many mapped task runs can't auto-recover. Curious what your thoughts are on such a use case?
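One way to make that mapped pattern resilient is to have each mapped run open its own short-lived connection from the picklable string, rather than sharing one engine. A minimal sketch without Prefect, using a list comprehension in place of .map() and sqlite3 in place of SQLAlchemy (both stand-ins are assumptions):

```python
import sqlite3

def query_one(conn_string: str, item: int) -> int:
    # Each mapped run builds and closes its own connection, so a broken
    # handle only affects that one run, and a retry of that run can
    # rebuild it from the checkpointed string.
    conn = sqlite3.connect(conn_string)
    try:
        return conn.execute("SELECT ?", (item,)).fetchone()[0]
    finally:
        conn.close()

conn_string = ":memory:"
# Stand-in for query_one.map(unmapped(conn_string), items) over a long list.
results = [query_one(conn_string, i) for i in range(5)]
print(results)  # [0, 1, 2, 3, 4]
```

For lists of len > 100, per-run connections add overhead, so in practice a server-side pool or pre_ping-style connection check may be worth weighing against the recovery benefit.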
    Jessica Smith

    1 month ago
    Hey @Stéphan Taljaard, I think you sent this message right as I was switching to a job that doesn't use Prefect. I'm back on the Slack now, but I doubt you're still looking for an answer for this. If you are, let me know, happy to give it some thought