    Zviri

    2 years ago
    Hey everyone, I noticed very high memory consumption when using mapped tasks with the CloudTaskRunner, but not with the plain TaskRunner (using a Dask deployment). During the "mapping" procedure, the worker doing the actual mapping kept using more and more memory. That seemed reasonable, since mapping involves copying the mapped task. However, memory consumption during this step is much higher with the CloudTaskRunner. To be specific, mapping over a list of only about 8000 elements used more than 4 GB of memory on the worker. I did some debugging and found that the same mapped task has a serialized size of 15 200 bytes with the TaskRunner, but 122 648 bytes with the CloudTaskRunner. That is roughly an 8-fold increase, which makes mapping pretty much unusable for me. The increased size ultimately comes from pickling this function: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/engine/task_runner.py#L788 and I think the serialized size of the CloudTaskRunner class is the cause of the difference. Is this behavior known, or is it worth a bug report? I will stick to the plain TaskRunner for now, which unfortunately prevents me from using the cloud UI, which I really like. It would be great if this could be fixed. I'm using the latest Prefect (v0.10.7).
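
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using only the standard library `pickle`. It does not touch Prefect at all: `lean_payload` and `heavy_payload` are hypothetical stand-ins for whatever state travels with each mapped task, and `pickled_size` and `total_bytes` are illustrative helpers, not Prefect or Dask functions. The second half simply does the arithmetic on the two sizes reported above, assuming one serialized copy is held per mapped element.

```python
import pickle


def pickled_size(obj):
    """Return the number of bytes pickle needs to serialize obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def total_bytes(per_task_bytes, n_mapped):
    """Estimate memory if one serialized copy is kept per mapped element."""
    return per_task_bytes * n_mapped


# Hypothetical payloads: the "heavy" one drags extra state along with it.
lean_payload = {"task": "my_task", "index": 0}
heavy_payload = dict(
    lean_payload,
    runner_state={f"key_{i}": "x" * 50 for i in range(500)},
)

lean_size = pickled_size(lean_payload)
heavy_size = pickled_size(heavy_payload)
print(f"lean payload:  {lean_size} bytes")
print(f"heavy payload: {heavy_size} bytes")

# Sizes reported in this thread for the same mapped task.
N_MAPPED = 8000
TASK_RUNNER_BYTES = 15_200
CLOUD_TASK_RUNNER_BYTES = 122_648

ratio = CLOUD_TASK_RUNNER_BYTES / TASK_RUNNER_BYTES
print(f"size ratio: {ratio:.2f}x")
print(f"TaskRunner total:      {total_bytes(TASK_RUNNER_BYTES, N_MAPPED) / 1e6:.1f} MB")
print(f"CloudTaskRunner total: {total_bytes(CLOUD_TASK_RUNNER_BYTES, N_MAPPED) / 1e6:.1f} MB")
```

Under that one-copy-per-element assumption, the serialized copies alone come to about 0.12 GB versus about 0.98 GB, so they would explain a large part of the reported usage but presumably not all of the 4 GB.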
    josh

    2 years ago
    IIRC @Chris White was looking into sizing at some point, so tagging him in this thread for visibility. Also cc @Jim Crist-Harif for the Dask clout.
    Jim Crist-Harif

    2 years ago
    Yeah, there are some inefficiencies here. We could do some things to fix this now, but a lot of these issues will go away with the upcoming mapping refactor that Chris is working on. Once that's done, I plan to revisit the serialized size issue.
    Zviri

    2 years ago
    @Jim Crist-Harif understood, and thank you for the response. When do you plan to release this refactor? I will keep an eye on it and retest once it is out. I think I can manage without the UI for the time being.
    Chris White

    2 years ago
    The refactor is on me; right now I’m focused on our 0.11.0 release (to go out next week), but the minute that work is complete I’m going to switch to the mapping refactor.
    Zviri

    2 years ago
    Cool, I will keep an eye on the updates then and wait patiently 😃