    Ken Nguyen
    6 months ago
    If I have a flow like:
    with Flow("flow", run_config=RUN_CONFIG, storage=STORAGE) as flow:
      json_data = get_json_data(
        url, query, headers,
        task_args={"name": "Getting Flow Data"}
      )
    How can I then access json_data as a python object, rather than a FunctionTask object?
    Kevin Kho
    6 months ago
    You have to pass it into a task and access it inside that task
    Ken Nguyen
    6 months ago
    Could you clarify further? Let’s say the output of get_json_data() is a dataframe and I wanted to:
    1. get the first row of the df as a variable
       first_row = json_data[0,]
    2. pass the dataframe into the next task
       processed_df = process_dataframe(json_data, task_args={"name": "Processing df"})
    Kevin Kho
    6 months ago
    Yeah so number 1 needs to happen inside a task because json_data is of type Task until execution time of the flow. The json_data[0,] syntax accesses the Task because it runs during build time, not run time. So you need to put this logic inside a Task so that it will defer to run time instead of build time. Does that make sense?
    The second one should work though right?
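
The build time versus run time distinction above can be illustrated with a small standard-library sketch. This is a toy model only, not Prefect's implementation: the `Deferred` class, the `task` decorator, and `first_row` are hypothetical names invented for the illustration, and the sample data is made up. It shows why indexing the returned object inside the flow block touches a placeholder, while the same indexing inside a task sees the real data.

```python
# Toy model of deferred execution; NOT how Prefect is implemented.
class Deferred:
    """Placeholder returned at build time; stores a function and its inputs."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args

    def run(self):
        # Resolve upstream placeholders first, then call the real function.
        resolved = [a.run() if isinstance(a, Deferred) else a for a in self.args]
        return self.fn(*resolved)


def task(fn):
    """Make calling fn(...) return a Deferred instead of executing fn."""
    def build(*args):
        return Deferred(fn, *args)
    return build


@task
def get_json_data():
    # Made-up sample data standing in for the real response.
    return [{"id": 1}, {"id": 2}]


@task
def first_row(rows):
    # This body executes at run time, so rows is the real list here.
    return rows[0]


# "Build time": nothing has executed yet; both names are placeholders.
json_data = get_json_data()
row = first_row(json_data)

# "Run time": the chain executes and yields real values.
result = row.run()
```

Here `json_data` is a `Deferred` until `run()` is called, which mirrors why the row lookup has to live inside a task.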
    Ken Nguyen
    6 months ago
    Ah I see! I shall test it out, thanks for the quick support
    Yep it now works, thank you for your help and explanation
    I do have a follow up question actually, I tried to display json_data in the logs by doing
    logger.info(json_data)
    However, I don’t actually see anything get printed in the logs. Could you tell me why that’s happening?
    Do I need to make a task that does logging and use that task to print the output instead?
    Kevin Kho
    6 months ago
    Yeah exactly. Logs in the flow block are executed during build time, not run time
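
The same point can be sketched for logging with only the standard library. This is again a toy illustration under stated assumptions, not Prefect code: `Placeholder`, `ListHandler`, `log_data`, and the sample payload are hypothetical. A log call made while the flow is being built only ever sees the placeholder, whereas a log call inside a task-like function sees the data. In Prefect 1.x itself, the logger used inside a task is typically obtained with `prefect.context.get("logger")`.

```python
import logging

records = []  # captured log messages, in emission order


class ListHandler(logging.Handler):
    """Collect formatted messages so the order of events can be inspected."""

    def emit(self, record):
        records.append(record.getMessage())


logger = logging.getLogger("toy_flow")
logger.setLevel(logging.INFO)
logger.addHandler(ListHandler())
logger.propagate = False  # keep the demo output out of the root logger


class Placeholder:
    """Stand-in for the task object that exists at build time."""

    def __init__(self, fn):
        self.fn = fn

    def __repr__(self):
        return "<Placeholder get_json_data>"


def get_json_data():
    # Made-up sample payload.
    return {"rows": 2}


def log_data(data):
    # Task-like function: runs at run time, so it logs the real data.
    logger.info("run time: %s", data)
    return data


# Build time: the log call fires immediately and sees only the placeholder.
json_data = Placeholder(get_json_data)
logger.info("build time: %s", json_data)

# Run time: the data exists, and the logging task receives it.
log_data(json_data.fn())
```

After this runs, `records` holds the build-time message about the placeholder first, followed by the run-time message with the payload.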