  • Ben

    1 year ago
    I think I want to run different tasks with slightly different Kubernetes templates in one run, e.g. if machine resources could vary between tasks (GPU or not) or each task has a different entrypoint in the Kubernetes template. Is using dependent flows the way to go there? I assume each dependent flow could have its own run_config then. Is that use case normal, or am I thinking about it wrong? Just to double-check that I'm not attempting anything totally unidiomatic here.
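    (For illustration, a rough sketch of the pattern being described, assuming Prefect's KubernetesRun run config and the StartFlowRun task from the task library; the flow names, images, and job template path below are made up, not from the original thread.)
    from prefect import Flow
    from prefect.run_configs import KubernetesRun
    from prefect.tasks.prefect import StartFlowRun

    # Each child flow carries its own run config, so resources and job specs can differ.
    gpu_flow = Flow("train-model")  # tasks omitted
    gpu_flow.run_config = KubernetesRun(
        image="registry.example.com/gpu-image:latest",
        job_template_path="s3://my-bucket/gpu-job-template.yaml",  # hypothetical template that requests a GPU
    )

    cpu_flow = Flow("prepare-data")  # tasks omitted
    cpu_flow.run_config = KubernetesRun(image="registry.example.com/cpu-image:latest")

    # A parent flow kicks off the registered child flows as dependent runs.
    with Flow("orchestrator") as parent:
        prep = StartFlowRun(flow_name="prepare-data", project_name="Development", wait=True)()
        train = StartFlowRun(flow_name="train-model", project_name="Development", wait=True)()
        train.set_upstream(prep)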
  • matta

    1 year ago
    Heya! Question from a co-worker: Let's say I wanted to run multiple dbt commands through the dbt task, but will not know the commands until runtime (they will be received as a parameter). What is the recommended way of looping tasks? Would something like this be ok?
  • matta

    1 year ago
    dbt_task = DbtShellTask(
      ...
    )
    with Flow("dbt-flow") as flow:
        dbt_command_string = Parameter('dbt_command_string', default = 'dbt run', required = True)
        dbt_commands = dbt_command_string.split(',')
        for command in dbt_commands:
            dbt_task(
                command=command,
                ...
            )
  • matta

    1 year ago
    It also needs to be done sequentially. Is there an argument I can pass to map that forces it to run serially?
  • matta

    1 year ago
    In the dbt Slack he was told not to just loop it and to use mapping instead.
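    (For illustration, a rough sketch of the mapped version, assuming the comma-separated parameter is split inside a task, since a Parameter's value only exists at runtime; the dbt task configuration is elided as in the snippet above. With the default LocalExecutor the mapped children run one at a time, while a Dask-based executor would run them in parallel.)
    from prefect import Flow, Parameter, task
    from prefect.tasks.dbt import DbtShellTask

    dbt_task = DbtShellTask()  # project/profile configuration omitted, as in the snippet above

    @task
    def split_commands(command_string):
        # Parameters resolve at runtime, so the split has to happen inside a task.
        return [c.strip() for c in command_string.split(",")]

    with Flow("dbt-flow") as flow:
        dbt_command_string = Parameter("dbt_command_string", default="dbt run")
        dbt_commands = split_commands(dbt_command_string)
        # One mapped child per dbt command, instead of a Python-level loop at build time.
        dbt_task.map(command=dbt_commands)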
  • matta

    1 year ago
    Thanks!
  • John Grubb

    1 year ago
    Hey team, I have two pipelines that I'm running using the Cloud for orchestration. More often than I'd like I wake up the next morning to find that one of them has failed with
    Unexpected error while running flow: KeyError('Task slug foo-3 not found in the current Flow; this is usually caused by changing the Flow without reregistering it with the Prefect API.')
    It's always the same task for both pipelines - I've split out a common task to load data to BigQuery into another file and this is the task that fails. It will run fine for a few days and then randomly decide that it can't find this task, even though the exact same task is called two other times during each pipeline, foo-1 and foo-2. I'm wondering if I'm the first person to have their install randomly lose tasks...
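    (Not a confirmed fix for the intermittent error above, but for illustration: a sketch of pinning explicit slugs when the same task is reused, so the registered identifiers stay stable instead of relying on auto-generated ones like foo-1/foo-2/foo-3. The task, flow, and slug names are made up, and this assumes Prefect 1.x's task_args keyword for overriding attributes on a task copy.)
    from prefect import Flow, task

    @task
    def load_to_bigquery(table):
        ...  # shared load logic lives in its own module in the real pipelines

    with Flow("pipeline-a") as flow:
        # Explicit slugs keep the task identifiers stable across re-registrations.
        load_to_bigquery("staging", task_args={"slug": "load-staging"})
        load_to_bigquery("marts", task_args={"slug": "load-marts"})
        load_to_bigquery("final", task_args={"slug": "load-final"})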
  • josh

    1 year ago
    Hey team, Prefect version 0.13.18 has been released and here are a few notable changes:
    ⁉️ Formatted GraphQL errors are now raised by the Client
    👟 Refactored a few Client API calls for performance
    📚 Various enhancements and fixes to tasks in the Task Library
    A big thank you to our contributors who helped out with this release! Full changelog: https://github.com/PrefectHQ/prefect/releases/tag/0.13.18
  • Sam Luen-English

    1 year ago
    Hi there, I am trying to deploy a flow and I am running into some issues... I have some complex dependencies and so want to use a custom base Docker image. I've tried to create a simple example illustrating what I want to do.
    - In this example, I have a base Docker image which has sqlalchemy installed (this is a placeholder for my own library).
    - I always develop inside Docker containers and don't want to install this library locally. This is also useful so I can easily deploy in CI.
    I've tried two approaches:
    - Running the main.py script on my host (but this obviously fails due to an import error).
    - Using the base image to run main.py, mounting the Docker socket. This fails with the message "ValueError: Your docker image failed to build! Your flow might have failed one of its deployment health checks - please ensure that all necessary files and dependencies have been included", even though in the output I can see that all the health checks pass.
    A few other thoughts I had:
    - Perhaps I could just move the imports inside the task functions, but this doesn't seem nice to me.
    - Perhaps I should use a docker task instead, but this seems overkill and I believe this is the exact use case of the base image.
    Dockerfile:
    FROM python:3.8
    RUN python -m pip install sqlalchemy prefect
    main.py
    from prefect import Flow, task
    from prefect.environments.storage import Docker
    import sqlalchemy
    
    
    @task
    def some_task():
        print(sqlalchemy.__version__)
    
    
    with Flow("Some Flow") as flow:
        some_task()
    
    if __name__ == "__main__":
        flow.storage = Docker(
            registry_url="docker.io",
            image_name="storage_image",
            image_tag="latest",
            base_image="test-base",
            local_image=True,
        )
        flow.register(project_name="Development")
    Test script:
    #!/usr/bin/env bash
    set -x
    
    docker build . -t test-base
    
    #: Approach 1
    python main.py
    
    #: Approach 2
    docker run -v /var/run/docker.sock:/var/run/docker.sock -v ~/.docker:/root/.docker -e PREFECT__CLOUD__AUTH_TOKEN="$PREFECT__CLOUD__AUTH_TOKEN" -v $(pwd)/main.py:/main.py test-base bash -c 'python /main.py'
    Please could you guide me with the best approach? Many thanks!
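    (For reference, a minimal sketch of the first workaround mentioned above: deferring the heavy import until the task actually runs, so the flow file can be registered from a host that doesn't have sqlalchemy installed. Whether that's an acceptable approach is exactly the open question here.)
    from prefect import Flow, task

    @task
    def some_task():
        # Imported here so registration doesn't need sqlalchemy locally;
        # the Docker storage image still needs it installed at run time.
        import sqlalchemy
        print(sqlalchemy.__version__)

    with Flow("Some Flow") as flow:
        some_task()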