• Evan Crook
    1 year ago
    hello all! I'm encountering a weird condition that seems like a bug, or maybe I'm just not understanding it properly. (Minimal example in thread.) Essentially we want to run a pipeline of `extract -> map over might_fail -> map over show`, where any failed mapped `might_fail` instance passes its result to `handle_error`. `might_fail` also includes retries. However, what I'm seeing is: the task `might_fail[3]` starts its task run, hits a `RuntimeError`, enters a `Retrying` state, and then finishes its task run with state `Retrying`. It does not do any retries. Then the rest of the mapped tasks finish and we proceed to `handle_error`; `handle_error[3]` says `"Not all upstream states are finished; ending run."` (because the upstream state is `Retrying`, not `Failed`) and then `Finished task run for task with final state: 'Pending'`. So `handle_error` never actually runs at all. If I take out retries on `might_fail`, it works as expected. But in the real-world example this is mimicking, `might_fail` hits an API prone to rate limits / transient failures, so we actually want to retry it, and then trigger `handle_error` only if it's entered a failed state after retrying a few times. Does this make sense? Is this a bug, or am I just doing something terribly wrong? (Happy to provide logs from running this minimal example, too, if it's helpful.) Thanks so much in advance! 🌟
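    Here's a minimal sketch of the shape I mean (the retry counts and the failure condition are made up for illustration):
    from datetime import timedelta
    from prefect import Flow, task
    from prefect.triggers import any_failed

    @task
    def extract():
        return [1, 2, 3, 4]

    @task(max_retries=2, retry_delay=timedelta(seconds=10))
    def might_fail(x):
        # stand-in for the rate-limited / flaky API call
        if x == 3:
            raise RuntimeError("transient failure")
        return x * 2

    @task
    def show(x):
        print(x)

    @task(trigger=any_failed)
    def handle_error(x):
        # should run only for mapped instances that end up Failed
        print(f"handling failure for {x}")

    with Flow("retry-then-handle") as flow:
        data = extract()
        results = might_fail.map(data)
        show.map(results)
        handle_error.map(results)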
    15 replies
• Sumit Kumar Rai
    1 year ago
    I've set a Prefect Cloud secret, but I get the error shown in the screenshot. The Prefect docs say "Secrets are resolved locally first, falling back to Prefect Cloud (if supported)". What am I missing? I'm using the code below to retrieve the secret.
    from prefect.client import Secret
    
    GITHUB_ACCESS_TOKEN = Secret("GITHUB_ACCESS_TOKEN").get()
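    For what it's worth, my understanding is that local resolution looks in `prefect.context` secrets, so for local runs something like this sketch ("<token>" is a placeholder) should work:
    import prefect
    from prefect.client import Secret

    # sketch: supply the secret through local context for local runs
    with prefect.context(secrets={"GITHUB_ACCESS_TOKEN": "<token>"}):
        GITHUB_ACCESS_TOKEN = Secret("GITHUB_ACCESS_TOKEN").get()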
    10 replies
• g.suijker
    1 year ago
    Hi all! Since today I'm getting the following errors when building the flow's Docker storage; yesterday it all worked fine. Any ideas why I'm getting this error and how to solve it?
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/u/unixodbc/odbcinst_2.3.7_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/u/unixodbc/unixodbc-dev_2.3.7_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/u/unixodbc/odbcinst1debian2_2.3.7_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/u/unixodbc/libodbc1_2.3.7_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/u/unixodbc/unixodbc_2.3.7_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Failed to fetch https://packages.microsoft.com/debian/10/prod/pool/main/m/msodbcsql17/msodbcsql17_17.7.2.1-1_amd64.deb  404  Not Found [IP: 104.214.230.139 443]
    E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
    I'm using a custom Dockerfile as provided here: https://docs.prefect.io/orchestration/recipes/configuring_storage.html
    FROM prefecthq/prefect:0.14.10-python3.8
    
    # install some base utilities
    RUN apt-get update && apt-get install -y build-essential unixodbc-dev && rm -rf /var/lib/apt/lists/*
    RUN apt-get update && apt-get install -y curl
    
    # install mssql-tools
    RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
    RUN curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
    RUN apt-get update && ACCEPT_EULA=Y apt-get install -y msodbcsql17
    RUN ACCEPT_EULA=Y apt-get install -y mssql-tools
    
    # update bash configuration
    RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
    RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
    
    # update OpenSSL configuration file
    RUN sed -i 's/TLSv1\.2/TLSv1.0/g' /etc/ssl/openssl.cnf
    RUN sed -i 's/DEFAULT@SECLEVEL=2/DEFAULT@SECLEVEL=1/g' /etc/ssl/openssl.cnf
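    One possibility I'm wondering about: the 404s could come from a cached `apt-get update` layer whose package lists have gone stale relative to the Microsoft repo, in which case rebuilding without the layer cache might help (the image tag is a placeholder):
    docker build --no-cache -t my-flow-storage .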
    2 replies
• Sumit Kumar Rai
    1 year ago
    I have a `pipelinewise` command as a shell task in a Prefect flow. The command takes config files and a state file as input parameters. The command executes and updates the state file at the end. What is the best way to retain the state file and also read the latest state file before executing the task?
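    Right now I'm imagining wrapping the shell task with download/upload steps, using S3 as the store; a sketch (the bucket, key, and pipelinewise flags are placeholders):
    import boto3
    from prefect import Flow, task
    from prefect.tasks.shell import ShellTask

    BUCKET = "my-bucket"                   # placeholder
    STATE_KEY = "pipelinewise/state.json"  # placeholder
    LOCAL_STATE = "/tmp/state.json"

    @task
    def download_state():
        # fetch the latest state file before the run
        boto3.client("s3").download_file(BUCKET, STATE_KEY, LOCAL_STATE)

    run_pipelinewise = ShellTask(
        command=f"pipelinewise run_tap --tap mytap --target mytarget --state {LOCAL_STATE}"
    )

    @task
    def upload_state():
        # persist the updated state file after the run
        boto3.client("s3").upload_file(LOCAL_STATE, BUCKET, STATE_KEY)

    with Flow("pipelinewise") as flow:
        d = download_state()
        r = run_pipelinewise(upstream_tasks=[d])
        upload_state(upstream_tasks=[r])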
    3 replies
• Florian Kühnlenz
    1 year ago
    Hi everyone, is it possible to set up a flow SLA for a group of flows? In the UI it seems only possible for one flow at a time.
    1 reply
• Stéphan Taljaard
    1 year ago
    Hi. Is there a way to create a Secret dynamically, as a step in a flow? I submitted a PR for SendGrid.SendEmail some time ago. I based it on other tasks in the task library, so I had the secret name passed as an argument instead of the secret value. Now I have a GCP Secret named `shared_credentials` with contents `{"sendgrid-api-key": "abc", "some-other-api-key": "efg", ...}`. Is there a way to read the `sendgrid-api-key` value from `shared_credentials`, then create a temporary secret in the flow named `SENDGRID_API_KEY`, as required by the SendGrid task? Or other ideas to use the value inside of the secret?
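    The only idea I've had so far is registering the value as a Cloud secret from inside the flow, along these lines (a sketch; assumes calling `Client.set_secret` from a task is allowed):
    import json
    from prefect import task
    from prefect.client import Client

    @task
    def register_sendgrid_secret(shared_credentials: str):
        # pull the one key out of the shared secret's JSON payload and
        # store it under the name the SendGrid task expects
        creds = json.loads(shared_credentials)
        Client().set_secret(name="SENDGRID_API_KEY", value=creds["sendgrid-api-key"])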
    4 replies
• Domantas
    1 year ago
    Hello Prefect, I'm getting an error `TypeError: cannot pickle '_thread.lock' object` (I'll paste the full error in the comments) and I'm out of ideas how to properly solve it. It is related to the `DaskExecutor` and appears when running this task (I'll upload a code sample in the comments):
    1. Read file content from a YAML file located in S3 storage (for file reading I'm using a raw `boto3` implementation).
    2. Read bytes from the downloaded YAML file.
    3. Load the YAML file and convert it into a list.
    Does anyone know a solution to this problem?
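    The task is roughly this shape. One suggestion I've seen is to create the `boto3` client inside the task itself, so the client (which holds `_thread.lock` objects) is never pickled and shipped to the Dask workers; a sketch, with placeholder names:
    import boto3
    import yaml
    from prefect import task

    @task
    def read_yaml_from_s3(bucket: str, key: str) -> list:
        # create the client inside the task: boto3 clients hold
        # _thread.lock objects and cannot be pickled to Dask workers
        s3 = boto3.client("s3")
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        return yaml.safe_load(body)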
    4 replies
• Ben Muller
    1 year ago
    Hey prefect peeps, if a flow fails midway (say task 3 of 5), are we able to retry just that task in order to get the flow to succeed, or do we need to retry the entire flow from the beginning?
    1 reply
• ciaran
    1 year ago
    Eyo 👋 I'm trying to get fancy by specifying both `pod_template` and `scheduler_pod_template` in my `dask_kubernetes.KubeCluster` `DaskExecutor`. Both templates (for now) are exactly the same:
    "pod_template": make_pod_spec(
        image=os.environ["BAKERY_IMAGE"],
        labels={"flow": flow_name},
        env={
            "AZURE_STORAGE_CONNECTION_STRING": os.environ[
                "FLOW_STORAGE_CONNECTION_STRING"
            ]
        },
    ),
    "scheduler_pod_template": make_pod_spec(
        image=os.environ["BAKERY_IMAGE"],
        labels={"flow": flow_name},
        env={
            "AZURE_STORAGE_CONNECTION_STRING": os.environ[
                "FLOW_STORAGE_CONNECTION_STRING"
            ]
        },
    ),
    If I try to run a flow with both declared, my Dask Scheduler pod fails with:
    Traceback (most recent call last):
      File "/srv/conda/envs/notebook/bin/dask-worker", line 8, in <module>
        sys.exit(go())
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 462, in go
        main()
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
        return self.main(*args, **kwargs)
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1062, in main
        rv = self.invoke(ctx)
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 763, in invoke
        return __callback(*args, **kwargs)
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 406, in main
        nannies = [
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 407, in <listcomp>
        t(
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 220, in __init__
        host = get_ip(get_address_host(self.scheduler.address))
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 142, in get_address_host
        return backend.get_address_host(loc)
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/tcp.py", line 572, in get_address_host
        return parse_host_port(loc)[0]
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 90, in parse_host_port
        port = _default()
      File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 69, in _default
        raise ValueError("missing port number in address %r" % (address,))
    ValueError: missing port number in address '$(DASK_SCHEDULER_ADDRESS)'
    But if I only declare the `pod_template`, everything works out great. I'm assuming the fact I declare the `scheduler_pod_template` means I'm losing some default setup somewhere down the line?
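    For context, this is roughly how I'm wiring the templates into the executor (a sketch; env vars trimmed, and `flow_name` is a placeholder):
    import os
    from dask_kubernetes import KubeCluster, make_pod_spec
    from prefect.executors import DaskExecutor

    flow_name = "my-flow"  # placeholder
    pod = make_pod_spec(
        image=os.environ["BAKERY_IMAGE"],
        labels={"flow": flow_name},
    )

    executor = DaskExecutor(
        cluster_class=KubeCluster,
        cluster_kwargs={
            "pod_template": pod,
            "scheduler_pod_template": pod,
        },
    )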
• Joe
    1 year ago
    Hello world! Have any of you ever encountered this when trying to connect an agent to a remote server?
    Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
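    For reference, this is roughly how I'm trying to point the agent at the remote server (a sketch; bash shown, the IP is a placeholder, and the exact agent CLI shape varies by Prefect version):
    # use a self-hosted server backend instead of Cloud
    prefect backend server
    # point API calls at the remote server instead of localhost
    export PREFECT__SERVER__HOST=http://192.0.2.10
    export PREFECT__SERVER__PORT=4200
    # then start the agent
    prefect agent local start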
    10 replies