# prefect-cloud
b
Hi all, I am brand new to Prefect and recently started a cloud trial. I have a secrets question that I am hoping someone could help with. We currently store all our secrets in AWS Secrets Manager. I have created an AWS Secret block that uses an AWS Credentials block for access (see attached screenshot). Based on the screenshot, my assumption is that the AWS Secret block has the credentials required and the secret name to retrieve, and therefore in my task or flow I should only need to load the block and read the data. Below is my basic example of that:
from prefect import flow
from prefect_aws.secrets_manager import AwsSecret

secret_manager = AwsSecret.load(name="my-secret")

@flow(log_prints=True) 
def example_read_secret(): 
    print(secret_manager.read_secret())

example_read_secret()
The secret content is in JSON format, yet I get this error when I try to print the output:
KeyError: 'SecretBinary'
Has anyone experienced this or know what could be wrong?
Based on the research I have done, I believe there is a bug in the AWS Secret block. According to the AWS documentation on the usage of get_secret_value, there should be a check to see if "SecretString" is None before returning "SecretBinary".
SecretBinary
The decrypted secret value, if the secret value was originally provided as binary data in the form of a byte array. The response parameter represents the binary data as a base64-encoded string.

If the secret was created by using the Secrets Manager console, or if the secret value was originally provided as a string, then this field is omitted. The secret value appears in SecretString instead.

Type: Base64-encoded binary data object

Length Constraints: Minimum length of 1. Maximum length of 65536.
In our case, our secret value was created in the AWS console, which would mean what should be returned is the SecretString and not the SecretBinary. The example in the Boto3 documentation spells out this logic. https://boto3.amazonaws.com/v1/documentation/api/latest/guide/secrets-manager.html#example Does this seem accurate to others?
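In case it helps anyone landing on this thread, the check the AWS docs describe can be sketched roughly as below. This is only an illustration of the logic, not the prefect-aws implementation; `extract_secret_value` and the sample response dict are made up for the example, and it assumes the response has the same shape as what boto3's `get_secret_value` returns.

```python
import json
from typing import Any, Mapping, Union


def extract_secret_value(response: Mapping[str, Any]) -> Union[str, bytes]:
    """Return the secret payload from a get_secret_value-style response.

    Hypothetical helper for illustration only.
    """
    # Secrets created in the console, or provided as strings, arrive in
    # SecretString; SecretBinary is omitted in that case.
    if response.get("SecretString") is not None:
        return response["SecretString"]
    # Only fall back to SecretBinary when no string value is present.
    if response.get("SecretBinary") is not None:
        return response["SecretBinary"]
    raise KeyError("response has neither 'SecretString' nor 'SecretBinary'")


# Made-up response for a console-created JSON secret.
response = {
    "Name": "my-secret",
    "SecretString": '{"user": "demo", "password": "example"}',
}
secret = json.loads(extract_secret_value(response))
print(secret["user"])
```

Reading the key with `.get()` instead of indexing is what avoids the `KeyError: 'SecretBinary'` for string secrets.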
Another update... After digging deeper I found that this exact issue was reported in the GitHub repo (https://github.com/PrefectHQ/prefect-aws/issues/265). It appears the change was made and PR created but failed Windows Tests and was never merged. Unfortunately for me/us, that was 2.5 months ago. So, who knows if it will ever make it to the consumers. 🤷‍♂️
n
hi @Bryan - we just discussed this issue / PR today going over backlog, the original contributor seems to have abandoned the PR but we will pick up and get it cleaned up / merged asap
🙏 1
b
Hey @Nate, that would be awesome. It will definitely make our situation much easier as we would be using that block in virtually every flow we create. Thanks for the update!
n
👍
b
As an FYI if it helps... We are also running into the same issue mentioned in the following thread after adding the prefect-fivetran module to our project. https://prefect-community.slack.com/archives/CL09KU1K7/p1697777918762979 Is this because we have code to manually pull the AWS secret using boto3 which is conflicting with what is used in the prefect-fivetran module? Just curious if they are unrelated issues, Thanks again!
n
i believe this is an unrelated issue related to pydantic v2 - I suspect the most direct fix is to pin `pydantic<2` in your project, but if you could share the trace you get that would be appreciated
b
n
yep i think pinning pydantic under 2 should be a workaround for you
👍 1
b
Excellent. Thanks for the confirmation
v
@Mitch Zink Check this out!
b
@Nate, I just got back to attempting some more testing. I believe moving pydantic back to v1 causes other issues when trying to leverage the prefect-fivetran.connectors.trigger_fivetran_connector_sync_and_wait_for_completion function. The code I am using and the results of that are in the attached screenshots. Not sure if this is helpful or not. I believe I am stuck for now unless the issue is completely unrelated.
@Nate and anyone else that runs into this issue. This is resolved in the late breaking prefect version 2.14.2 which also fixes having to pin pydantic to v1. Much appreciated!!
🙌 1
n
thanks for the update @Bryan! glad your issue is resolved
b
@Nate, I know this is an older thread by now but I have a follow up to this for a different configuration. It appears that although 2.14.2 fixed the issue I was having with the prefect-fivetran module and its dependent pydantic module while running locally, there seems to be an issue in an ECS-Push configuration. I can run it locally fine and without issues. When I set up an ECS-Push deployment and run the flow I get the following error back:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/opt/prefect/dw_prefect_hubspot-dev/hubspot_pipeline.py", line 2, in <module>
    from prefect_fivetran import FivetranCredentials
  File "/usr/local/lib/python3.10/site-packages/prefect_fivetran/__init__.py", line 2, in <module>
    from .credentials import FivetranCredentials
  File "/usr/local/lib/python3.10/site-packages/prefect_fivetran/credentials.py", line 8, in <module>
    class FivetranCredentials(Block):
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/main.py", line 197, in __new__
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/fields.py", line 506, in infer
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/fields.py", line 436, in __init__
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/fields.py", line 557, in prepare
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/fields.py", line 831, in populate_validators
  File "/usr/local/lib/python3.10/site-packages/pydantic/v1/validators.py", line 765, in find_validators
RuntimeError: no validator found for <class 'pydantic.types.SecretStr'>, see `arbitrary_types_allowed` in Config

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 406, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 51, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/deployments.py", line 254, in load_flow_from_flow_run
    flow = await run_sync_in_worker_thread(load_flow_from_entrypoint, str(import_path))
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/prefect/flows.py", line 1498, in load_flow_from_entrypoint
    flow = import_object(entrypoint)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/importtools.py", line 201, in import_object
    module = load_script_as_module(script_path)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/importtools.py", line 164, in load_script_as_module
    raise ScriptError(user_exc=exc, path=path) from exc
prefect.exceptions.ScriptError: Script at 'hubspot_pipeline.py' encountered an exception: RuntimeError("no validator found for <class 'pydantic.types.SecretStr'>, see `arbitrary_types_allowed` in Config")
I have upgraded to prefect 2.14.4 and my deployment looks like what I have below if that helps at all:
deployments:
  - name: dev-deployment
    version:
    tags: []
    description: ECS-Push Deployment
    schedule:
    entrypoint: .\main.py:someflow
    parameters: {}
    pull:
      - prefect.deployments.steps.git_clone:
          id: clone-step
          repository: https://github.com/my-repos/some-repo.git
          branch: "dev"
          credentials: "{{ prefect.blocks.github-credentials.github-creds }}"
      - prefect.deployments.steps.pip_install_requirements:
          directory: "{{ clone-step.directory }}"
          requirements_file: requirements.txt
    work_pool:
      name: Ecs-push
      work_queue_name:
      job_variables: {}
The only thing I can think of is that the default ECS image being used (note: I am not specifying one, therefore it is using the default) has some conflicting modules, or modules that are superseding what's in my requirements.txt. Any thoughts?
n
are you pinning pydantic < 2 in your requirements.txt?
i could see how if you're pip installing stuff in your pull step, you could be installing pydantic 2 right before runtime and causing this weirdness. unfortunately here, we dont have direct control over the `prefect-fivetran` collection as fivetran owns the repo. i took a quick look at the code and its a pretty simple http client that honestly would be really easy to rip out and use as desired if you didnt want the overhead of installing the collection / dealing with their pins
b
I am currently not installing anything related to pydantic in my requirements.txt. This is what my requirements.txt file has in it:
prefect
prefect-aws
prefect-fivetran
prefect-dbt
Wherever pydantic is getting installed, it's not by me explicitly
n
yep, but prefect will install pydantic (most recent) if you dont pin it yourself, try adding `pydantic<2` in your reqs file
b
I have tried that as well and get an error. Let me go grab that.
Just ran it again with pydantic<2 at the end of my requirements.txt file and got the same error.
As a little background, yesterday I tried putting pydantic at the beginning and end of the requirements.txt file and it didn't make any difference on the error. I also tried specifying the exact version of pydantic==1.10.0 with the same results
n
you're saying you get the same trace as above? that would mean you're not actually installing pydantic<2, since the v1 module in pydantic only exists in pydantic v2
is it possible the requirements file you have in your github isnt updated? like you changed it locally but didnt push?
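A quick way to confirm which pydantic actually ends up in the runtime is to print the installed version from inside the flow script or container. The snippet below is just a sketch using the standard library; `installed_major_version` is a made-up helper name, and it assumes the package reports a normal `X.Y.Z` style version.

```python
from importlib import metadata
from typing import Optional


def installed_major_version(package: str) -> Optional[int]:
    """Return the major version of an installed package, or None if absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # Assumes a plain "X.Y.Z" style version string.
    return int(version.split(".")[0])


major = installed_major_version("pydantic")
if major is None:
    print("pydantic is not installed in this environment")
elif major >= 2:
    # The pydantic/v1/... paths in the traceback only exist in this case.
    print(f"pydantic {major}.x is installed, so the pin to <2 did not take effect")
else:
    print(f"pydantic {major}.x is installed")
```

Running this at the top of the flow file (before the prefect_fivetran import) would show the version the ECS task really sees.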
b
That's what I was originally thinking but it looks to be accurate. This is what is in Github:
And to answer your question about getting the same trace as above, yes, that is correct.
n
hmm okay, I'd be curious to see your worker logs during the execution of that pip install requirements pull step you have, it seems one way or another, your runtime is ending up with pydantic 2
b
I had the same thoughts. It sounds like there is a way to turn those logs on in an ECS-push scenario but I haven't spent any time doing that. Is there documentation that would show how to get logging to work on the ECS task that gets spawned?
n
ahh yes ecs i forgot, i’m not the worlds biggest ecs buff so forgive me if i’m wrong but my prior was that a container named like “adjective-animal” would pop up for your ecs task that you could check the logs on, is that accurate?
b
That's exactly where they would be. The screenshot shows what I always see in that screen. I read somewhere that you have to turn on CloudWatch logs. I don't know if that's accurate or not.
n
ahh yeah true, sorry about the back and forth. do you not see any “running pull step” logs higher up in that trace you shared earlier?
b
No worries. What I posted earlier is everything I see in the logs. It definitely would be nice to see everything as I believe there is really more there but Prefect must only be getting the last error in the log.
n
gotcha, yeah i would expect to see some “running pull steps” logs near the top of the flow run logs 🤔 which iirc we now should even show in our UI on that flow run’s logs - weird that you’re not seeing that
👍 1
b
It just dawned on me. I can set an image to use for the work pool. Is there an older image that's out there that uses pydantic v1?
n
i don’t believe we publish separate images for that but you could try specifying prefecthq/prefect:2.13.4-python3.10 (or whatever minor python version) which would use an image from before we unpinned pydantic
👍 1
it’s still weird tho bc the pip install should happen on top of whatever image you’re using, which i would expect reinstalls pydantic with the desired version 🤔
b
That's exactly what I was expecting as well
n
hmm yeah if using the older image works then maybe there’s something wrong with the pull step
wait what command did you use to deploy this flow?
b
prefect deploy --all
👍 1
n
gah okay nevermind
b
It's working with that older image!
👍 1
n
hmmm okay, in the UI for this deployment do you see the pull steps in the configuration tab?
b
Maybe I'm in the wrong area but I don't see a configurations tab. In the UI I went to deployments, clicked the menu to the right of the deployment, then edit
n
are you on cloud or prefect server?
b
Cloud
n
ok so on the deployments page (left nav) if you click into your deployment, there should be a config tab like this
b
HAHAH! Amateur over here. Yes, I see the pull steps
[
  {
    "prefect.deployments.steps.git_clone": {
      "id": "clone-step",
      "branch": "dev",
      "repository": "https://github.com/some_org/prefect_hubspot.git",
      "credentials": "{{ prefect.blocks.github-credentials.github-creds }}"
    }
  },
  {
    "prefect.deployments.steps.pip_install_requirements": {
      "directory": "{{ clone-step.directory }}",
      "requirements_file": "requirements.txt"
    }
  }
]
n
okay that looks fine, interesting - i was wondering if there was some issue with the pull steps definition such that they weren’t being recorded on the deployment itself. so it seems that somehow your pull steps are not executing correctly. do you happen to have the command handy you used to start your ecs worker service?
it should be in the task definition
b
n
oh yeah this is a push pool correct? this may be a bug of some kind
b
Correct
n
i can look more at this tomorrow, something seems off though
👍 1
b
No worries. I am about done for the day anyway
Hey @Nate, I was able to get CloudWatch logging wired up for the ECS Tasks. I switched my work pool back to use the latest image to test the flow run and get more information about the error. I now have the logs for the tasks and can see the logs leading up to the error. The only thing that sticks out to me in the attached logs, which may be nothing, is the uninstall of pydantic 2.5.2 right before the execution of the flow. Any thoughts on that?
j
Hi @Bryan, did you ever solve this problem? I believe we have a similar setup to yours using ECS and our only additional packages being:
prefect
prefect-aws
prefect-fivetran
prefect-dbt
Our deployment just started getting the error `RuntimeError: no validator found for <class 'pydantic.types.SecretStr'>, see arbitrary_types_allowed in Config` yesterday
b
Hi @James, we did resolve it by pinning the pydantic version to < 2 during the build step where we are building the image. We are using a GitHub Action to build and deploy. This builds the image to be used when running tasks in ECS and publishes the image to our own private repo in ECR. So, the deployment references the image in ECR upon push to ECS. Hopefully that helps.
thank you 1
👀 1
j
Thanks! Good to know that `pydantic<2` is the key to fix it, though our deployment setup is a bit different and I have not yet found the right place to insert it that fixes the error. I will take a look and see if I can get it to work
👍 1
We use Pulumi to spin up our AWS resources... maybe it has a similar way to update the image with the proper package version
👍 1
b
For a little more info... It appears the real issue is a pydantic dependency conflict in the prefect-fivetran module. That particular module uses version <2, whereas Prefect has been updated to use pydantic >=2. I believe the issue comes from the order of pip install -r requirements.txt, where pydantic>=2 is run last. That's just my guess.
👍 1
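If anyone wants to check that guess in their own environment, something like the sketch below lists which installed packages declare a requirement on pydantic and what range they ask for. It only uses the standard library; `requirement_name` and `who_requires` are made-up helper names, and the name parsing is deliberately simple rather than a full requirement parser.

```python
import re
from importlib import metadata
from typing import Dict, List


def requirement_name(requirement: str) -> str:
    """Return the bare, lowercased package name from a string like 'pydantic<2'."""
    return re.split(r"[\s<>=!~;(\[]", requirement.strip(), maxsplit=1)[0].lower()


def who_requires(target: str) -> Dict[str, List[str]]:
    """Map each installed distribution to the requirements it declares on `target`."""
    found: Dict[str, List[str]] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "unknown"
        for requirement in dist.requires or []:
            if requirement_name(requirement) == target.lower():
                found.setdefault(name, []).append(requirement)
    return found


# Print every installed package that pins or bounds pydantic.
for package, requirements in sorted(who_requires("pydantic").items()):
    print(f"{package}: {requirements}")
```

Conflicting ranges in that output (one package asking for <2 and another for >=2) would point to the kind of clash described above.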
I have created an issue in the GitHub repo for prefect-fivetran to be updated to a newer pydantic version in the module but never got a response back. Apparently, someone at Fivetran created it and it appears it's not monitored.
🙌 1
j
Hmmm... I could try reaching out to our Fivetran sales team or open a support ticket with them
b
That would be great. I was going to do the same but ran out of time.
j
To build your modified image, would you happen to be using a Dockerfile with something like this?
FROM prefecthq/prefect:2.14.11-python3.10
...
RUN pip3 install prefect-fivetran
...
RUN pip3 install pydantic==1.10
b
Our Dockerfile looks like this:
FROM prefecthq/prefect:2.14.6-python3.11
COPY . /opt/prefect/dw_prefect_project/
WORKDIR /opt/prefect/dw_prefect_project/
RUN pip install -r requirements.txt
👀 1
j
Thanks! And as long as the `prefect-fivetran` and `pydantic` are in the right order in the `requirements.txt` file, it works?
b
Yes, it seems to work fine on our end
👍 1
This is what our GitHub Action looks like if that's helpful:
name: Prefect Deploy to Amazon ECS

on:
  workflow_call:

env:
  AWS_REGION: us-west-2 # This is our AWS region
  ECR_REPO_NAME: 9999999999.dkr.ecr.us-west-2.amazonaws.com/${{ github.event.repository.name }} # Save the ECR Repo path/URL to be used in the prefect.yaml
  GIT_REPO_NAME: ${{ github.event.repository.name }}
  GIT_REPO_URL: ${{ github.server_url }}/${{ github.repository}}.git # Save the Git Repo URL to be used in the prefect.yaml file
  GIT_REPO_BRANCH: ${{ github.ref_name }} # Save the Git Repo Branch to be used in the prefect.yaml file

jobs:
  deploy:
    name: Deploy
    runs-on: ubuntu-latest
    # environment: development

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@0e613a0980cbf65ed5b322eb7a1e075d28913a83
        with:
          aws-access-key-id: ${{ secrets.GENDEV_DEPLOY_USER_AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.GENDEV_DEPLOY_USER_AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@62f4f872db3836360b72999f4b87f1ff13310f3a

      - uses: int128/create-ecr-repository-action@v1
        with:
          repository: ${{ env.GIT_REPO_NAME }}
          public: false

      - name: Install Python Requirements
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Set Prefect Workspace
        env:
          PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
          PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
        run: prefect cloud workspace set --workspace "workspace36-lesath/some-workspace"

      - name: Prefect Deploy
        env:
          PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
          PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
        run: prefect deploy --name ${{ env.GIT_REPO_BRANCH }}-${{ env.GIT_REPO_NAME }}
👀 1
j
Got it working on our side! After a bit of tinkering, we found that it is enough to set the ECS Task Definition image to `prefecthq/prefect:2.13.4-python3.10` and then add `pydantic==1.10.0` to the end of `env.EXTRA_PIP_PACKAGES`; it seems like that becomes the final step in the build. Note that we also had to add `prefecthq/prefect:2.13.4-python3.10` and `pydantic==1.10.0` to our deployment worker config too (based on the docs, this will override the configuration set elsewhere... so this step alone might be enough? It is a lot of configs to keep track of)
❤️ 2
b
@James, thanks for the update. Glad you got it working. Your info may come in handy for others.
@James, just a heads up. After some back and forth with my Fivetran rep and others at Fivetran, the (official?) statement I got from them is that the prefect-fivetran package will be deprecated and no longer supported on their end. So, we are on our own going forward. If you are interested, thanks to @Jack P 🙏 and his forked prefect-fivetran package repo, there is a pydantic 2 compatible version of prefect-fivetran. You can get it here: https://github.com/japerry911/prefect-fivetran/tree/japerry911/imp/pydantic-2-compatible
🙌 1
j
Thanks for sharing!
👍 1