# ask-community
a
Hi all! Does anyone have an example of how to use Prefect to connect to an existing, external PostgreSQL database? (Mine also happens to require an SSH connection.) A few things I'm struggling with:
• can I return the result of `conn = psycopg2.connect()` from a task back to the flow, and then use it as an argument in another task?
• if so, should I consider separating out the SSH and database connections into separate tasks?
• using PrefectSecret appropriately: I currently have a `.env` file and a task to get the environment variables, and then use the results as a parameter in my other functions, but I suspect this isn't quite right.
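[Editor's sketch, not from the thread: since (as discussed below) a connection object can't cross task boundaries by default, the usual workaround is to open, use, and close the connection inside a single task. Here is a minimal, self-contained sketch of that shape, using the stdlib's sqlite3 as a stand-in for psycopg2; in a real flow the function would be wrapped in Prefect's `@task` decorator.]

```python
import sqlite3


def run_query(db_path: str, query: str):
    """Open a connection, run one query, close the connection.

    The connection never leaves this function, so nothing
    unserializable is ever passed between tasks.
    """
    conn = sqlite3.connect(db_path)  # with psycopg2: psycopg2.connect(...)
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()
    finally:
        conn.close()  # always torn down inside the same task


rows = run_query(":memory:", "SELECT 1 + 1")
print(rows)  # [(2,)]
```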
k
Hi @Akiva Wolberg, you can’t return the connection by default because Prefect requires task inputs and outputs to be serializable. This makes them thread-safe as you map across Dask. If you don’t use Dask, it can be done by storing your flow as a script. More info on that is here. This way, the flow is evaluated at runtime and not serialized during flow registration. If you do this, the connection can be split across tasks.
If you are trying to load a PrefectSecret from local secrets, it has to be in the environment variables before Prefect is imported. When Prefect is imported, the `context` is created from the environment variables and the secrets are loaded in. `context` is not really mutable after this.
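[Editor's sketch, not from the thread: the key point above is ordering, i.e. the `.env` contents must be in `os.environ` before `import prefect` runs, not loaded inside a task. A minimal sketch, assuming Prefect 1.x's `PREFECT__CONTEXT__SECRETS__<NAME>` convention for local secrets; double-check that naming against your Prefect version.]

```python
import os


def load_dotenv(path: str) -> None:
    """Tiny .env loader: KEY=VALUE lines; blank lines and '#' comments ignored."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()


# Demo .env file so the sketch is self-contained.
with open(".env", "w") as f:
    f.write("PREFECT__CONTEXT__SECRETS__DB_PASSWORD=hunter2\n")

load_dotenv(".env")  # must run BEFORE prefect is imported...
# import prefect     # ...only then import prefect, so context picks it up
# password = prefect.client.Secret("DB_PASSWORD").get()

print(os.environ["PREFECT__CONTEXT__SECRETS__DB_PASSWORD"])  # hunter2
```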
a
@Kevin Kho Thank you! It's helpful to know what I'm trying to do is not going to work. I'm lost after that unfortunately; I don't use Dask and I'm not sure what the linked docs have to do with keeping the connection(s) open. Do you know of any example code (e.g. for connecting to a db, executing a SQL query, and then disconnecting)?
k
The linked doc shows how to not enforce the requirement that task outputs be serializable, allowing you to return the client. It’s just not the default behavior, but it can be done using script-based storage.
We also have the Postgres tasks, which do all of that in one go. Maybe they don’t fit your use case?
a
@Kevin Kho Ah, I see. The Postgres tasks do seem like the right approach. I'm not sure how to incorporate the SSH login info into the provided connection info, but at worst maybe I can do that with a separate script.
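[Editor's sketch, not from the thread: one common way to combine SSH access with a Postgres connection is the third-party `sshtunnel` package, which forwards the remote Postgres port to localhost; the connection (or a Prefect Postgres task) then points at the local end of the tunnel. The hostnames, ports, and credentials below are placeholders.]

```python
def query_through_tunnel(ssh_host, ssh_user, ssh_pkey,
                         db_name, db_user, db_password, query):
    """Open an SSH tunnel to the DB host, query Postgres through it,
    and tear everything down before returning.

    Uses the third-party sshtunnel and psycopg2 packages, imported
    lazily so this module loads without them installed.
    """
    from sshtunnel import SSHTunnelForwarder  # pip install sshtunnel
    import psycopg2                           # pip install psycopg2-binary

    with SSHTunnelForwarder(
        (ssh_host, 22),
        ssh_username=ssh_user,
        ssh_pkey=ssh_pkey,
        remote_bind_address=("127.0.0.1", 5432),  # Postgres on the remote box
    ) as tunnel:
        conn = psycopg2.connect(
            host="127.0.0.1",
            port=tunnel.local_bind_port,  # local end of the tunnel
            dbname=db_name,
            user=db_user,
            password=db_password,
        )
        try:
            with conn.cursor() as cur:
                cur.execute(query)
                return cur.fetchall()
        finally:
            conn.close()
```

As in the earlier point about serializability, the tunnel and the connection both live and die inside one function, so nothing unserializable crosses a task boundary.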
k
You can largely copy the code behind there and edit it to fit your use case