# ask-community
c
Hey Team, when using GitHub storage + Google Cloud Run, does Prefect only store the single `flow.py` file and not the whole repository when executing it in a container? Would this mean that if I have dependencies in other folders within the repo, I'd have to bake those into the image I'm using in Google Cloud Run? I'm getting a `ModuleNotFoundError` when trying to import my own modules within the same repo, and it seems like this might be the reason?
k
With git-based storage, `git clone` is run, so your entire repo is copied into the running container. There may be something going on with relative imports based on how you're running the flow when testing it locally. I would make sure when testing that you're in the root of your repo when executing the script, writing out the path to your flow file the same way you would specify it in the `entrypoint` field of your deployment.
c
@Kevin Grismore ahhh okay, thanks for the response! In terms of my folder structure, it's set up like this, with `__init__.py` files:
Copy code
Pipelines/
|-- src/
|   |-- flows/
|   |   |-- flow_1/
|   |   |   |-- flow_1.py, flow_1.deployment.py
|   |   |-- flow_2/
|   |   |   |-- flow_2.py, flow_2.deployment.py
|   |   |-- ...
|   |-- helpers/
|   |   |-- helper.py ...
|-- requirements
|-- setup.py
|-- README
When testing locally, does it matter where you run the deployment from? For example, running `prefect deploy` from `src/` vs. running `prefect deploy` from `src/flows/flow_<num>`, or running the `deployment.py` files inside the respective flow folders?
k
I guess I'm not clear on what the difference between `flow_1.py` and `flow_1.deployment.py` is. When you create a deployment, you specify an `entrypoint`, so if we were in the root of your repo, it'd look like `src/flows/flow_1/flow_1.py:<your_flow_function>`, assuming `flow_1.py` is the script that has the flow you want to deploy in it.
By testing locally, I mean executing the script with your flow function in it, like `python src/flows/flow_1/flow_1.py`
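For illustration, a minimal sketch of what `src/flows/flow_1/flow_1.py` might look like, assuming the `__init__.py` files make `src` a package and `some_helper` (a placeholder name) lives in `src/helpers/helper.py`. If the import fails when the script is run from the repo root, that usually means the root isn't on `sys.path` (e.g. the package isn't installed via `setup.py` / `pip install -e .` and `PYTHONPATH` isn't set):
Copy code
# hedged sketch of what src/flows/flow_1/flow_1.py might contain;
# `some_helper` is a placeholder name for something defined in src/helpers/helper.py
from prefect import flow

# absolute import written relative to the repo root; this resolves only when the
# root is on sys.path (e.g. repo installed via setup.py, or PYTHONPATH=. set)
from src.helpers.helper import some_helper


@flow
def flow_1():
    return some_helper()


if __name__ == "__main__":
    # executed from the repo root, as suggested above
    flow_1()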
c
@Kevin Grismore, I don't use the `.py` files anymore to build deployments, but it may have been an older (or incorrect) way of creating a deployment, where inside that file you'd have something like:
Copy code
# older, block-based deployment pattern using Deployment.build_from_flow
from prefect.deployments import Deployment
from flow import flow_function  # the flow function from the sibling flow script

description = "..."  # defined elsewhere in the original script

deployment = Deployment.build_from_flow(
    flow=flow_function,
    name="ad-hoc",
    description=description,
    infra_overrides={"env": {"PREFECT_LOGGING_LEVEL": "DEBUG"}},
    version="1.0",
    work_queue_name="dev",
    parameters={...},
    tags=["dev"],
)

if __name__ == "__main__":
    deployment.apply()
I now run `prefect deploy` in the root of the repo with the entrypoint like you showed above.
k
ah, gotcha
so yeah, try `python src/flows/flow_1/flow_1.py` to run your flow locally from the root of your repo and see if you're getting the same import errors
c
okok, got it. Will test this out, thanks! @Kevin Grismore I've got a bit of a separate question if you don't mind: when using Google Cloud Run, is there a way to specify a Google Cloud SQL connection via Prefect as well? One issue I'm running into is not being able to connect the Cloud Run instances that are spun up to execute flows to a Cloud SQL instance I have up and running. There seem to be configuration settings in the Cloud Run push pool that let you specify a VPC, but not a potential Cloud SQL connection.
k
ahhh a familiar scenario!
c
For a lot of the flows, the output of the data processing and transformations needs to be pushed to the database (also in Google, via Cloud SQL).
k
was just doing this about a month ago
c
!!! hopefully came to a friendly conclusion? 🤞 haha.
k
so, for this I'd recommend using `prefect-sqlalchemy`
you can make a SqlAlchemy connector block with the connection info to whichever db you're using in cloud sql
you'll still need to specify the VPC in cloud run, since you need to be inside the network to connect to a cloud sql db, but from there sqlalchemy should work just as expected
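A minimal sketch of that setup with prefect-sqlalchemy's `SqlAlchemyConnector` block, assuming a Postgres database in Cloud SQL; the block name, credentials, and IP are placeholders:
Copy code
# hedged example of creating and using a SqlAlchemyConnector block;
# every name/credential here is a placeholder
from prefect_sqlalchemy import ConnectionComponents, SqlAlchemyConnector, SyncDriver

connector = SqlAlchemyConnector(
    connection_info=ConnectionComponents(
        driver=SyncDriver.POSTGRESQL_PSYCOPG2,
        username="my-user",
        password="my-password",
        host="10.0.0.3",  # Cloud SQL private IP, reachable once the job runs inside the VPC
        port=5432,
        database="my-db",
    )
)
connector.save("cloud-sql-dev", overwrite=True)

# later, inside a flow or task, load the block and query as usual
with SqlAlchemyConnector.load("cloud-sql-dev") as db:
    rows = db.fetch_all("SELECT 1")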
c
Ahh yes, I'm actually using this at the moment. And gotcha… one more layer of possible confusion: is this all under the assumption that the Prefect server, Cloud Run jobs, and Cloud SQL instance are all under the same project in GCP? My Prefect-related components (server VM, Cloud Run jobs) are in one project, but the Cloud SQL instance is in another…
k
in that case you might need a peering setup, or maybe a VPN, but that's a whole other deal
I think there's another way to look at it
c
Thought so… so I think what can resolve that is if the Cloud Run jobs are under the same project as the Cloud SQL instance, in that case…?
k
having the cloud run jobs executing in the.... yep that's exactly what I was going to say
c
!! time to test that out.
🙌 1
Hey @Kevin Grismore, I'm running into an issue when trying to use the `vpc connector name` attribute in the Cloud Run pool. It seems like it may no longer be a supported feature?
Copy code
Annotation 'run.googleapis.com/vpc-access-connector' is not supported on resources of kind 'Service'. Supported kinds are: Revision, Execution
https://stackoverflow.com/questions/66098691/deploying-cloud-run-service-fails-because-of-vpc-connector-annotation
k
oop I remember running into this too. I believe to get this working I had to make a few edits to the work pool config under the advanced tab. I'm not at home to check right now but I'll send you an example as soon as I can
c
ahh okok! Sounds good. Thanks! @Kevin Grismore
Just a follow-up on this: the YAML file needed to be edited, as the default one had the `vpc-access-connector` annotation in the wrong location. It needed to be moved to the template's metadata. I've followed up here in a comment: https://github.com/PrefectHQ/prefect-gcp/pull/172#issuecomment-1763689322 and submitted an issue: https://github.com/PrefectHQ/prefect-gcp/issues/218
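The shape of the fix, as described above, sketched here as a Python dict for illustration only; the exact nesting of the work pool's base job template may differ (see the linked comment), and `my-connector` is a placeholder:
Copy code
# hedged sketch: the annotation is only accepted on the template
# (Revision/Execution) metadata, not on the top-level resource metadata
job_template_fragment = {
    "metadata": {
        "annotations": {
            # "run.googleapis.com/vpc-access-connector": "my-connector"  <- rejected here
        }
    },
    "template": {
        "metadata": {
            "annotations": {
                # accepted here instead; the connector name is a placeholder
                "run.googleapis.com/vpc-access-connector": "my-connector"
            }
        }
    },
}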
🙌 1
k
thanks for digging into this! sorry I wasn't able to get to it over the weekend
c
No prob!!