
Tom Shaffner

01/19/2022, 4:23 PM
Is there a way just to execute a git pull command at the start of a run, assuming the local machine has authentication already set up? My company uses Azure Devops, and as discussed in https://github.com/PrefectHQ/prefect/issues/4850, I set up a storage to pull from this via a hard-coded URL. Unfortunately the personal access token (PAT) in that URL keeps expiring in only 30 days (think I'm hitting some corporate policy limit). The standard Azure approach is to use the Git authentication manager https://docs.microsoft.com/en-us/azure/devops/repos/git/set-up-credential-managers?view=azure-devops, not SSH. I have that set up on the machines already, and a git pull would use that more reliably. Possible?

Kevin Kho

01/19/2022, 4:26 PM
I think there are two things here. The first is how to get the Flow file, and the second is how to get the `.yaml` or `.sql` files in the Git repo. For this specific case, I think you would have to use a different storage for the Flow, like `AzureBlobStorage`. Then, in the Flow state handler, you can clone the repo to get the dependencies. We use the following code. You could use this, but note it’s not a public-facing API, so it’s subject to change.

Tom Shaffner

01/19/2022, 4:49 PM
Okay, the blob storage sounds like a potential solution, but if I'm understanding correctly I'd need to create a blob storage, create a pipeline that pushes git changes to that storage, and then also use the temporary code above, which might change, to clone dependencies? I was with you up until that last step. I'm doing this via local runs; is there no way just to execute a git pull on the repo at the start and use that?

Kevin Kho

01/19/2022, 4:53 PM
The BlobStorage will only hold the Flow file, not the other stuff in Git. So you just need to call `flow.register()` and it will upload the flow to storage. Then yes to cloning dependencies.

Tom Shaffner

01/19/2022, 4:54 PM
Hmm, honestly that sounds more complex than trying to switch to SSH. I take it the broader answer to my question about just pulling at the start is that there's no way to do that?

Kevin Kho

01/19/2022, 4:55 PM
You can pull at the start of the Flow. The state handler is the place to put any callable Python code that should run between state transitions. In this case, you just want to put the code that pulls the repo where the Flow enters a Running state.
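A rough sketch of what that could look like, assuming Prefect 1.x, where a state handler is called with `(obj, old_state, new_state)` and returns the new state. The function name `git_pull_on_running`, the `repo_dir` path, and the injectable `run` parameter are made up for illustration. It is not the internal code mentioned earlier in the thread, and it relies on git credentials already being set up on the machine:

```python
import subprocess
from functools import partial


def git_pull_on_running(flow, old_state, new_state, repo_dir, run=subprocess.run):
    """State handler sketch: run `git pull` in repo_dir when the flow enters Running."""
    # State handlers fire on every transition, so only act on Running.
    if new_state.is_running():
        # Uses whatever git authentication is configured locally
        # (e.g. a credential manager); check=True raises if the pull fails.
        run(["git", "pull"], cwd=repo_dir, check=True)
    # A state handler must hand back the state the flow should move to.
    return new_state


# Hypothetical path to the existing local checkout.
handler = partial(git_pull_on_running, repo_dir="/path/to/repo")

# Assumed wiring for Prefect 1.x (left as comments so the sketch has no
# hard dependency on Prefect being installed):
# from prefect import Flow
# with Flow("my-flow", state_handlers=[handler]) as flow:
#     ...
```

Because `check=True` raises on a failed pull, a broken pull would surface as an error instead of silently running against stale files. Whether that is the behavior you want depends on your setup.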