when i set my pull policy in my deployment yaml to...
# prefect-cloud
b
When I set the pull step in my deployment YAML to be
pull:
    - prefect.deployments.steps.git_clone:
        repository: https://my_repo.git
how does the actual repo get installed? i.e., into my VM's virtual environment?
n
hey @Ben Epstein, in general your pull step will run wherever your worker / serve process is running
> ie into my VMs virtual environment?
from this I'd assume you're running a `ProcessWorker`? if so, by default the `git clone` will happen in a tmpdir unless you set a `working_dir` for that `Process` work pool
b
Yeah, you're right, I'm using a ProcessWorker. Sorry, I'm not sure I understand:
> in general your pull step will run wherever your worker / serve process is running
seems contradictory to
> by default the `git clone` will happen in a tmpdir
So in my VM I ran
prefect worker start --pool "my-pool"
So when the next flow starts and it clones my repo, will the repo end up right in my current working directory? Or will it be in a tmpdir?
Oh, does the worker change directory to a tempdir?
So in that case, will it install the GitHub repo, for example as a Python package?
n
the worker process starts in either a tempdir (default) or a specific working dir that you can set on the work pool, and the `git_clone` step will just clone the repo, not necessarily install it as a package
although you could add another action to your `pull` step that would run some `pip install .` or something
b
> although you could add another action to your `pull` step that would run some `pip install .` or something
Oh interesting, I didn't see this in the docs, this would be perfect! How would I do that exactly?
n
something like this
pull:
    - prefect.deployments.steps.git_clone:
        repository: https://my_repo.git
    - prefect.deployments.steps.run_shell_script:
        script: cd my_repo && pip install .
@Marvin check my work
m
Here is how the `prefect.deployments.steps.git_clone` and `prefect.deployments.steps.run_shell_script` steps work within Prefect deployments:
- `prefect.deployments.steps.git_clone`: This utility step is used to clone a repository from GitHub, Bitbucket, or GitLab as part of a deployment's pull action. You can provide a repository URL and optionally an access token to clone private repositories. The access token can be securely provided using a Secret block. Example usage:
    pull:
      - prefect.deployments.steps.git_clone:
          repository: https://bitbucket.org/org/repo.git
          access_token: "{{ prefect.blocks.secret.bitbucket-token }}"
- `prefect.deployments.steps.run_shell_script`: This utility step allows for the execution of shell commands in a subprocess. It can be used to run scripts that need a specific environment or have particular I/O requirements. The step returns the standard output and standard error of the script being executed.
Looking at your code snippet, it appears you are attempting to use these steps in sequence as part of a pull action:
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://my_repo.git
  - prefect.deployments.steps.run_shell_script:
      script: cd my_repo && pip install .
However, there are a few potential issues with the script:
1. In the script line of the `run_shell_script`, you've used `&amp;&amp;` instead of `&&`. In YAML, you shouldn't need to HTML-encode the ampersand.
2. Make sure to replace `https://my_repo.git` with the actual URL of your Git repository.
With the corrections, it should look like:
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://my_repo.git
  - prefect.deployments.steps.run_shell_script:
      script: cd my_repo && pip install .
Make sure that the repository path and the subsequent directory change command (`cd`) correspond correctly after cloning the repository. If the repository is cloned into a directory with a different name, you need to adjust the script accordingly.
n
> you've used `&amp;&amp;` instead of `&&`
lol wut
disregard that bit - what i had should be what you want 👍
b
this is really helpful, thanks so much!
n
np! catjam
b
Sorry, one more related question. If I do want to set the working dir (based on the code you sent me), would that look like this?
work_pool:
    type: process
    name: scraper-pool
    work_queue_name: default
    working_dir: /path/to/my/dir
FWIW, somehow I did this and it overrode the tmpdir:
- prefect.deployments.steps.set_working_directory:
      directory: /path/to/my/dir
- prefect.deployments.steps.git_clone:
      repository: https://my_repo.git
n
in general, if you want to override `job_variables` on a work pool for a given deployment (without changing the work pool globally for all deployments that pull their config from that work pool), you can do
deployments:
  ...
  work_pool:
    type: process
    name: scraper-pool
    job_variables:
      working_dir: /path/to/my/dir
can you say more about why you need to set the work dir? there are a couple of valid ways to do this, and the one I'd recommend sort of depends on why you need to do it
b
I just really want to define where my GitHub repo is cloned to. `prefect.deployments.steps.git_clone:` was perfect
n
here's an example that I think is doing what you want - i just have the `pwd` action for a sanity check
> Running git_clone step...
 > Running run_shell_script step...
/Users/nate/src/demos/prefect-monorepo/test
my repo got cloned to `test/` relative to this repo root, according to the `working_dir` i set for this deployment
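The prefect.yaml behind that output isn't shown in the thread, so the following is only a guess at its general shape: a `working_dir` job variable on the deployment, a `git_clone` step, and a `pwd` shell step as the sanity check. The deployment name and entrypoint are hypothetical; the pool name, path, and repository URL are the placeholders already used in this thread.

```yaml
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://my_repo.git      # placeholder URL from the thread
  - prefect.deployments.steps.run_shell_script:
      script: pwd                          # sanity check: print the step's working directory

deployments:
  - name: clone-location-check             # hypothetical deployment name
    entrypoint: flows/my_flow.py:my_flow   # hypothetical entrypoint
    work_pool:
      name: scraper-pool
      job_variables:
        working_dir: /path/to/my/dir       # the clone should land relative to this directory
```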
b
OK great, thank you! It seems like you can specify this locally for a deployment or globally for all deployments, which is great to know. I am doing this, which is nice because I want this to be the case for all my deployments:
pull:
- prefect.deployments.steps.set_working_directory:
      directory: /path/to/my/dir
- prefect.deployments.steps.git_clone:
      repository: https://my_repo.git

deployments:
- name: bla1
  entrypoint: ...
  work_pool:
    name: my-pool
- name: bla2
  entrypoint: ...
  work_pool:
    name: my-pool
Hey @Nate, it's possible things have changed in Prefect, but the `pip install` code you shared isn't quite right, just wanted to let you know for the future. When you do the git clone, you are placed into the repo, so the `cd` will fail. It should actually be just this:
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://my_repo.git
  - prefect.deployments.steps.run_shell_script:
      script: pip install .
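If you'd rather not depend on which directory a step happens to start in, one option is to give the clone step an `id` and pass its `directory` output to the shell step explicitly. This is a minimal sketch, assuming the `directory` parameter of `run_shell_script` and the `{{ clone-step.directory }}` templating behave in your Prefect version the way the docs describe; `clone-step` is just an illustrative id, and the URL is the same placeholder as above.

```yaml
pull:
  - prefect.deployments.steps.git_clone:
      id: clone-step                           # illustrative id; any unique name should work
      repository: https://my_repo.git          # placeholder URL from the thread
  - prefect.deployments.steps.run_shell_script:
      directory: "{{ clone-step.directory }}"  # run inside the cloned repo, wherever it ended up
      script: pip install .
```

With this shape, the script needs neither a `cd` nor any assumption about the current directory after the clone.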
n
ah nice catch, i hadn't checked it myself - that sounds right. and a small comment as it relates to this
> I want this to be the case for all my deployments
you could instead just set your `working_dir` on the work pool, since those configs will automatically apply to all deployments, then you don't need an extra pull action to set the cwd for each deployment
b
Yeah, the only reason I didn't do that is that you can't define the work pool's working_dir in your prefect.yaml file, so it was a bit "invisible" to other people. I like having every custom configuration in the YAML, so it can be changed and picked up by my CD. But that's fair, if I had a script to deploy my work pool, I could specify it there.
n
cool, fair point - yeah it’s a matter of preference, just mentioned in case you were thinking you had to do it in the pull step - nevermind!
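A possible middle ground keeps the setting visible in prefect.yaml without a pull action: define the work pool config once under `definitions` and reuse it in each deployment with a YAML alias, setting `working_dir` through `job_variables`. This is a sketch only, assuming the `definitions` / alias pattern from the Prefect docs; the anchor name and entrypoints are hypothetical, and the deployment names, pool name, and path are the placeholders used earlier in the thread.

```yaml
definitions:
  work_pools:
    # hypothetical anchor name; the block is reused by both deployments below
    shared_pool: &shared_pool
      name: my-pool
      job_variables:
        working_dir: /path/to/my/dir   # per-deployment override of the pool's default

deployments:
  - name: bla1
    entrypoint: flows/one.py:flow_one  # hypothetical entrypoint
    work_pool: *shared_pool
  - name: bla2
    entrypoint: flows/two.py:flow_two  # hypothetical entrypoint
    work_pool: *shared_pool
```

This doesn't change the work pool itself, so deployments defined elsewhere would still get the pool's default working directory.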
b
i appreciate all the help! super useful, got me unblocked entirely
n
no problem! glad things are working for you