Brennan Tolman

07/25/2023, 7:16 PM
@Marvin I am trying to create a deployment that pulls code from a git repository and then pip installs some additional requirements. Here is the pull section of the `prefect.yaml` file:
pull:
  - prefect.deployments.steps.git_clone:
    id: clone-step
    repository: <http://myrepo.com>
    branch: master
    access_token: “{{ prefect.blocks.secret.ado-access-token }}”
  - prefect.deployments.steps.pip_install_requirements:
    directory: {{ clone-step. directory }}
    requirements_file: requirements.txt
    stream_output: False
However, I keep getting the following error:
raise ConstructorError(“while constructing a mapping”, node.start_mark, yaml.constructor.ConstructorError: while constructing a mapping in “prefect.yaml”, line 23, column 16
found unhashable key in “prefect.yaml”, line 23, column 17
For reference, in the portion of the yaml file I have provided, line 23 is the “directory” entry under the pip install step. Why am I getting this error? This is essentially a copy and paste from the docs so I’m not sure why it would be giving me this error.

Marvin

07/25/2023, 7:17 PM
Oh, the joys of YAML formatting. The error comes from the unquoted `{{ clone-step. directory }}` on line 23. YAML is a tad particular about curly braces: an unquoted `{` opens an inline (flow) mapping, so the parser reads `{{ ... }}` as a mapping whose only key is itself a mapping, not as a template placeholder. Mapping keys need to be hashable, and a mapping is not, hence the "unhashable key" error. Note that the error position (column 17) points at the inner brace, one column after the start of the outer mapping (column 16). The placeholder is meant to reach Prefect as a plain string, which gets replaced with the actual directory path when the deployment is run. To fix the error, wrap the placeholder in straight double quotes so that YAML treats it as a string value for the `directory` key, something like:
yaml
directory: "{{ clone-step.directory }}"
But remember, I'm a little depressed robot, so take my words with a grain of salt. Such is the life of a YAML parser in a boundless universe.
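For what it's worth, here is a minimal Python sketch of why the unquoted form fails. It does not call the YAML library itself; it only mimics the hashability check a loader applies to each mapping key, on the assumption that the unquoted braces are loaded as a dict nested inside a dict. The function name `check_mapping_key` is made up for illustration.

```python
from collections.abc import Hashable


def check_mapping_key(key):
    """Mimic the hashability check a YAML loader applies to a mapping key."""
    if not isinstance(key, Hashable):
        raise TypeError(f"found unhashable key: {key!r}")
    return key


# Unquoted `{{ clone-step.directory }}` is read as a mapping inside a mapping,
# so the inner mapping (a dict in Python) would have to serve as a key.
unquoted_key = {"clone-step.directory": None}

# Quoted "{{ clone-step.directory }}" is an ordinary string value.
quoted_value = "{{ clone-step.directory }}"

try:
    check_mapping_key(unquoted_key)
    unquoted_ok = True
except TypeError:
    unquoted_ok = False

print("unquoted form usable as a key:", unquoted_ok)
print("quoted form is a plain string:", isinstance(quoted_value, str))
```

The quoted form never becomes a key at all, so the templating layer receives the string untouched and can substitute the directory later.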

Brennan Tolman

07/25/2023, 7:27 PM
@Marvin That fixes the first error mentioned above, but even though the pip install step does not result in an error, when running the flow I get the following import error:
ModuleNotFound: No module named ‘numpy’
Even though numpy is explicitly listed in the requirements file. What is going on here?

Marvin

07/25/2023, 7:28 PM
Ah, Python imports. The bane of many developers, human or otherwise. There could be a myriad of reasons for this; the vast universe that is Python environments can be quite unforgiving. Let's see if we can narrow it down.
1. Environment mismatch: The most common cause is a mismatch between the Python environment where your flow runs and the one where the package was installed. If `numpy` is installed in one environment but the flow runs in a different one, the import fails. If you're using virtual environments or Docker, the package needs to be installed in the same environment where the flow runs.
2. Installation failure: It's possible that the pip install step isn't completing successfully. Check the logs for your flow run for any errors raised while `numpy` was being installed. Setting `stream_output` to True on the pip install step should make pip's own output visible.
3. Path issues: Perhaps the Python executable running the flow doesn't have the correct `PYTHONPATH` to find installed packages. You can log `sys.path` from inside your flow to debug this.
4. Order of operations: Make sure the pip install step runs before the flow attempts to import `numpy`.
If none of these seem to be the culprit, well, remember that we live in a cold, uncaring universe where troubleshooting is just part of existence. Keep going, you'll figure it out eventually. Or not. Who can say really?
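To check the environment and path possibilities, a small diagnostic along these lines could be called from inside the flow. It is only a sketch: `describe_environment` is a hypothetical helper name, and it uses nothing beyond the standard library. It reports which interpreter is running, where that interpreter looks for packages, and whether it can locate a given top-level module.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and whether it can locate a module."""
    # find_spec returns None when a top-level module cannot be found.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "sys_path": list(sys.path),
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


report = describe_environment("numpy")
print("interpreter:", report["executable"])
print("numpy found:", report["module_found"])
print("numpy location:", report["module_location"])
```

If the interpreter printed here is not the one pip installed into, that points to the environment mismatch; if it is the same one and the module is still missing, the install itself most likely did not complete.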

Brennan Tolman

07/25/2023, 7:36 PM
@Marvin Looking closely, it looks like it was unable to find a package distribution for an internal Python package that is hosted on a private repository. How can I adjust the pip install step to look at that private repository?

Marvin

07/25/2023, 7:36 PM
Ah, the plot thickens. You're dealing with a private repository, are you? Indeed, `pip install` needs to be pointed to the correct repository to install packages from there. If your private repository is a PyPI-compatible index, you can specify its URL with pip's `--extra-index-url` option, which lets pip search both PyPI and your private repository for packages. As far as I know, the `pip_install_requirements` step only accepts `directory`, `requirements_file`, and `stream_output`, so the option cannot be passed as a step argument. One route is to put `--extra-index-url https://private-repo-url/simple` on its own line at the top of requirements.txt, which pip honors when it reads the file. Another is to swap the step for a `run_shell_script` step that calls pip directly, something like:
yaml
- prefect.deployments.steps.run_shell_script:
    directory: "{{ clone-step.directory }}"
    script: pip install -r requirements.txt --extra-index-url https://private-repo-url/simple
    stream_output: False
In the above step, replace `https://private-repo-url/simple` with the actual URL of your private repository. Remember to secure your private repository URL, especially if it embeds credentials. You might want to use Prefect Secrets to store the URL securely rather than hardcoding it in your YAML file. Isn't it fascinating how complexities pile up? It's like the universe itself, infinitely complex and ultimately unknowable. Or a YAML file, I suppose. It can be hard to tell the difference sometimes.