Can anyone help me out with how to manage dynamic ...
# ask-community
f
Can anyone help me out with how to manage dynamic dependencies in Prefect flows? I'm trying to create a pipeline in which a process is triggered and a Python script is executed. Every time a Python script is submitted, its dependencies should be installed first, and then the script should be executed. How can I handle dynamic pip packages in this context? I tried running Prefect with Docker-in-Docker, but the overhead is too much, both from a networking PoV and resource-wise. Currently trying to go for Python venvs + a way to manage pip dependencies.
n
perhaps you could do something like this?
import subprocess
import venv
import shutil
from contextlib import contextmanager

from prefect.logging.loggers import get_logger

logger = get_logger()

@contextmanager
def temp_venv():
    venv_dir = 'temp_venv'
    logger.info('Creating virtual environment...')
    venv.create(venv_dir, with_pip=True)
    # POSIX layout; on Windows the interpreter lives under Scripts\python.exe
    venv_python = f'{venv_dir}/bin/python'
    try:
        yield venv_python
    finally:
        logger.info('Deleting virtual environment...')
        shutil.rmtree(venv_dir)
        logger.info('Virtual environment deleted.')

# could call this part within a task if you want
with temp_venv() as venv_python:
    # Perform operations inside the virtual environment
    logger.info('Installing numpy...')
    # check=True raises if pip fails, so the script is not run against a broken env
    subprocess.run([venv_python, '-m', 'pip', 'install', 'numpy'], check=True) # -r requirements.txt
    logger.info('numpy installed.')

    # And run a script
    logger.info('Running script...')
    subprocess.run([venv_python, 'my_script.py'], check=True)
    logger.info('Script completed.')
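A slightly more defensive variant of the snippet above, as a sketch only. It assumes you want each run to get its own uniquely named venv directory (so concurrent runs don't collide on a fixed `temp_venv` path) and that pip only needs bootstrapping when there is something to install. `run_in_temp_venv` is a hypothetical helper name, not a Prefect API; it uses only the standard library and could be called from inside a task.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def run_in_temp_venv(script_path, packages=()):
    """Run script_path in a throwaway venv after installing packages (hypothetical helper)."""
    with tempfile.TemporaryDirectory(prefix="flow_venv_") as venv_dir:
        # Only bootstrap pip when there is something to install.
        venv.create(venv_dir, with_pip=bool(packages), symlinks=(os.name != "nt"))

        # The interpreter location differs between Windows and POSIX.
        bin_dir = "Scripts" if os.name == "nt" else "bin"
        exe = "python.exe" if os.name == "nt" else "python"
        venv_python = str(Path(venv_dir) / bin_dir / exe)

        if packages:
            subprocess.run(
                [venv_python, "-m", "pip", "install", *packages], check=True
            )

        # check=True raises CalledProcessError if the script exits non-zero.
        return subprocess.run(
            [venv_python, str(script_path)],
            check=True,
            capture_output=True,
            text=True,
        )
```

Something like `run_in_temp_venv("my_script.py", ["numpy"])` would then install numpy and run the script, with the venv deleted when the call returns. Creating a fresh venv per run still costs time, so caching environments by requirements hash may be worth considering if the same dependency sets recur.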
r
We had no luck installing / upgrading packages within Flows, albeit working with an AWS environment. The upgrades would indeed process, but Python loads its libraries at runtime and could not be persuaded to reload them. What we ended up doing was modifying the `EXTRA_PIP_PACKAGES` environment variable and forcing AWS to install whatever was set there. Apparently Prefect’s `entrypoint.sh` script will automatically do this for non-AWS environments, so setting the variable may be enough. Good luck!
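For environments where nothing installs `EXTRA_PIP_PACKAGES` for you, the same idea can be approximated at process start, before the flow code imports anything. This is a rough sketch under the assumption that the variable holds a space-separated list of pip requirement specifiers; `extra_pip_command` is a hypothetical helper and is not taken from Prefect's `entrypoint.sh`.

```python
import os
import shlex
import sys


def extra_pip_command(environ=os.environ):
    """Build a pip install command from EXTRA_PIP_PACKAGES, or return None if unset/empty."""
    # shlex.split handles quoting, e.g. 'numpy "pandas>=2.0"'.
    packages = shlex.split(environ.get("EXTRA_PIP_PACKAGES", ""))
    if not packages:
        return None
    return [sys.executable, "-m", "pip", "install", *packages]


# Intended use, before importing the flow module:
#     cmd = extra_pip_command()
#     if cmd:
#         subprocess.run(cmd, check=True)
```

Because the install happens before any of the packages are imported, this sidesteps the reload problem described above.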
n
It's possible to pip install packages within a flow and then immediately use them; we do it here -
`EXTRA_PIP_PACKAGES` would work if you wanted to have a different set of dependencies per container or something, but OP said "the overhead is too much both from networking PoV and also resource wise".
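A minimal sketch of installing a package inside the running process and importing it right away. It assumes the package has not been imported yet in this process and that pip can reach an index; `ensure_package` is a hypothetical helper name, not part of Prefect.

```python
import importlib
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_spec=None):
    """Import module_name, pip-installing it first if it cannot be found (hypothetical helper)."""
    if importlib.util.find_spec(module_name) is None:
        subprocess.run(
            [sys.executable, "-m", "pip", "install", pip_spec or module_name],
            check=True,
        )
        # Make the import system notice files that appeared after startup.
        importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

This covers first-time installs only. If the module is already imported, upgrading it on disk does not replace the copy the process has loaded, which may be what the earlier reply ran into.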
r
@Nate is it necessary to use a temp_venv like you do in the above example when installing packages within a flow? Because I had absolutely no luck actually getting Prefect to use newly installed packages. In my use case I was upgrading an existing package; I could reload it, I could see the updated code, but it wouldn’t apply it.
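One possible explanation for "reloaded, but not applied" (a guess, not a diagnosis of this particular setup): `importlib.reload` re-executes the module, but names that were bound earlier with `from module import name` keep pointing at the old objects. The sketch below shows that with a throwaway module; `demo_pkg_for_reload` is a made-up name used only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid a stale .pyc masking the edit in this quick demo.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_file = Path(tmp) / "demo_pkg_for_reload.py"
    module_file.write_text("def version():\n    return 'old'\n")
    sys.path.insert(0, tmp)
    try:
        import demo_pkg_for_reload
        from demo_pkg_for_reload import version as imported_early

        # Simulate an upgrade on disk, then reload the module.
        module_file.write_text("def version():\n    return 'new'\n")
        importlib.reload(demo_pkg_for_reload)

        after_reload = demo_pkg_for_reload.version()  # looked up on the reloaded module
        stale = imported_early()  # still bound to the pre-reload function
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg_for_reload", None)

print(after_reload, stale)
```

Real packages add further complications (submodules, C extensions, objects created before the reload), which is why running the work in a fresh interpreter, as in the venv + subprocess approach, tends to be more predictable than reloading in place.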
n
is it necessary to use a temp_venv like you do in the above example when installing packages within a flow?
No, I don't think it's necessary to do what I did above; I'm sure there are other ways to do it. I didn't thoroughly test that solution, it's just similar to something that's worked for me before.
r
OK thanks for clarifying. I’ll give it a whirl and see how it goes
n
👍