While registering the flow it is not storing the c...
# prefect-community
r
When I register the flow, the config files, which live in a different directory, are not stored with it, but some modules need them. How can I solve this?
main-project
  | 
  src
     |
     | __init__.py
     | module1.py
     | module2.py
  |
  configs
     | 
     | config.yaml
  |
  flows
     |
     | sample_flow.py
a
Can you move code blocks to the thread?
r
Getting this error
{"message": "Task 'configure_paths_task': Starting task run...", "levelname": "INFO", "filename": "task_runner.py", "module": "task_runner", "lineno": 241, "funcName": "run", "flow_name": "Clusters Creation Flow", "flow_run_id": "9508bc8f-454e-4b62-937b-88e30df3f03b", "task_name": "configure_paths_task", "task_slug": "configure_paths_task-1", "task_run_id": "8cca14d2-f97e-4594-a617-aa567e363fbf", "timestamp": "2022-05-05T10:50:21.913671+00:00"}
{"message": "Task 'configure_paths_task': Exception encountered during task execution!", "exc_info": "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.8/site-packages/prefect/engine/task_runner.py\", line 880, in get_task_run_state\n    value = prefect.utilities.executors.run_task_with_timeout(\n  File \"/usr/local/lib/python3.8/site-packages/prefect/utilities/executors.py\", line 468, in run_task_with_timeout\n    return task.run(*args, **kwargs)  # type: ignore\n  File \"/__w/python-monorepo/python-monorepo/taxonomy/flows/clusters.py\", line 25, in configure_paths_task\n  File \"/usr/local/lib/python3.8/site-packages/hydra/compose.py\", line 33, in compose\n    cfg = gh.hydra.compose_config(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/hydra.py\", line 559, in compose_config\n    cfg = self.config_loader.load_configuration(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 141, in load_configuration\n    return self._load_configuration_impl(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 227, in _load_configuration_impl\n    self.ensure_main_config_source_available()\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 129, in ensure_main_config_source_available\n    self._missing_config_error(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 103, in _missing_config_error\n    raise MissingConfigException(\nhydra.errors.MissingConfigException: Primary config directory not found.\nCheck that the config directory '/__w/python-monorepo/python-monorepo/taxonomy/config' exists and readable", "levelname": "ERROR", "filename": "task_runner.py", "module": "task_runner", "lineno": 910, "funcName": "get_task_run_state", "flow_name": "Clusters Creation Flow", "flow_run_id": "9508bc8f-454e-4b62-937b-88e30df3f03b", "task_name": "configure_paths_task", 
"task_slug": "configure_paths_task-1", "task_run_id": "8cca14d2-f97e-4594-a617-aa567e363fbf", "timestamp": "2022-05-05T10:50:22.295499+00:00"}
a
I thought you would come back, because the solution you posted in the other thread doesn't solve the problem of packaging flow code; it only sets credentials. I can only give the same answer as here: you need to package this code, including those config files, into your Docker image. You would need:
COPY configs/ /opt/prefect/configs/
r
@Anna Geller what storage should I use here? I am trying to print the flow file path when running on the agent.
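import os

import prefect
from prefect import task


@task
def print_cwd():
    logger = prefect.context.get("logger")
    # Working directory of the process that runs the task
    logger.info(os.getcwd())
    print(os.getcwd())

    # Directory recorded in the flow file's __file__
    logger.info(os.path.dirname(__file__))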
It prints the os.getcwd() value as /app, but somehow os.path.dirname(__file__) prints the path of the file on my local system. So although the configs folder is present at /app/configs, the flow cannot read it, because it looks in <localsystem_path>/configs instead.
{"message": "/app", "levelname": "INFO", "filename": "clusters.py", "module": "clusters", "lineno": 44, "funcName": "print_cwd", "flow_name": "Clusters Creation Flow", "flow_run_id": "493988b6-a7f0-4b81-a0c5-9332d623c4ca", "task_name": "print_cwd", "task_slug": "print_cwd-1", "task_run_id": "98a63ed2-2153-44e0-a7c4-86b27eafb7ed", "timestamp": "2022-05-05T16:53:28.056764+00:00"}
{"message": "/Users/raviraja/Desktop/python-monorepo/taxonomy/flows", "levelname": "INFO", "filename": "clusters.py", "module": "clusters", "lineno": 47, "funcName": "print_cwd", "flow_name": "Clusters Creation Flow", "flow_run_id": "493988b6-a7f0-4b81-a0c5-9332d623c4ca", "task_name": "print_cwd", "task_slug": "print_cwd-1", "task_run_id": "98a63ed2-2153-44e0-a7c4-86b27eafb7ed", "timestamp": "2022-05-05T16:53:28.057057+00:00"}
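A likely explanation for the mismatch, assuming the flow was registered with pickle-based storage: `__file__` is an ordinary module global, and a function serialized by value carries the value that global had on the registering machine. The sketch below imitates this with the standard library only. The path and the `flow_dir` name are made up for illustration, and this is not Prefect's actual serialization code.

```python
import os
import tempfile

# Hypothetical path standing in for the machine where the flow was registered.
REGISTRATION_FILE = "/Users/someone/project/flows/clusters.py"

source = (
    "def flow_dir():\n"
    "    import os\n"
    "    return os.path.dirname(__file__)\n"
)

# __file__ is looked up at call time in the globals the function was defined with,
# so whatever value travels with the function is what it reports.
module_globals = {"__file__": REGISTRATION_FILE}
exec(compile(source, REGISTRATION_FILE, "exec"), module_globals)
flow_dir = module_globals["flow_dir"]

# Simulate running somewhere else by changing the working directory.
os.chdir(tempfile.gettempdir())

print(os.getcwd())  # the runtime working directory
print(flow_dir())   # still the registration-time directory
```

If this is what is happening, the working directory reflects the container, while `__file__` reflects wherever the flow was built.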
a
You could actually set this in your Dockerfile using the WORKDIR instruction. The default working directory is /opt/prefect if you use the base Prefect image. What base image do you use? /app is usually used with Python base images.
r
FROM prefecthq/prefect:latest-python3.8
RUN apt-get update && apt-get install -y git && apt-get -y install gcc && apt-get install wget -y
COPY ./ /app
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt --no-cache-dir
RUN pip install .
a
Thanks. In that case, to add the configs you can do that explicitly using:
COPY configs/ /app/configs/
r
This line already does that: COPY ./ /app
I added a task to print the items of dir /app
['setup.py', 'config', 'mlinfra', 'tests', 'lambdas', 'notebooks', 'flows', 'iolib', 'dataprocessing', 'gpt', 'dockerfile', 'README.md', 'requirements.txt', 'taxonomy']
Somehow the flow is printing the path as "/Users/raviraja/Desktop/python-monorepo/taxonomy/flows" for this log statement: logger.info(os.path.dirname(__file__)).
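One way to sidestep the stale `__file__` value is to resolve the config directory from the runtime working directory, or from an environment variable, instead. This is a minimal sketch under those assumptions: `resolve_config_dir` and the `FLOW_CONFIG_DIR` variable are hypothetical names, not part of Prefect or Hydra.

```python
import os
from pathlib import Path


def resolve_config_dir(env_var: str = "FLOW_CONFIG_DIR", subdir: str = "configs") -> Path:
    """Return an absolute config directory that does not depend on __file__.

    FLOW_CONFIG_DIR is a hypothetical variable name; it could be set in the
    Dockerfile or in the run configuration.
    """
    override = os.environ.get(env_var)
    # Prefer the explicit override; otherwise fall back to <cwd>/<subdir>.
    candidate = Path(override) if override else Path.cwd() / subdir
    candidate = candidate.resolve()
    if not candidate.is_dir():
        raise FileNotFoundError(f"Config directory not found: {candidate}")
    return candidate
```

With WORKDIR /app and the directory copied into the image, calling this inside a task should return the in-container path, which can then be handed to whatever loads the config (for Hydra, presumably a call that accepts an absolute directory such as `initialize_config_dir`).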
a
probably an easier way of testing this would be to exec into the container and browse the file structure this way
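If exec-ing into the container is inconvenient, roughly the same inspection can be done from inside a task. The helper below is a sketch using only the standard library; `list_tree` is a hypothetical name, and its output would still need to be logged from the task.

```python
import os


def list_tree(root: str, max_depth: int = 2) -> list:
    """Return relative paths under root, descending at most max_depth levels.

    Directories are reported with a trailing path separator.
    """
    root = os.path.abspath(root)
    found = []
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        for name in dirs:
            found.append(os.path.normpath(os.path.join(rel, name)) + os.sep)
        for name in files:
            found.append(os.path.normpath(os.path.join(rel, name)))
        if depth + 1 >= max_depth:
            dirs[:] = []  # do not descend below the requested depth
    return sorted(found)
```

For example, `list_tree("/app")` inside the running flow would show whether a `configs` or `config` directory actually exists there and what it contains.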
r
The container does not contain the flow code, right? It should be fetched from storage dynamically whenever a flow is scheduled, and the container dies once the flow is done, right?
a
it depends on how you build the image - usually, no, your image doesn't need to contain your flow, this is what Storage is for
and yes, when the flow run finishes, the container finishes too