Raviraja Ganta

    4 months ago
    When I register the flow, it does not store the config files, which live in a different directory, but some modules need them. How can I solve this?
    main-project
      | 
      src
         |
         | __init__.py
         | module1.py
         | module2.py
      |
      configs
         | 
         | config.yaml
      |
      flows
         |
         | sample_flow.py
    Anna Geller

    4 months ago
    Can you move code blocks to the thread?
    Raviraja Ganta

    4 months ago
    Getting this error
    {"message": "Task 'configure_paths_task': Starting task run...", "levelname": "INFO", "filename": "task_runner.py", "module": "task_runner", "lineno": 241, "funcName": "run", "flow_name": "Clusters Creation Flow", "flow_run_id": "9508bc8f-454e-4b62-937b-88e30df3f03b", "task_name": "configure_paths_task", "task_slug": "configure_paths_task-1", "task_run_id": "8cca14d2-f97e-4594-a617-aa567e363fbf", "timestamp": "2022-05-05T10:50:21.913671+00:00"}
    {"message": "Task 'configure_paths_task': Exception encountered during task execution!", "exc_info": "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.8/site-packages/prefect/engine/task_runner.py\", line 880, in get_task_run_state\n    value = prefect.utilities.executors.run_task_with_timeout(\n  File \"/usr/local/lib/python3.8/site-packages/prefect/utilities/executors.py\", line 468, in run_task_with_timeout\n    return task.run(*args, **kwargs)  # type: ignore\n  File \"/__w/python-monorepo/python-monorepo/taxonomy/flows/clusters.py\", line 25, in configure_paths_task\n  File \"/usr/local/lib/python3.8/site-packages/hydra/compose.py\", line 33, in compose\n    cfg = gh.hydra.compose_config(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/hydra.py\", line 559, in compose_config\n    cfg = self.config_loader.load_configuration(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 141, in load_configuration\n    return self._load_configuration_impl(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 227, in _load_configuration_impl\n    self.ensure_main_config_source_available()\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 129, in ensure_main_config_source_available\n    self._missing_config_error(\n  File \"/usr/local/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py\", line 103, in _missing_config_error\n    raise MissingConfigException(\nhydra.errors.MissingConfigException: Primary config directory not found.\nCheck that the config directory '/__w/python-monorepo/python-monorepo/taxonomy/config' exists and readable", "levelname": "ERROR", "filename": "task_runner.py", "module": "task_runner", "lineno": 910, "funcName": "get_task_run_state", "flow_name": "Clusters Creation Flow", "flow_run_id": "9508bc8f-454e-4b62-937b-88e30df3f03b", "task_name": "configure_paths_task", 
"task_slug": "configure_paths_task-1", "task_run_id": "8cca14d2-f97e-4594-a617-aa567e363fbf", "timestamp": "2022-05-05T10:50:22.295499+00:00"}
    Anna Geller

    4 months ago
    I thought you would come back, because the solution you posted in the other thread doesn't solve the problem of packaging flow code; it only sets credentials. I can only give the same answer as here: you need to package this code, including those config files, into your Docker image. You would need:
    COPY configs/ /opt/prefect/configs/
    Raviraja Ganta

    4 months ago
    @Anna Geller what storage should I use here? I am trying to print the flow file path when running on the agent.
    import os

    import prefect
    from prefect import task


    @task
    def print_cwd():
        logger = prefect.context.get("logger")
        logger.info(os.getcwd())
        print(os.getcwd())
        logger.info(os.path.dirname(__file__))
    It prints the os.getcwd() value as /app. But somehow it prints os.path.dirname(__file__) as the path of the file on my local system. So although the configs folder is present in /app/configs, the flow is not able to read it and looks at <localsystem_path>/configs instead.
    {"message": "/app", "levelname": "INFO", "filename": "clusters.py", "module": "clusters", "lineno": 44, "funcName": "print_cwd", "flow_name": "Clusters Creation Flow", "flow_run_id": "493988b6-a7f0-4b81-a0c5-9332d623c4ca", "task_name": "print_cwd", "task_slug": "print_cwd-1", "task_run_id": "98a63ed2-2153-44e0-a7c4-86b27eafb7ed", "timestamp": "2022-05-05T16:53:28.056764+00:00"}
    {"message": "/Users/raviraja/Desktop/python-monorepo/taxonomy/flows", "levelname": "INFO", "filename": "clusters.py", "module": "clusters", "lineno": 47, "funcName": "print_cwd", "flow_name": "Clusters Creation Flow", "flow_run_id": "493988b6-a7f0-4b81-a0c5-9332d623c4ca", "task_name": "print_cwd", "task_slug": "print_cwd-1", "task_run_id": "98a63ed2-2153-44e0-a7c4-86b27eafb7ed", "timestamp": "2022-05-05T16:53:28.057057+00:00"}
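    Since os.getcwd() reports /app inside the container while the path derived from __file__ still points at the machine that registered the flow, one possible workaround is to resolve the config directory from the working directory (or an environment variable) instead of from __file__. The following is only a minimal sketch under that assumption: the helper name resolve_config_dir, the environment variable APP_CONFIG_DIR, and the default subdirectory name are hypothetical and do not come from this thread.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_config_dir(
    env_var: str = "APP_CONFIG_DIR",  # hypothetical variable name
    subdir: str = "configs",          # hypothetical default folder name
    base: Optional[Path] = None,
) -> Path:
    """Return the config directory to use at run time.

    An explicit override from the environment wins; otherwise the
    directory is <base>/<subdir>, where base defaults to the current
    working directory (for example /app inside the container).
    """
    override = os.environ.get(env_var)
    if override:
        config_dir = Path(override)
    else:
        config_dir = (base or Path.cwd()) / subdir
    # Fail early with a clear message instead of deep inside a config loader.
    if not config_dir.is_dir():
        raise FileNotFoundError(f"Config directory not found: {config_dir}")
    return config_dir.resolve()
```

    The returned absolute path could then be handed to whatever loads the configuration, so the lookup no longer depends on where the flow file lived at registration time.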
    Anna Geller

    4 months ago
    You could actually set this in your Dockerfile using the WORKDIR command. The default working directory is /opt/prefect if you use the base Prefect image. What base image do you use? /app is usually the working directory in Python base images.
    Raviraja Ganta

    4 months ago
    FROM prefecthq/prefect:latest-python3.8
    RUN apt-get update && apt-get install -y git && apt-get -y install gcc && apt-get install wget -y
    COPY ./ /app
    WORKDIR /app
    COPY requirements.txt requirements.txt
    RUN pip install -r requirements.txt --no-cache-dir
    RUN pip install .
    Anna Geller

    4 months ago
    Thanks. In that case, you can add the configs explicitly using:
    COPY configs/ /app/configs/
    Raviraja Ganta

    4 months ago
    This line already does that:
    COPY ./ /app
    I added a task to print the contents of the /app directory:
    ['setup.py', 'config', 'mlinfra', 'tests', 'lambdas', 'notebooks', 'flows', 'iolib', 'dataprocessing', 'gpt', 'dockerfile', 'README.md', 'requirements.txt', 'taxonomy']
    Somehow the flow prints the path as
    "/Users/raviraja/Desktop/python-monorepo/taxonomy/flows"
    for this log statement:
    logger.info(os.path.dirname(__file__))
    Anna Geller

    4 months ago
    Probably an easier way to test this would be to exec into the container and browse the file structure.
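    If exec-ing into the container is not convenient, a similar check can be done from inside a task by logging a shallow listing of the image's file tree. This is just a sketch using the standard library; the helper name list_tree and its max_depth parameter are hypothetical.

```python
import os
from typing import List


def list_tree(root: str, max_depth: int = 2) -> List[str]:
    """Return sorted paths under root (relative to root), max_depth levels deep."""
    root = os.path.abspath(root)
    entries: List[str] = []
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        # Depth of the directory being visited: 0 for root itself.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        for name in dirs + files:
            entries.append(os.path.normpath(os.path.join(rel, name)))
        if depth + 1 >= max_depth:
            dirs[:] = []  # prune in place so os.walk does not descend further
    return sorted(entries)
```

    Inside a task, something like logger.info(list_tree("/app")) would show whether the config folder is present in the image and under which name.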
    Raviraja Ganta

    4 months ago
    The container does not contain the flow code, right? It should be fetched from storage dynamically whenever a flow is scheduled, and the container dies once the flow is done, right?
    Anna Geller

    4 months ago
    it depends on how you build the image - usually, no, your image doesn't need to contain your flow, this is what Storage is for
    and yes, when the flow run finishes, the container finishes too