i

Idan

01/10/2023, 3:50 PM
We ran into a weird edge case and I'm wondering what the Prefect way around it would be 🙂 We have a mounted drive that only legitimate users can write to, so in one flow, one task calls
os.seteuid(priv_user_id)
(and then later resets to
original_user_id
). This is run in a container, so the
original_user_id
is
root
, which means
PREFECT_HOME
is
/root/.prefect/
. Our tasks are long-running, so we also cache them. Then, every now and then, a task finishes its work but the run still fails with:
Crash detected! Execution was interrupted by an unexpected exception: Traceback (most recent call last):
  File "/usr/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/root/.prefect/storage'

During handling of the above exception, another exception occurred:

PermissionError: [Errno 13] Permission denied: '/root/.prefect/storage'
Any smart ideas how to facilitate both needs?
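For readers following along, the pattern described in this message can be sketched as a small context manager. `temporary_euid` is a hypothetical helper name, not something from the thread or from Prefect. The point it illustrates: the effective UID applies to the whole process, so anything else the process does while the block is active (for example writing cached results under `PREFECT_HOME`) also runs as that user.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_euid(uid: int) -> Iterator[None]:
    """Switch the effective UID for the duration of the block, then restore it.

    Hypothetical helper for illustration; POSIX only.
    """
    original = os.geteuid()
    os.seteuid(uid)
    try:
        yield
    finally:
        # Always restore the original effective UID, even if the body raises.
        os.seteuid(original)


# Usage: switching to the current effective UID needs no privileges, so this
# runs anywhere; in the scenario above the argument would be priv_user_id.
with temporary_euid(os.geteuid()):
    uid_inside = os.geteuid()
```

Keeping the block as short as possible narrows the window in which other writes can happen under the wrong user, but it does not remove it.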
z

Zanie

01/10/2023, 5:57 PM
Hm, sounds like you’re failing when writing results. Can you change that directory to somewhere else?
i

Idan

01/10/2023, 7:27 PM
I'd like to change that in the container deployment, but I don't see how the SDK allows passing a different settings context or PREFECT_HOME?
Also, wouldn't it then crash for all tasks? Some tasks succeed, so it feels like some race condition 🤔
z

Zanie

01/10/2023, 7:30 PM
You can change the settings in the container with environment variables
I’m not sure what the race condition is here without more details; it seems dangerous to change the UIDs during the flow.
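A minimal sketch of what "change the settings with environment variables" could look like. It assumes `PREFECT_HOME` and `PREFECT_LOCAL_STORAGE_PATH` are the relevant Prefect 2 settings (worth checking against the settings reference for your version); the helper name `prefect_env_for` and the `/data/prefect` path are made up for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict


def prefect_env_for(base_dir: str) -> Dict[str, str]:
    """Build environment variables that move Prefect's home directory and
    local storage under base_dir (hypothetical helper)."""
    home = PurePosixPath(base_dir) / ".prefect"
    return {
        "PREFECT_HOME": str(home),
        "PREFECT_LOCAL_STORAGE_PATH": str(home / "storage"),
    }


# Hypothetical location that every relevant user can write to.
env = prefect_env_for("/data/prefect")
print(env)
```

The resulting dict can be handed to whatever launches the container. The directory itself still has to be writable by every UID the flow runs under.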
i

Idan

01/10/2023, 7:38 PM
Hm, correct me if I'm wrong, but will changing the environment variable in the dockerized agent also affect any dockerized flows it spawns?
z

Zanie

01/10/2023, 7:39 PM
I don’t think we pass through the current environment to child containers, just processes
We do however copy settings through, so I think that would work
i

Idan

01/10/2023, 7:40 PM
Hm. And then still the dockerized flow starts as root, creates that folder, and thus has permissions locked to root I believe 🤔
z

Zanie

01/10/2023, 7:41 PM
Also yeah I’m not sure if I answered this clearly but
DockerContainer(env={"A_SETTING": "value"})
will override whatever the agent has set.
It is possible that the
mkdir
calls are happening concurrently and that there’s a race condition in the Python stdlib with
exist_ok
? It’s weird that it’d manifest as a permission error though.
i

Idan

01/10/2023, 7:45 PM
I guess if you can't
os.access
then you can't verify the directory exists. Wonder if I could just start specific tasks/flows with a specific user ID 🤔
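The permission error does not need a race to appear. The sketch below is an illustration only, not Prefect's code: it calls `Path.mkdir(parents=True, exist_ok=True)` on a directory that already exists under a parent the current user may not traverse. For an unprivileged user the underlying `mkdir` call is refused before the "already exists" case can be recognised, so `exist_ok` does not help. On Python 3.10, the version in the traceback above, the follow-up `is_dir()` check fails the same way, which would match the "During handling of the above exception" pair. The function name is hypothetical.

```python
import stat
import tempfile
from pathlib import Path


def mkdir_under_locked_parent() -> str:
    """Call mkdir(exist_ok=True) on an existing directory whose parent
    has had all permission bits removed, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        parent = Path(tmp) / "home"
        target = parent / "storage"
        target.mkdir(parents=True)   # the directory exists from here on
        parent.chmod(0)              # nobody may traverse the parent now
        try:
            target.mkdir(parents=True, exist_ok=True)
            return "ok"
        except PermissionError:
            return "PermissionError"
        finally:
            # Restore permissions so the temporary directory can be removed.
            parent.chmod(stat.S_IRWXU)


outcome = mkdir_under_locked_parent()
print(outcome)
```

Run as root this typically reports `ok`, because root bypasses the permission bits; run as an ordinary user it typically reports `PermissionError`. That split would be consistent with the failure only showing up when a result is written while the effective UID is switched away from root, though the thread does not confirm that this is the cause.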
z

Zanie

01/10/2023, 7:46 PM
It seems like you can control the flow with
user
from https://docker-py.readthedocs.io/en/stable/containers.html
(I think we’d need to add support to our
DockerContainer
abstraction)
Changing the UID per task seems challenging since we’re not doing multiprocessing
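For reference, a rough sketch of the docker-py route mentioned above. `containers.run` accepts a `user` keyword (a name or a UID); the helper names, the image, and the command are placeholders. The Docker call itself is wrapped in a function and imported lazily, since it needs docker-py installed and a running daemon.

```python
from typing import Any, Dict, List, Optional, Union


def container_run_kwargs(
    image: str,
    command: List[str],
    user: Optional[Union[str, int]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for docker-py's containers.run().

    "user" is only included when it is set, so an unset value never
    overrides the image's default user.
    """
    kwargs: Dict[str, Any] = {"image": image, "command": command, "detach": True}
    if user is not None:
        kwargs["user"] = user
    return kwargs


def run_as_user(image: str, command: List[str], user: Union[str, int]):
    """Start a container as the given user (requires a Docker daemon)."""
    import docker  # docker-py, imported lazily so the helper above stays usable

    client = docker.from_env()
    return client.containers.run(**container_run_kwargs(image, command, user))


# Placeholder image and UID, for illustration only.
kwargs = container_run_kwargs("python:3.10", ["id", "-u"], user=1000)
print(kwargs)
```

Starting the whole container as the target user avoids switching UIDs mid-process, at the cost of every task in that flow run sharing the same user.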
i

Idan

01/10/2023, 7:48 PM
That would be great :D
I was not a fan of changing the UID either; we expected a classic YMMV, and before we added caching for each task, it worked nicely.
@Zanie if I'm reading the code correctly, it should be enough to add the
user
field to
DockerContainer
and then to
_build_container_settings
?
For example, I added this snippet and used it in a deployment:
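from typing import Dict, Optional, Union

from prefect.infrastructure import DockerContainer
from pydantic import Field


class DockerContainerWithUser(DockerContainer):
    # Extra field: the user the container process should run as.
    user: Optional[Union[str, int]] = Field(
        default=None,
        description='The username or user ID to use in the docker',
    )

    def _build_container_settings(
        self,
        docker_client: "DockerClient",
    ) -> Dict:
        # Reuse the parent's settings and add the "user" entry on top.
        return {**super()._build_container_settings(docker_client=docker_client), "user": self.user}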
But printing
os.getuid()
at the beginning of the flow still prints
0
🤔
z

Zanie

01/11/2023, 3:42 PM
Hm, it seems feasible that your class was not used. You might need to define a unique
type
string for your subclass (I have not played around with extending in that way yet).
Although perhaps that happens automatically these days. The mechanics of blocks are changing. Note you can also just do
DockerContainer.run(command=[…])
for faster testing and that would avoid any confusion about whether your class is used.
i

Idan

01/11/2023, 5:55 PM
My guess is the contents of the deployment are only used to push to the server, but the agent does not see that?🤔
Some progress was made by indeed adding the unique
type
as suggested 👍 Now the agent complains (successfully?)
/usr/local/lib/python3.10/site-packages/prefect/agent.py:353: UserWarning: Block document has schema checksum sha256:b3f4df5ac83c5075a8d4b2ee8078c56c429e90af94e4cb3befd0bf6795a7176f which does not match the schema checksum for class 'DockerContainer'. This indicates the schema has changed and this block may not load.
  infrastructure_block = Block._from_block_document(infra_document)
08:48:51.121 | ERROR   | prefect.agent - Failed to get infrastructure for flow run 'f01c0ecc-8dda-4392-bfdf-5d7f9f5cf9e3'.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 372, in submit_run
    infrastructure = await self.get_infrastructure(flow_run)
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 353, in get_infrastructure
    infrastructure_block = Block._from_block_document(infra_document)
  File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 554, in _from_block_document
    block = block_cls.parse_obj(block_document.data)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 175, in __init__
    super().__init__(*args, **kwargs)
  File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DockerContainer
type
  unexpected value; permitted: 'docker-container' (type=value_error.const; given=docker-container-w-user; permitted=('docker-container',))
08:52:34.454 | INFO    | prefect.agent - Submitting flow run 'e4e9752b-f106-4548-b4e1-7b9ef095f1bd'
08:52:34.699 | ERROR   | prefect.agent - Failed to get infrastructure for flow run 'e4e9752b-f106-4548-b4e1-7b9ef095f1bd'.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 372, in submit_run
    infrastructure = await self.get_infrastructure(flow_run)
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 353, in get_infrastructure
    infrastructure_block = Block._from_block_document(infra_document)
  File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 554, in _from_block_document
    block = block_cls.parse_obj(block_document.data)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "/usr/local/lib/python3.10/site-packages/prefect/blocks/core.py", line 175, in __init__
    super().__init__(*args, **kwargs)
  File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DockerContainer
type
  unexpected value; permitted: 'docker-container' (type=value_error.const; given=docker-container-w-user; permitted=('docker-container',))
(Sorry for spam, mostly writing this in case someone in the future runs into similar issues) It seems the Docker Container block supports the
user
argument natively, just the Python SDK does not 🤔
z

Zanie

01/12/2023, 5:47 PM
That error looks like your custom type was not registered or you didn’t update your deployment after adding the type variable? Not entirely sure. You can add registration of your custom type by adding its module to https://docs.prefect.io/api-ref/prefect/settings/#prefect.settings.PREFECT_EXTRA_ENTRYPOINTS
i

Idan

01/13/2023, 9:29 AM
Interesting; I could not find much documentation on extending and/or adding custom types/modules to prefect
I found how to register the type with
from prefect.utilities.dispatch import register_type
, but I imagine that has no effect on the agent itself?
Made a small PR to add this 🤷‍♂️
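On the question of whether registering a type affects the agent: a type registry of this kind normally lives in the memory of the process that imported the registering module. The sketch below is a simplified stand-in, not Prefect's actual dispatch code, and every name in it is hypothetical. It shows that a lookup by type key only succeeds in a process where the subclass has been registered, which is why the agent would also need to import the module (the thread suggests `PREFECT_EXTRA_ENTRYPOINTS` for that).

```python
from typing import Dict, Type


class Infrastructure:
    """Stand-in base class; type_key plays the role of the block's type string."""

    type_key = "infrastructure"


# Per-process registry: it is filled only by code this process has executed.
_REGISTRY: Dict[str, Type[Infrastructure]] = {}


def register(cls: Type[Infrastructure]) -> Type[Infrastructure]:
    """Record cls under its type_key and return it (usable as a decorator)."""
    _REGISTRY[cls.type_key] = cls
    return cls


def lookup(type_key: str) -> Type[Infrastructure]:
    """Return the class registered for type_key, or raise LookupError."""
    try:
        return _REGISTRY[type_key]
    except KeyError:
        raise LookupError(f"no class registered for type {type_key!r}") from None


@register
class Container(Infrastructure):
    type_key = "docker-container"


def is_known(type_key: str) -> bool:
    """Report whether a lookup for type_key would succeed in this process."""
    try:
        lookup(type_key)
        return True
    except LookupError:
        return False


# A process that never imported the custom subclass cannot resolve its key.
known_before = is_known("docker-container-w-user")


@register
class ContainerWithUser(Container):
    type_key = "docker-container-w-user"


known_after = is_known("docker-container-w-user")
print(known_before, known_after)
```

Under that reading, registering the type in the deployment script alone would not help a separately running agent, since the agent has its own registry.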