Joshua Grant

01/30/2023, 11:48 PM
Hello! I'm having an issue with prefect==2.7.10 using ECSTask from prefect-aws==0.2.4, in which the flow is baked into the Docker image but prefect copies over the code, which results in a shutil.Error. Before upgrading, we used the GitHub storage block in the call to Deployment.build_from_flow(), as we encountered this error when using the path parameter with prefect==2.6.9. Details in thread 🧵.
Error:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/opt/bitnami/python/lib/python3.9/site-packages/prefect/engine.py", line 269, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/opt/bitnami/python/lib/python3.9/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/opt/bitnami/python/lib/python3.9/site-packages/prefect/deployments.py", line 175, in load_flow_from_flow_run
    await storage_block.get_directory(from_path=deployment.path, local_path=".")
  File "/opt/bitnami/python/lib/python3.9/site-packages/prefect/filesystems.py", line 942, in get_directory
    copytree(
  File "/opt/bitnami/python/lib/python3.9/shutil.py", line 566, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "/opt/bitnami/python/lib/python3.9/shutil.py", line 522, in _copytree
    raise Error(errors)
Relevant deployment code:
from os import getenv

from pydantic import SecretStr
from prefect.deployments import Deployment
from prefect.filesystems import GitHub
from prefect_aws.ecs import AwsCredentials, ECSTask
from pyaml_env import parse_config

from flow_file import my_flow

# NOTE: docker_image is defined elsewhere (not shown in this snippet).
aws_credentials_block = AwsCredentials()
github_block = GitHub(
    repository="url_to_repository",
    reference=getenv('GIT_TAG'),
    access_token=SecretStr(getenv('GIT_API_KEY')),
)
# Task definition with the sidecar containers used for log routing.
task_def = parse_config('task_definition.yaml')
ecs_task_fg = ECSTask(
    aws_credentials=aws_credentials_block,
    image=docker_image,
    cpu=4096,
    memory=8192,
    stream_output=True,
    configure_cloudwatch_logs=False,
    cluster=getenv('ECS_CLUSTER'),
    execution_role_arn=getenv('FLOW_EXECUTION_ROLE'),
    task_role_arn=getenv('FLOW_TASK_ROLE'),
    vpc_id=getenv('VPC_ID'),
    task_definition=task_def,
)
deployment_fg = Deployment.build_from_flow(
    flow=my_flow,
    name=f'{getenv("IMAGE_NAME")}-fg',
    version='1',
    work_queue_name=getenv('IMAGE_NAME'),
    infrastructure=ecs_task_fg,
    storage=github_block,
)

# Save the blocks, then register the deployment.
aws_credentials_block.save('aws-creds', overwrite=True)
ecs_task_fg.save(f'{getenv("IMAGE_NAME")}-ecs-task-fg', overwrite=True)
deployment_fg.apply()
Ideally, I would like to use the flow baked into the Docker image without having to worry about the GitHub block. However, since upgrading, both the GitHub storage block and specifying path='/app/' result in the copy error.
Prior to upgrading, the storage=github_block method was working, while specifying path='/app/' resulted in the copy error; now both methods result in the copy error.
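For readers following the traceback, here is a minimal, stdlib-only sketch of one way shutil.copytree can end in shutil.Error: copying a directory tree onto itself. That Prefect's get_directory() hits exactly this case when path points at the image's working directory is an assumption; the thread does not confirm the root cause. The directory and file names below are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_onto_itself():
    """Copy a directory tree onto itself and return the exception raised, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the image's working directory.
        app_dir = Path(tmp) / "app"
        app_dir.mkdir()
        (app_dir / "flow_file.py").write_text("print('hello')\n")
        try:
            # Source and destination are the same directory, so each file
            # copy fails and copytree reports the failures as shutil.Error.
            shutil.copytree(app_dir, app_dir, dirs_exist_ok=True)
        except shutil.Error as exc:
            return exc
    return None


error = copy_onto_itself()
print(type(error).__name__)
```

Other causes produce the same exception type, for example a destination file that cannot be overwritten, so the full error list attached to the exception is worth inspecting.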
task_definition.yaml specifies sidecar containers to load with the task for routing logs to DataDog. It uses the awsfirelens logDriver, which incorrectly reports:
23:35:26.125 | WARNING | prefect.infrastructure.ecs-task - ECSTask '588acc00-5ee0-4ce8-b65d-1086106f49ec': Logging configuration uses unsupported  driver {container_def['logConfiguration'].get('logDriver')!r}. Output cannot be streamed.
but that is a separate issue and is only a warning.
The GitHub storage solution works if path=None is specified when calling Deployment.build_from_flow(). However, I would still like to be able to use the flow baked into the Docker image.
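A minimal sketch of the path=None workaround described above, assuming the same flow, infrastructure block, and storage block as in the deployment code. The helper name deployment_kwargs and the "my-image" fallback are hypothetical; the only change from the original call is the explicit path=None.

```python
from os import getenv


def deployment_kwargs(flow, infrastructure, storage):
    """Collect keyword arguments for Deployment.build_from_flow()."""
    # "my-image" is a hypothetical fallback so the sketch runs without env vars.
    image_name = getenv("IMAGE_NAME", "my-image")
    return {
        "flow": flow,
        "name": f"{image_name}-fg",
        "version": "1",
        "work_queue_name": image_name,
        "infrastructure": infrastructure,
        "storage": storage,
        # Set explicitly so the image's WORKDIR is not assigned as the path.
        "path": None,
    }


# Usage with Prefect 2.x installed (not executed here):
# from prefect.deployments import Deployment
# deployment_fg = Deployment.build_from_flow(
#     **deployment_kwargs(my_flow, ecs_task_fg, github_block)
# )
# deployment_fg.apply()
```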

Peyton Runyan

01/31/2023, 2:24 PM
What version of prefect are you using?
My bad - it's all in the parent post
I just skimmed, but the shutil error with a specified path when using the GitHub block sounds a lot like an issue that another user was facing. Does this help? https://github.com/PrefectHQ/prefect/pull/8193
g = GitHub(
    repository="https://github.com/PrefectHQ/prefect.git",
    include_git_objects=False,
)

Joshua Grant

01/31/2023, 2:29 PM
If possible, I'd like to avoid using the GitHub block altogether.
Also, there's the odd behavior of having to manually specify path=None; otherwise it assigns the WORKDIR of the image as the path.

Peyton Runyan

01/31/2023, 2:35 PM
We have first-class support for baked-in flows coming out in the relatively near future, so that should help.
Just to double check - are you currently able to run your code, but the arrangement is just suboptimal until there's better support for baked-in flows?

Joshua Grant

01/31/2023, 2:38 PM
I am currently able to run my code using the GitHub storage block, but only when explicitly specifying path=None in the call to Deployment.build_from_flow(). Looking forward to the baked-in flow support! Thanks!