
Payam K

over 3 years ago
Hello, I am running an Azure DevOps pipeline with a service connection set up to access my AWS account. As a pipeline task, I run a SageMaker processing job that includes a Prefect flow.
```yaml
- task: AWSShellScript@1
  inputs:
    # awsCredentials: 'xxxxx'
    regionName: 'xxx'
    scriptType: 'inline'
    inlineScript: python3 cli.py -remote S3 -p "['a','2020-09-01', '2020-09-02']"
```
I get this error:
```
[2022-02-03 21:29:07+0000] ERROR - prefect.S3Result | Unexpected error while reading from S3: TypeError('expected string or bytes-like object')
Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 142, in exists
    self.client.get_object(Bucket=self.bucket, Key=location.format(**kwargs))
  File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 692, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 738, in _convert_to_request_dict
    api_params, operation_model, context)
  File "/miniconda3/lib/python3.7/site-packages/botocore/client.py", line 770, in _emit_api_params
    params=api_params, model=operation_model, context=context)
  File "/miniconda3/lib/python3.7/site-packages/botocore/hooks.py", line 357, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/miniconda3/lib/python3.7/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/miniconda3/lib/python3.7/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/miniconda3/lib/python3.7/site-packages/botocore/handlers.py", line 238, in validate_bucket_name
    if not VALID_BUCKET.search(bucket) and not VALID_S3_ARN.search(bucket):
TypeError: expected string or bytes-like object
```
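In case it helps with diagnosis: that TypeError is what botocore's bucket-name validation raises when the Bucket argument is None rather than a string, which suggests the S3Result bucket isn't being resolved inside the SageMaker job (for example, config or environment variables that exist on the DevOps agent but not in the processing container). A rough reproduction of the same failure, assuming boto3 is installed and a botocore version similar to the one in the traceback; the region and key are just placeholders:

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Passing Bucket=None reaches botocore's validate_bucket_name handler,
# where the regex search on None raises the same TypeError shown above.
try:
    s3.get_object(Bucket=None, Key="some/key")
except TypeError as exc:
    print(exc)  # expected string or bytes-like object
```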
The same task runs fine when I just run it on an Azure DevOps agent:
```yaml
- task: AWSShellScript@1
  inputs:
    # awsCredentials: 'xxxxx'
    regionName: 'xxx'
    scriptType: 'inline'
    inlineScript: python3 cli.py -local S3 -p "['a','2020-09-01', '2020-09-02']"
```
Has anyone had this issue before?

Isaac

12 months ago
Was hoping for a little help with a task caching issue...
```python
from typing import Any

from prefect import task
from prefect.context import TaskRunContext
from prefect.tasks import exponential_backoff

# get_version, alert_slack_on_task_failure, and
# run_prefect_deployment_check_successful are project-level helpers.


def cache_results_within_flow_run(
    context: TaskRunContext, parameters: dict[str, Any]
) -> str:
    """Caches a task result within the context of the flow it is run in."""
    return f"{context.task_run.flow_run_id}:{context.task_run.task_key}"


@task(
    name="example",
    tags=["pipelines"],
    version=get_version(),
    retries=2,
    retry_delay_seconds=exponential_backoff(backoff_factor=60),
    retry_jitter_factor=0.5,
    on_failure=[alert_slack_on_task_failure],
    cache_key_fn=cache_results_within_flow_run,
)
def trademark_etl() -> None:
    """Task for running the earnings calls ETL Prefect deployment."""
    deployment_name = "example-flow/example-deployment"

    run_prefect_deployment_check_successful(deployment_name=deployment_name)
```
We have been overhauling our orchestration and aren't seeing the expected caching behavior. Most likely we are doing something incorrectly, but we're not sure what. Our goal is to cache task results in the context of the flow run they were part of, so that if the flow fails because one of its tasks fails, we can retry the flow and only the tasks that have not yet run successfully (in the flow being retried) will run. I implemented a caching function that attempts to do this; however, this morning when one of our tasks failed and I retried the flow, every task started running as normal, with no regard for having already completed in the same flow run. Could it be that this is happening because we are not returning anything from our tasks?
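For reference, here is a minimal, self-contained sketch of the behavior we're aiming for, with toy task names and Prefect 2's `cache_key_fn`/`persist_result` options (nothing here is from our real pipeline):

```python
from typing import Any

from prefect import flow, task
from prefect.context import TaskRunContext


def cache_within_flow_run(context: TaskRunContext, parameters: dict[str, Any]) -> str:
    # Key on the flow run id plus the task key, so repeated calls of the
    # same task within one flow run (or a retry of it) hit the cache.
    return f"{context.task_run.flow_run_id}:{context.task_run.task_key}"


@task(cache_key_fn=cache_within_flow_run, persist_result=True)
def expensive_step(x: int) -> int:
    print(f"running expensive_step({x})")
    return x * 2


@flow
def example_flow() -> None:
    expensive_step(1)  # runs and persists a result
    expensive_step(1)  # expected: Cached state, no second print


if __name__ == "__main__":
    example_flow()
```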