prefect-community

    Andor Tóth

    05/22/2020, 2:12 PM
    I have several queries (.sql files) to run

    Andor Tóth

    05/22/2020, 2:12 PM
    the exec_query task runs a query given as a parameter

    Andor Tóth

    05/22/2020, 2:12 PM
    but seeing that exec_query[13] is executed is not much help
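    For illustration, a minimal sketch of one way to get readable task names in Prefect 0.x using the task_args override (flow and file names are hypothetical):
    from prefect import Flow, task

    @task
    def exec_query(sql_file):
        ...  # execute the .sql file

    with Flow("queries") as flow:
        # task_args overrides Task attributes per call, so each copy shows up
        # under its .sql file name instead of exec_query[13]
        for sql_file in ["daily_load.sql", "cleanup.sql"]:
            exec_query(sql_file, task_args=dict(name=sql_file))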

    Andor Tóth

    05/22/2020, 2:13 PM
    ah, okay

    Andor Tóth

    05/22/2020, 2:14 PM
    thx

    Will Milner

    05/22/2020, 3:20 PM
    quick question on running flows in a distributed system like Kubernetes: when I run a flow, do all tasks in that flow run in the same execution environment, or could some tasks run in one environment and the rest in a different environment?

    Marwan Sarieddine

    05/22/2020, 4:36 PM
    I am facing an issue autoscaling from 0 nodes using the AWS auto-scaler on EKS - note the issue only arises when autoscaling from 0 and not 1 node … I see someone recommended using something like this for waiting on Dask worker pods:
    from distributed import get_client
    from prefect import task

    @task
    def wait_for_resources():
        client = get_client()
        # Wait until we have 10 workers
        client.wait_for_workers(n_workers=10)
    but this doesn't seem to work for waiting on the first node to be present. Has anyone had the chance to try out auto-scaling from 0?

    itay livni

    05/22/2020, 6:03 PM
    Hi - I am having a hard time understanding how to implement a mapped task with a target. Running locally with an S3Result at the flow level works without any special configuration. When using Docker storage, an error occurs:
    Traceback (most recent call last):
      File "/opt/prefect/healthcheck.py", line 136, in <module>
        result_check(flows)
      File "/opt/prefect/healthcheck.py", line 64, in result_check
        _check_mapped_result_templates(flow)
      File "/opt/prefect/healthcheck.py", line 58, in _check_mapped_result_templates
        "Mapped tasks with custom result locations must include {filename} as a template in their location - see <https://docs.prefect.io/core/advanced_tutorials/using-results.html#specifying-a-location-for-mapped-or-looped-tasks>"
    ValueError: Mapped tasks with custom result locations must include {filename} as a template in their location - see <https://docs.prefect.io/core/advanced_tutorials/using-results.html#specifying-a-location-for-mapped-or-looped-tasks>
    The documentation says "When configuring results for a mapped pipeline, if you choose to configure the location it is required that you include `{filename}`". Is filename a context value (I couldn't find it)? https://docs.prefect.io/core/advanced_tutorials/using-results.html#mapping In short, what does a configuration for a mapped task look like, with no result configured at the task level? Or is that required?
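    For illustration, a minimal sketch of a mapped task with a templated target, assuming Prefect 0.x and an illustrative bucket name ({filename} is supplied by Prefect at runtime for each map child):
    from prefect import Flow, task
    from prefect.engine.results import S3Result

    # {filename} keeps each mapped child's result at its own key,
    # so the children don't overwrite one another
    @task(target="my-flow/{task_name}/{filename}")
    def double(x):
        return x * 2

    with Flow("mapped-target", result=S3Result(bucket="my-bucket")) as flow:
        double.map([1, 2, 3])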

    Marwan Sarieddine

    05/22/2020, 10:53 PM
    Hi everyone - how is timeout supposed to work on a mapped task? It seems to me that the timeout is not being enforced - is there a way to see the timeout specification in the UI?
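    For reference, a minimal sketch of how a task-level timeout is declared in Prefect 0.x (the task body is illustrative):
    from prefect import task

    # timeout is given in seconds on the task itself; mapped children inherit
    # the task's configuration, and enforcement depends on the executor being
    # able to interrupt the running task
    @task(timeout=60)
    def slow_query(query):
        ...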

    Jacques Jamieson

    05/23/2020, 12:46 AM
    ERROR: Command errored out with exit status 1:
       command: 'c:\users\jacqu\appdata\local\programs\python\python38-32\python.exe' 'c:\users\jacqu\appdata\local\programs\python\python38-32\lib\site-packages\pip\_vendor\pep517\_in_process.py' build_wheel 'C:\Users\jacqu\AppData\Local\Temp\tmpvx6omvn3'
           cwd: C:\Users\jacqu\AppData\Local\Temp\pip-install-dzkgs_w4\pendulum
      Complete output (24 lines):
      Traceback (most recent call last):
        File "setup.py", line 2, in <module>
          from setuptools import setup
      ModuleNotFoundError: No module named 'setuptools'
      Traceback (most recent call last):
        File "c:\users\jacqu\appdata\local\programs\python\python38-32\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 280, in <module>
          main()
        File "c:\users\jacqu\appdata\local\programs\python\python38-32\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 263, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "c:\users\jacqu\appdata\local\programs\python\python38-32\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 204, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "C:\Users\jacqu\AppData\Local\Temp\pip-build-env-31qm87lt\overlay\Lib\site-packages\poetry\core\masonry\api.py", line 57, in build_wheel
          return unicode(WheelBuilder.make_in(poetry, Path(wheel_directory)))
        File "C:\Users\jacqu\AppData\Local\Temp\pip-build-env-31qm87lt\overlay\Lib\site-packages\poetry\core\masonry\builders\wheel.py", line 56, in make_in
          wb.build()
        File "C:\Users\jacqu\AppData\Local\Temp\pip-build-env-31qm87lt\overlay\Lib\site-packages\poetry\core\masonry\builders\wheel.py", line 82, in build
          self._build(zip_file)
        File "C:\Users\jacqu\AppData\Local\Temp\pip-build-env-31qm87lt\overlay\Lib\site-packages\poetry\core\masonry\builders\wheel.py", line 101, in _build
          self._run_build_command(setup)
        File "C:\Users\jacqu\AppData\Local\Temp\pip-build-env-31qm87lt\overlay\Lib\site-packages\poetry\core\masonry\builders\wheel.py", line 129, in _run_build_command
          subprocess.check_call(
        File "c:\users\jacqu\appdata\local\programs\python\python38-32\lib\subprocess.py", line 364, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['c:\\users\\jacqu\\appdata\\local\\programs\\python\\python38-32\\python.exe', 'setup.py', 'build', '-b', 'build']' returned non-zero exit status 1.
      ----------------------------------------
      ERROR: Failed building wheel for pendulum
    Failed to build pendulum
    ERROR: Could not build wheels for pendulum which use PEP 517 and cannot be installed directly

    Jacques Jamieson

    05/23/2020, 12:46 AM
    Anyone encountered this on Windows 10?

    Jacques Jamieson

    05/23/2020, 12:46 AM
    python 3.8

    Brad

    05/23/2020, 5:00 AM
    Hey team - I'm using the new Result class; have the validators been hooked up yet? I'm trying to replicate some cache_for timedelta logic

    Questionnaire

    05/23/2020, 2:08 PM
    Hello folks, I want to schedule my flow for every midnight. I'm using CronClock, which by default follows DST. What should I do if I want to use the timezone of my deployment server, or UTC?
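    A minimal sketch of pinning a CronClock to a fixed timezone, assuming Prefect 0.x (the clock follows the timezone of its start_date, so a timezone-aware datetime in UTC or the server's zone avoids DST shifts):
    import pendulum
    from prefect.schedules import Schedule
    from prefect.schedules.clocks import CronClock

    # midnight every day, anchored to UTC via an aware start_date
    schedule = Schedule(
        clocks=[CronClock("0 0 * * *", start_date=pendulum.datetime(2020, 5, 23, tz="UTC"))]
    )
    # flow.schedule = schedule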

    Hassan Javeed

    05/23/2020, 2:48 PM
    Noticed this during one of our flow runs: the Prefect Cloud UI reports the duration as < 1 second, whereas the run actually took more than 24 hours and finished the following day. A UI bug?

    Pedro Machado

    05/23/2020, 9:33 PM
    Hi there. I am trying to understand the default secrets. Are the default AWS credential secrets only for use with tasks from the task library, or are they supposed to be passed to all instantiations of the boto3 client inside of a Prefect task? I defined PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS in a .env file that I am loading with dotenv.load_dotenv(). The env variables are being passed to the Python script OK. Then inside a task, I create an s3 client that is used in a module I created (pseudocode):
    @task
    def mytask():
        api_client = MyPrivateClient(s3_client=boto3.client("s3"))
        api_client.execute()
    Then I get NoCredentialsError. After reading https://docs.prefect.io/core/concepts/secrets.html#default-secrets I thought the credentials would be exposed as environment variables for the client to use. Should I be getting them explicitly from the context, or using the default env vars that boto3 will look for?
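    For reference, a minimal sketch of reading the default secret from context and handing it to boto3 explicitly, assuming AWS_CREDENTIALS is set as a dict with ACCESS_KEY and SECRET_ACCESS_KEY (default secrets live in prefect.context, not os.environ, so boto3 won't find them on its own):
    import boto3
    from prefect import task
    from prefect.client import Secret

    @task
    def mytask():
        # pull the secret from context and pass the keys through explicitly
        creds = Secret("AWS_CREDENTIALS").get()
        s3_client = boto3.client(
            "s3",
            aws_access_key_id=creds["ACCESS_KEY"],
            aws_secret_access_key=creds["SECRET_ACCESS_KEY"],
        )
        ...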

    Pedro Machado

    05/24/2020, 12:22 AM
    I am trying to run prefect server on Ubuntu running inside of Windows Subsystem for Linux 2. I am getting the following error when trying to start the server.
    prefect server start
    Pulling postgres  ... done
    Pulling hasura    ... done
    Pulling graphql   ... done
    Pulling apollo    ... done
    Pulling scheduler ... done
    Pulling ui        ... done
    Creating network "prefect-server" with the default driver
    Creating tmp_postgres_1 ... error
    
    ERROR: for tmp_postgres_1  Cannot create container for service postgres: invalid IP address in add-host: ""
    
    ERROR: for postgres  Cannot create container for service postgres: invalid IP address in add-host: ""
    ERROR: Encountered errors while bringing up the project.
    Any ideas? How can I see the docker compose logs?

    itay livni

    05/24/2020, 4:09 AM
    Hi - I am trying to follow the example in Dask Cloud Provider. Without changing any code, I get a timeout error.
    [2020-05-24 03:45:38] INFO - prefect.FlowRunner | Beginning Flow run for 'Dask Cloud Provider Test'
    [2020-05-24 03:45:38] INFO - prefect.FlowRunner | Starting flow run.
    [2020-05-24 03:45:48] ERROR - prefect.FlowRunner | Unexpected error: OSError("Timed out trying to connect to '<tcp://172.31.44.64:8786>' after 10 s: Timed out trying to connect to '<tcp://172.31.44.64:8786>' after 10 s: connect() didn't finish in time")
    Traceback (most recent call last):
      File "miniconda3/envs/py37moc/lib/python3.7/site-packages/distributed/comm/core.py", line 232, in connect
        _raise(error)
      File "/miniconda3/envs/py37moc/lib/python3.7/site-packages/distributed/comm/core.py", line 213, in _raise
        raise IOError(msg)
    OSError: Timed out trying to connect to '<tcp://172.31.44.64:8786>' after 10 s: connect() didn't finish in time
    Is there a port to open? In the ECS console I do see a cluster generated and closed. https://docs.prefect.io/orchestration/execution/dask_cloud_provider_environment.html#process
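    For illustration, a sketch assuming Prefect 0.x's DaskCloudProviderEnvironment and a hypothetical security group ID: the Dask scheduler listens on port 8786 (dashboard on 8787), so the security group the cluster uses must allow inbound traffic on those ports from wherever the flow runner connects.
    from dask_cloudprovider import FargateCluster
    from prefect.environments import DaskCloudProviderEnvironment

    environment = DaskCloudProviderEnvironment(
        provider_class=FargateCluster,
        n_workers=1,
        # hypothetical security group allowing inbound 8786/8787
        security_groups=["sg-0123456789abcdef0"],
    )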

    Kai Weber

    05/25/2020, 1:41 PM
    Question on installing Prefect. I installed and started the Docker version:
    docker run -it -p 8080:8080 --name workflow-machine prefecthq/prefect:latest
    The command line stopped in Python:
    C:\Projekte\Lokal\software\develop\GitHub\NodeRed2Python>docker run -it -p 8080:8080 -v C:/Projekte/Lokal/software/develop/GitHub/NodeRed2Python/.prefect:/root/.prefect --name workflow-machine prefecthq/prefect:master
    Python 3.7.7 (default, May 15 2020, 11:37:57) [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    When I open the Docker CLI and want to start the UI server I get the following error:
    # prefect server start
    Traceback (most recent call last):
      File "/usr/local/bin/prefect", line 8, in <module>
        sys.exit(cli())
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/prefect/cli/server.py", line 279, in start
        docker_internal_ip = get_docker_ip()
      File "/usr/local/lib/python3.7/site-packages/prefect/utilities/docker_util.py", line 12, in get_docker_ip
        ip_route_proc = Popen(["ip", "route"], stdout=PIPE)
      File "/usr/local/lib/python3.7/subprocess.py", line 800, in __init__
        restore_signals, start_new_session)
      File "/usr/local/lib/python3.7/subprocess.py", line 1551, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'ip': 'ip'
    Sorry if this is a dumb question, but where is my mistake? I tried to find something about this but could not find anything. Thanks, Kai

    jars

    05/26/2020, 12:44 AM
    Hello Prefect Community & Support. Having an issue interacting with the Get Projects GraphQL endpoint on Prefect IO. My request:
    query {
      project(limit: 10, offset: 0) {
        name
      }
    }
    My response:
    {
      "errors": [
        {
          "message": "Response not successful: Received status code 400",
          "locations": [],
          "path": [
            "project"
          ],
          "extensions": {
            "code": "INTERNAL_SERVER_ERROR",
            "exception": {
              "name": "ServerError",
              "response": {
                "size": 0,
                "timeout": 0
              },
              "statusCode": 400,
              "result": {
                "errors": [
                  {
                    "extensions": {
                      "path": "$.selectionSet.project",
                      "code": "validation-failed"
                    },
                    "message": "field \"project\" not found in type: 'query_root'"
                  }
                ]
              }
            }
          }
        }
      ],
      "data": null,
      "extensions": {
        "tracing": {
          "version": 1,
          "startTime": "2020-05-26T00:41:36.814Z",
          "endTime": "2020-05-26T00:41:36.819Z",
          "duration": 4909820,
          "execution": {
            "resolvers": [
              {
                "path": [
                  "project"
                ],
                "parentType": "Query",
                "fieldName": "project",
                "returnType": "[project!]!",
                "startOffset": 88849,
                "duration": 4807789
              }
            ]
          }
        }
      }
    }
    I've tried with both JavaScript (using apollo-link) and also CLI using curl. Both give same result.

    Adam Roderick

    05/26/2020, 12:44 PM
    The version of pendulum you are using will not install on Windows.

    Will Milner

    05/26/2020, 1:03 PM
    for Prefect Core, is it possible to set context variables for flow runs with the Kubernetes agent? I tried using the --env flag, but whenever I try to run my flow I get this error:
    May 26th 2020 at 9:00:18am | agent
    ERROR 
    (400)
    Reason: Bad Request
    HTTP response headers: HTTPHeaderDict({'Audit-Id': '9f273015-9326-4806-8497-6847f42b700b', 'Content-Type': 'application/json', 'Date': 'Tue, 26 May 2020 13:00:18 GMT', 'Content-Length': '549'})
    HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Job in version \"v1\" cannot be handled as a Job: v1.Job.Spec: v1.JobSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects \" or n, but found {, error found in #10 byte of ...|\"value\": {\"context\":|..., bigger context ...|.CloudTaskRunner\"}, {\"name\": \"prefect\", \"value\": {\"context\": {\"aws_secret\": |...","reason":"BadRequest","code":400}
    I don't have this problem when running on the Docker agent

    Adam Roderick

    05/26/2020, 1:24 PM
    I'm trying to use the Dockerfile approach to building a flow's storage. It's been running for several minutes, but I'm not getting any feedback or log messages. Is there a way to get more verbose output?

    Darragh

    05/26/2020, 3:35 PM
    Hey guys, 2 questions for today…
    • Do you know why I would get an output error with the following format? Please note the dodgy URL printed does not contain http, which is most likely the cause of the failure, but I'm not sure what it's trying to find: “Invalid URL ‘1.2.3.4/graphql/alpha’”
    • Am I right in assuming that if I want to run a very simple flow built into Local storage and with LocalEnvironment, and I register it with my server running on Amazon, the local executor agent should just pick it up? Or am I missing a magic step..?

    Will Milner

    05/26/2020, 4:31 PM
    is there any way to force Prefect not to use the Docker cache when building flows that use Docker storage?

    Will Milner

    05/26/2020, 4:43 PM
    sorry for all the questions lately, but here's another one: When running a shell task in my flows I keep getting this error -
    May 26th 2020 at 12:39:41pm | prefect.CloudTaskRunner
    ERROR lens
    Failed to set task state with error: ClientError([{'message': "{'cached_inputs': defaultdict(<class 'dict'>, {'command': {'value': {'type': ['Unsupported value: ConstantResult']}}})}", 'locations': [{'line': 6, 'column': 13}], 'path': ['set_task_run_states', 'states', 0, 'id'], 'extensions': {'code': 'INTERNAL_SERVER_ERROR'}}])
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/prefect/engine/cloud/task_runner.py", line 123, in call_runner_target_handlers
        cache_for=self.task.cache_for,
      File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 1104, in set_task_run_state
        version=version,
      File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 226, in graphql
        raise ClientError(result["errors"])
    prefect.utilities.exceptions.ClientError: [{'message': "{'cached_inputs': defaultdict(<class 'dict'>, {'command': {'value': {'type': ['Unsupported value: ConstantResult']}}})}", 'locations': [{'line': 6, 'column': 13}], 'path': ['set_task_run_states', 'states', 0, 'id'], 'extensions': {'code': 'INTERNAL_SERVER_ERROR'}}]
    and then my flow just hangs in a Pending state afterwards. Any idea what this could be?

    Questionnaire

    05/26/2020, 5:54 PM
    Hello folks, I want to run an insert query in Postgres. Can someone give me a sweet example using PostgresExecute? 🙂
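    For illustration, a minimal sketch using PostgresExecute from the Prefect 0.x task library (connection details and the table are placeholders; the password is supplied at runtime):
    from prefect import Flow
    from prefect.tasks.postgres import PostgresExecute

    insert_row = PostgresExecute(
        db_name="mydb",
        user="myuser",
        host="localhost",
        port=5432,
    )

    with Flow("postgres-insert") as flow:
        # psycopg2-style %s placeholders, with values supplied via data
        insert_row(
            query="INSERT INTO events (name, payload) VALUES (%s, %s)",
            data=("example", "{}"),
            commit=True,
            password="mypassword",  # placeholder; better sourced from a secret
        )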

    Marwan Sarieddine

    05/27/2020, 12:14 AM
    Hi everyone, I have been facing an issue with S3Result - mainly, once I use S3Result my memory usage doubles - so I took the time to look through the source code and test the memory usage locally, and I believe there is an issue in the implementation - more specifically in the following line in s3_result.py:
    binary_data = new.serialize_to_bytes(new.value)
    A new object is being created here, requiring twice the memory allocation - at least this is what seems to be happening - please correct me if I am wrong. See the full write method below:
    def write(self, value: Any, **kwargs: Any) -> Result:
            """
            Writes the result to a location in S3 and returns the resulting URI.
    
            Args:
                - value (Any): the value to write; will then be stored as the `value` attribute
                    of the returned `Result` instance
                - **kwargs (optional): if provided, will be used to format the location template
                    to determine the location to write to
    
            Returns:
                - Result: a new Result instance with the appropriately formatted S3 URI
            """
    
            new = self.format(**kwargs)
            new.value = value
            self.logger.debug("Starting to upload result to {}...".format(new.location))
            binary_data = new.serialize_to_bytes(new.value)
    
            stream = io.BytesIO(binary_data)
    
            ## upload
            from botocore.exceptions import ClientError
    
            try:
                self.client.upload_fileobj(stream, Bucket=self.bucket, Key=new.location)
            except ClientError as err:
                self.logger.error("Error uploading to S3: {}".format(err))
                raise err
    
            self.logger.debug("Finished uploading result to {}.".format(new.location))
            return new

    Avi A

    05/27/2020, 12:35 PM
    Question regarding the Result we longed for: I see that the default is to cache the results per task, and it's possible to cache per day/today/tomorrow when using the target formatting. Is there behavior similar to cache_validator, allowing us to cache the results per set of attributes?
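    For reference, a sketch of the target templating being discussed, assuming Prefect 0.x: the target string is formatted from prefect.context, so any context value (task name, date, map index) can become part of the effective cache key.
    from prefect import task

    # a result is re-used as long as a file exists at the formatted location,
    # so the templated values act as the cache attributes
    @task(target="{task_name}/{today}/{map_index}.txt", checkpoint=True)
    def compute(x):
        return x + 1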

Adam Roderick

05/27/2020, 12:47 PM
I'm seeing some lag time when creating a Docker storage object.
INFO - prefect.Docker | Building the flow's Docker storage...
The run stays in that state for a long time before finally logging out the steps from docker build. Any suggestions on how I can troubleshoot, or produce more verbose output? Any idea what is going on that would cause a significant delay here?

nicholas

05/27/2020, 1:02 PM
Hi @Adam Roderick - if you start your agent with the --verbose flag, you'll get a lot more insight into what's happening with the Docker build step
As for what could be causing the delay, my first guess is that Docker could be hitting some resource ceilings; I'm curious what that output will be, though
It could also just be a very large image
Sorry, just realized I read that wrong, you're seeing that delay when building the storage at the registration step, right?

Adam Roderick

05/27/2020, 1:09 PM
☝️ yes exactly that: flow.storage = Docker(... then flow.register(...

nicholas

05/27/2020, 1:15 PM
Ah ok, I don't think there's a good way to get insight into what's going on there; the code for that step is pretty straightforward. Which OS are you running?

Adam Roderick

05/27/2020, 1:16 PM
WSL on Windows
It does have a small amount of lag normally. But nothing like this
It's Ubuntu 20.04 LTS
Could the code be downloading a large docker base image every time, rather than using the locally cached image?
python:3.7 is the base image in my Dockerfile

nicholas

05/27/2020, 1:18 PM
That could be the issue, but I think there's normally some pulling output if that's the case, and afaik the Python image isn't anything crazy. For reference here's the build step for the Client.
Which version of Docker are you running?

Adam Roderick

05/27/2020, 1:20 PM
Docker version 19.03.8, build afacb8b7f0
It looks like the WSL and Docker combination is slow

nicholas

05/27/2020, 1:35 PM
Yeah I'm just reading some of the issues on GitHub about that, sorry for the delay
Unfortunately I don't think there's much Prefect can do to speed that up (or get insight into it) 😕

Adam Roderick

05/27/2020, 1:35 PM
When I run a timeit on 'from prefect.environments.storage import Docker' on Windows, it takes just under 1 second
When I run the same on WSL, it takes 10-13 seconds
I know that's not a prefect issue--thanks for looking into it
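For reference, a sketch of the measurement being described, run once per environment (Windows vs. WSL):
import timeit

# time the first (cold) import of the Docker storage class in this process
print(timeit.timeit("from prefect.environments.storage import Docker", number=1))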

nicholas

05/27/2020, 1:36 PM
Wow, that's stark

Adam Roderick

05/27/2020, 1:36 PM
If you have any ideas now or in the future, I'm all ears
Yeah--very stark. The other imports from Prefect show no difference

nicholas

05/27/2020, 1:38 PM
Oh one option is to use VSCode and its Remote WSL plugin, which'll store code in the WSL filesystem
Since it seems to be a networking issue

Adam Roderick

05/27/2020, 1:42 PM
That's the setup I am using
It's a great option for development. But exhibits the same behavior

nicholas

05/27/2020, 1:43 PM
Oh no 😧
Well, it sounds like the slowness was definitely on Docker's radar at some point; it seems networking was something they were looking to solve with WSL 2 😕 Wish I could be of more help here - I'll keep you posted if I come across anything from the team.

Adam Roderick

05/27/2020, 1:53 PM
Thanks, I'll do the same