Thread
#prefect-community

    Julio Venegas

    1 year ago
    Hi community! Lately I’m getting a lot of timeout errors from tasks when running with a LocalDaskExecutor. See the error inside the thread. Any hints on what the issue is?
    Failed to retrieve task state with error: ReadTimeout(ReadTimeoutError("HTTPConnectionPool(host='localhost', port=4200): Read timed out. (read timeout=15)"))
    Traceback (most recent call last):
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 426, in _make_request
        six.raise_from(e, None)
      File "<string>", line 3, in raise_from
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 421, in _make_request
        httplib_response = conn.getresponse()
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/http/client.py", line 1347, in getresponse
        response.begin()
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/http/client.py", line 307, in begin
        version, status, reason = self._read_status()
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/http/client.py", line 268, in _read_status
        line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/socket.py", line 669, in readinto
        return self._sock.recv_into(b)
    socket.timeout: timed out
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
        resp = conn.urlopen(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
        retries = retries.increment(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/util/retry.py", line 410, in increment
        raise six.reraise(type(error), error, _stacktrace)
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise
        raise value
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
        httplib_response = self._make_request(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 428, in _make_request
        self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
        raise ReadTimeoutError(
    urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=4200): Read timed out. (read timeout=15)
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/engine/cloud/task_runner.py", line 154, in initialize_run
        task_run_info = self.client.get_task_run_info(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/client/client.py", line 1399, in get_task_run_info
        result = self.graphql(mutation)  # type: Any
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/client/client.py", line 298, in graphql
        result = self.post(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/client/client.py", line 213, in post
        response = self._request(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/client/client.py", line 459, in _request
        response = self._send_request(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/prefect/client/client.py", line 351, in _send_request
        response = session.post(
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/requests/sessions.py", line 590, in post
        return self.request('POST', url, data=data, json=json, **kwargs)
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
        resp = self.send(prep, **send_kwargs)
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
        r = adapter.send(request, **kwargs)
      File "/home/adminuser/miniconda3/envs/py38/lib/python3.8/site-packages/requests/adapters.py", line 529, in send
        raise ReadTimeout(e, request=request)
    requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=4200): Read timed out. (read timeout=15)

    Kevin Kho

    1 year ago
    Hey, and does this Flow work with LocalExecutor?

    Julio Venegas

    1 year ago
    Hey Kevin! Yep, it works with LocalExecutor but I haven’t run it with it in a while.

    Kevin Kho

    1 year ago
    Are you on cloud or server?

    Julio Venegas

    1 year ago
    Server

    Kevin Kho

    1 year ago
    This error is saying it can’t hit the API, so I think it will fail with LocalExecutor too. Can you open your server UI and see if the dashboard loads?
    Is this flow big? And do you have other flows that are working fine?

    Julio Venegas

    1 year ago
    Dashboard does load! And yep, it’s a big flow. In some stages it maps around 8k tasks, sometimes it runs around 8 or so in parallel.
    It’s a 16 core machine though

    Kevin Kho

    1 year ago
    Yeah so the API is just timing out cuz of the 15 second limit. We just need to increase the time. Will look how
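
The 15-second limit in the traceback appears to match the default of the `cloud.request_timeout` setting in Prefect 1.x. Below is a minimal sketch of raising it, assuming that setting can be overridden through the `PREFECT__CLOUD__REQUEST_TIMEOUT` environment variable (the usual Prefect 1.x naming for config overrides). The helper `with_request_timeout` and the 60-second value are hypothetical and only for illustration.

```python
import os


def with_request_timeout(env, seconds):
    """Return a copy of `env` with the Prefect request timeout override set.

    Assumption: Prefect 1.x reads PREFECT__CLOUD__REQUEST_TIMEOUT into the
    `cloud.request_timeout` config value (default 15 seconds).
    """
    updated = dict(env)
    updated["PREFECT__CLOUD__REQUEST_TIMEOUT"] = str(seconds)
    return updated


# Hypothetical value: 60 seconds instead of the default 15.
flow_env = with_request_timeout(os.environ, 60)
print(flow_env["PREFECT__CLOUD__REQUEST_TIMEOUT"])
```

The override only helps if it is present in the environment of the process that actually executes the flow run, for example the agent's environment or the flow's run config, so the resulting mapping would need to be passed there rather than set in an unrelated shell.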

    Julio Venegas

    1 year ago
    Cool, will have a look! I think I had seen it before.
    But can you please explain a bit about the flow between the API and the tasks? As in, the tasks are mapped and set to pending, but then for some reason they are not available for longer than 15 seconds and the agent stops polling for them?
    Or the other way around, i.e. the agent or task polls the API but the API times out?
    “Agents run inside a user’s architecture, and are responsible for starting and monitoring flow runs. During operation the agent process queries the Prefect API for any scheduled flow runs, and allocates resources for them on their respective deployment platforms.” So I assume that the agent is polling the API to check on the tasks and the API times out.

    Kevin Kho

    1 year ago
    I won’t be able to explain this well and a bunch of the team is out right now, but my understanding is that each task hits the API on each state change, meaning it hits the API at least 3 times. It’s these calls that are timing out.
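
To put rough numbers on that explanation, here is a back-of-the-envelope sketch. It assumes the lower bound of 3 API calls per task run mentioned above and the roughly 8k mapped tasks reported earlier in the thread; the function name is hypothetical, and the real count is likely higher because the traceback also shows a separate `get_task_run_info` call.

```python
def min_state_update_calls(n_task_runs, state_changes_per_task=3):
    """Lower bound on API calls if every task run reports each state change."""
    return n_task_runs * state_changes_per_task


# About 8k mapped task runs, at least 3 state changes each.
print(min_state_update_calls(8000))  # 24000
```

With that many requests going to a single local server on port 4200, it is plausible that some of them wait longer than the 15-second read timeout, though this estimate is not a measurement.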