# prefect-server
b
Hi everyone, after installation (step by step below), the dashboard shows up, but the "API Status" tile reports "Network error: Failed to fetch" and asks whether the graphql_url field in the config file was set. After going through the docs, I'm not sure which TOML section this field belongs in, let alone the correct value. Furthermore, I'm not sure this is really the underlying issue, because there are some "connection refused" errors about Postgres in the server logs upon startup. First-time user, grateful for any leads.
z
Hi @Burkhard Hoeckendorf, let's start with the Postgres errors you mentioned and work our way up. Do you mind sharing the error logs you're seeing there?
b
Sure, just 1 sec
prefect backend server
Backend switched to server
prefect server start
Pulling postgres ... done Pulling hasura ... done Pulling graphql ... done Pulling apollo ... done Pulling scheduler ... done Pulling ui ... done Creating network "prefect-server" with the default driver Creating tmp_postgres_1 ... done Creating tmp_hasura_1 ... done Creating tmp_graphql_1 ... done Creating tmp_scheduler_1 ... done Creating tmp_apollo_1 ... done Creating tmp_ui_1 ... done Attaching to tmp_postgres_1, tmp_hasura_1, tmp_graphql_1, tmp_apollo_1, tmp_scheduler_1, tmp_ui_1 hasura_1 | {"type":"startup","timestamp":"2020-07-21T194320.646+0000","level":"info","detail":{"kind":"server_configuration","info":{"live_query_options":{"batch_size":100,"refetch_delay":1},"transaction_isolation":"ISOLATION LEVEL READ COMMITTED","plan_cache_options":{"plan_cache_size":100},"enabled_log_types":["http-log","websocket-log","startup","webhook-log"],"server_host":"HostAny","enable_allowlist":false,"log_level":"info","auth_hook_mode":null,"use_prepared_statements":true,"unauth_role":null,"stringify_numeric_types":false,"enabled_apis":["metadata","graphql","config","pgdump"],"enable_telemetry":true,"enable_console":true,"auth_hook":null,"jwt_secret":null,"cors_config":{"allowed_origins":"*","disabled":false,"ws_read_cookie":null},"console_assets_dir":null,"admin_secret_set":false,"port":3000}}} hasura_1 | {"type":"startup","timestamp":"2020-07-21T194320.646+0000","level":"info","detail":{"kind":"postgres_connection","info":{"retries":1,"database_url":"postgresql://prefect:...@postgres:5432/prefect_server"}}} postgres_1 | The files belonging to this database system will be owned by user "postgres". postgres_1 | This user must also own the server process. postgres_1 | postgres_1 | The database cluster will be initialized with locale "en_US.utf8". postgres_1 | The default database encoding has accordingly been set to "UTF8". postgres_1 | The default text search configuration will be set to "english". postgres_1 | postgres_1 | Data page checksums are disabled. 
graphql_1 | graphql_1 | Running Alembic migrations... apollo_1 | apollo_1 | > @ serve /apollo apollo_1 | > node dist/index.js apollo_1 | graphql_1 | INFO [alembic.runtime.migration] Context impl PostgresqlImpl. ui_1 | 👾👾👾 UI running at localhost:8080 👾👾👾 postgres_1 | postgres_1 | fixing permissions on existing directory /var/lib/postgresql/data ... ok scheduler_1 | [2020-07-21 194329,448] INFO - prefect-server.Scheduler | Scheduler will start after an initial delay of 295 seconds... hasura_1 | {"internal":"could not connect to server: Connection refused\n\tIs the server running on host \"postgres\" (172.29.0.2) and accepting\n\tTCP/IP connections on port 5432?\n","path":"$","error":"connection error","code":"postgres-error"} graphql_1 | INFO [alembic.runtime.migration] Will assume transactional DDL. postgres_1 | creating subdirectories ... ok graphql_1 | INFO [alembic.runtime.migration] Running upgrade -> 723da8052cbd, create extensions graphql_1 | graphql_1 | Could not upgrade the database! graphql_1 | Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pg_extension_name_index" graphql_1 | DETAIL: Key (extname)=(pgcrypto) already exists. graphql_1 | graphql_1 | [SQL: graphql_1 | CREATE EXTENSION IF NOT EXISTS "pgcrypto"; graphql_1 | CREATE EXTENSION IF NOT EXISTS "pg_trgm"; graphql_1 | ] graphql_1 | (Background on this error at: http://sqlalche.me/e/13/gkpj) postgres_1 | selecting default max_connections ... 100 postgres_1 | selecting default shared_buffers ... 128MB postgres_1 | selecting default timezone ... Etc/UTC postgres_1 | selecting dynamic shared memory implementation ... posix postgres_1 | creating configuration files ... ok postgres_1 | running bootstrap script ... ok postgres_1 | performing post-bootstrap initialization ... ok postgres_1 | syncing data to disk ... 
postgres_1 | WARNING: enabling "trust" authentication for local connections postgres_1 | You can change this by editing pg_hba.conf or using the option -A, or postgres_1 | --auth-local and --auth-host, the next time you run initdb. postgres_1 | ok postgres_1 | postgres_1 | Success. You can now start the database server using: postgres_1 | postgres_1 | pg_ctl -D /var/lib/postgresql/data -l logfile start postgres_1 | postgres_1 | waiting for server to start....2020-07-21 194320.038 UTC [46] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" postgres_1 | 2020-07-21 194320.448 UTC [47] LOG: database system was shut down at 2020-07-21 194317 UTC postgres_1 | 2020-07-21 194320.627 UTC [46] LOG: database system is ready to accept connections postgres_1 | done postgres_1 | server started postgres_1 | CREATE DATABASE postgres_1 | postgres_1 | postgres_1 | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/* postgres_1 | postgres_1 | 2020-07-21 194323.726 UTC [46] LOG: received fast shutdown request postgres_1 | waiting for server to shut down....2020-07-21 194323.796 UTC [46] LOG: aborting any active transactions postgres_1 | 2020-07-21 194323.800 UTC [46] LOG: background worker "logical replication launcher" (PID 53) exited with exit code 1 postgres_1 | 2020-07-21 194323.800 UTC [48] LOG: shutting down postgres_1 | 2020-07-21 194324.292 UTC [46] LOG: database system is shut down postgres_1 | done postgres_1 | server stopped postgres_1 | postgres_1 | PostgreSQL init process complete; ready for start up. 
postgres_1 | postgres_1 | 2020-07-21 194324.420 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 postgres_1 | 2020-07-21 194324.420 UTC [1] LOG: listening on IPv6 address "::", port 5432 postgres_1 | 2020-07-21 194324.642 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" postgres_1 | 2020-07-21 194325.113 UTC [64] LOG: database system was shut down at 2020-07-21 194324 UTC postgres_1 | 2020-07-21 194325.115 UTC [65] FATAL: the database system is starting up postgres_1 | 2020-07-21 194325.172 UTC [1] LOG: database system is ready to accept connections postgres_1 | 2020-07-21 194328.479 UTC [73] ERROR: duplicate key value violates unique constraint "pg_extension_name_index" postgres_1 | 2020-07-21 194328.479 UTC [73] DETAIL: Key (extname)=(pgcrypto) already exists. postgres_1 | 2020-07-21 194328.479 UTC [73] STATEMENT: postgres_1 | CREATE EXTENSION IF NOT EXISTS "pgcrypto"; postgres_1 | CREATE EXTENSION IF NOT EXISTS "pg_trgm"; postgres_1 | apollo_1 | 2020-07-21T194331.141Z Error fetching GraphQL health: FetchError: request to http://graphql:4201/health failed, reason: connect ECONNREFUSED 172.29.0.3:4201 apollo_1 | 2020-07-21T194331.144Z Error: Could not safely build a schema! apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) Could not safely build a schema! Error: Could not safely build a schema! apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) apollo_1 | 2020-07-21T194331.148Z apollo_1 | Trying again in 3 seconds... apollo_1 | apollo_1 | 2020-07-21T194334.158Z Error fetching GraphQL health: FetchError: request to http://graphql:4201/health failed, reason: connect ECONNREFUSED 172.29.0.3:4201 apollo_1 | 2020-07-21T194334.158Z Error: Could not safely build a schema! 
apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) Could not safely build a schema! Error: Could not safely build a schema! apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) apollo_1 | 2020-07-21T194334.158Z apollo_1 | Trying again in 3 seconds... apollo_1 | apollo_1 | 2020-07-21T194337.165Z Error fetching GraphQL health: FetchError: request to http://graphql:4201/health failed, reason: connect ECONNREFUSED 172.29.0.3:4201 apollo_1 | 2020-07-21T194337.165Z Error: Could not safely build a schema! apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) Could not safely build a schema! Error: Could not safely build a schema! apollo_1 | at safelyBuildSchema (/apollo/dist/index.js12911) apollo_1 | at process._tickCallback (internal/process/next_tick.js687) apollo_1 | 2020-07-21T194337.165Z apollo_1 | Trying again in 3 seconds... apollo_1 | apollo_1 | 2020-07-21T194340.188Z Building schema... apollo_1 | 2020-07-21T194341.125Z Building schema complete! apollo_1 | Server ready at http://0.0.0.0:4200 🚀
Let me know if there is a better way than pasting them here...
z
Do you see this issue deterministically each time you spin up server, or is this the first time you're encountering it?
b
Each time
Here is how I installed:
conda create -n prefect python
conda activate prefect
conda install -c conda-forge docker-compose prefect
Docker v19.03.12, docker and containerd are running. Here is the conda env: # Name Version Build Channel _libgcc_mutex 0.1 main appdirs 1.4.3 py_1 conda-forge asn1crypto 1.3.0 py38h32f6830_1 conda-forge attrs 19.3.0 py_0 conda-forge brotlipy 0.7.0 py38h1e0a361_1000 conda-forge ca-certificates 2020.6.20 hecda079_0 conda-forge cached-property 1.5.1 py_0 conda-forge certifi 2020.6.20 py38h32f6830_0 conda-forge cffi 1.14.0 py38he30daa8_1 chardet 3.0.4 py38h32f6830_1006 conda-forge click 7.1.2 pyh9f0ad1d_0 conda-forge cloudpickle 1.5.0 py_0 conda-forge croniter 0.3.30 py_0 conda-forge cryptography 3.0 py38h766eaa4_0 conda-forge cytoolz 0.10.1 py38h516909a_0 conda-forge dask-core 2.21.0 py_0 conda-forge distributed 2.21.0 py38h32f6830_0 conda-forge distro 1.5.0 pyh9f0ad1d_0 conda-forge docker-compose 1.26.2 py38h32f6830_0 conda-forge docker-py 4.2.2 py38h32f6830_0 conda-forge docker-pycreds 0.4.0 py_0 conda-forge dockerpty 0.4.1 py_0 conda-forge docopt 0.6.2 py_1 conda-forge heapdict 1.0.1 py_0 conda-forge idna 2.10 pyh9f0ad1d_0 conda-forge importlib-metadata 1.7.0 py38h32f6830_0 conda-forge importlib_metadata 1.7.0 0 conda-forge jsonschema 3.2.0 py38h32f6830_1 conda-forge ld_impl_linux-64 2.33.1 h53a641e_7 libedit 3.1.20191231 h14c3975_1 libffi 3.3 he6710b0_2 libgcc-ng 9.1.0 hdf63c60_0 libstdcxx-ng 9.1.0 hdf63c60_0 marshmallow 3.7.1 py_0 conda-forge marshmallow-oneofschema 2.0.1 py_0 conda-forge msgpack-python 1.0.0 py38hbf85e49_1 conda-forge mypy_extensions 0.4.3 py38h32f6830_1 conda-forge ncurses 6.2 he6710b0_1 openssl 1.1.1g h516909a_0 conda-forge packaging 20.4 pyh9f0ad1d_0 conda-forge pendulum 2.1.1 py38h32f6830_0 conda-forge pip 20.1.1 py38_1 prefect 0.12.4 py_0 conda-forge psutil 5.7.2 py38h1e0a361_0 conda-forge pycparser 2.20 pyh9f0ad1d_2 conda-forge pyopenssl 19.1.0 py_1 conda-forge pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge pyrsistent 0.16.0 py38h1e0a361_0 conda-forge pysocks 1.7.1 py38h32f6830_1 conda-forge python 3.8.3 hcff3b4d_2 python-box 4.2.3 py_0 
conda-forge python-dateutil 2.8.1 py_0 conda-forge python-dotenv 0.14.0 pyh9f0ad1d_0 conda-forge python-slugify 4.0.1 pyh9f0ad1d_0 conda-forge python_abi 3.8 1_cp38 conda-forge pytz 2020.1 pyh9f0ad1d_0 conda-forge pytzdata 2020.1 pyh9f0ad1d_0 conda-forge pyyaml 5.3.1 py38h1e0a361_0 conda-forge readline 8.0 h7b6447c_0 requests 2.24.0 pyh9f0ad1d_0 conda-forge ruamel.yaml 0.16.6 py38h1e0a361_1 conda-forge ruamel.yaml.clib 0.2.0 py38h1e0a361_1 conda-forge setuptools 49.2.0 py38_0 six 1.15.0 pyh9f0ad1d_0 conda-forge sortedcontainers 2.2.2 pyh9f0ad1d_0 conda-forge sqlite 3.32.3 h62c20be_0 tabulate 0.8.7 pyh9f0ad1d_0 conda-forge tblib 1.6.0 py_0 conda-forge text-unidecode 1.3 py_0 conda-forge texttable 1.6.2 py_0 conda-forge tk 8.6.10 hbc83047_0 toml 0.10.1 pyh9f0ad1d_0 conda-forge toolz 0.10.0 py_0 conda-forge tornado 6.0.4 py38h1e0a361_1 conda-forge unidecode 1.1.1 py_0 conda-forge urllib3 1.25.9 py_0 conda-forge websocket-client 0.57.0 py38h32f6830_1 conda-forge wheel 0.34.2 py38_0 xz 5.2.5 h7b6447c_0 yaml 0.2.5 h516909a_0 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.1.0 py_0 conda-forge zlib 1.2.11 h7b6447c_3
z
Thanks for sharing! Can you confirm that you're able to register a flow? I ask because it looks like the Postgres error could be a bit of a red herring, and I'm trying to determine if this is a Postgres issue or an issue communicating between the UI and the API.
b
I see. Give me a minute to try.
I can not register a flow. Here is the python output of flow.register():
Result check: OK
Flow: http://localhost:8080/flow/6c37641a-e35c-48a6-a594-93596955d0b1
But the flow does not appear on the dashboard and the url doesn't exist on the server
I don't think there are any new lines on the server console
z
Solid, this is actually super helpful. The fact that you got a link back indicates that you were able to persist a flow, but that your UI isn't properly configured to point to your API. Are you running everything on one machine?
b
Yes, from the same Conda env
n
@Burkhard Hoeckendorf do you mind taking a screenshot of your API Status tile? That'll help understand where it's trying to connect.
b
n
Perfect, thank you! Can you visit http://localhost:4200/graphql in your browser?
b
No
n
What does visiting that link show? Nothing at all or a message about a `GET` request?
b
Nothing at all.
n
Ok, that's helpful; that means the API isn't where it would normally be. Looking back at your stack trace, it looks like it's at http://0.0.0.0:4200/graphql. Are you able to visit that link in your browser?
b
Same result. I thought 0.0.0.0 is just localhost, but externally accessible. (I'm doing everything on the same machine).
I also didn't change the default config.
n
They're a little different but it looks like either way the API isn't available at that endpoint either, so whatever network Prefect Server is running on isn't available on your local machine
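A quick way to confirm this from outside the browser is to test whether anything accepts TCP connections on the ports in question. The following is only a minimal sketch: the helper name `port_is_open` is made up for illustration, port 4200 is taken from the Apollo log line above, and port 8080 from the UI log line.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 4200 is where Apollo says it is serving; 8080 is where the UI runs.
    for port in (4200, 8080):
        print(f"localhost:{port} reachable: {port_is_open('localhost', port)}")
```

If the UI port is reachable but 4200 is not, that would point at the API port not being exposed on the host rather than at a UI configuration problem.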
b
Ok, got it. Thanks for your help diagnosing this. How do we get to the bottom of it?
n
Are you running Prefect Server in an EC2 or some other remote environment?
b
No, it is all on my local machine.
Ok, I just noticed something funny. Even after `colda activate prefect`, `which python` points to the system python, and with this interpreter I can't even `import prefect`.
colda = conda
I can successfully execute a script that uses prefect (without server UI), but when I start python myself I can not import any prefect modules.
Not sure how much my local config matters though, considering the server is containerized.
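To see which interpreter is actually being used and whether it can find Prefect, a small check like the one below can be run with the `python` in question. It is a sketch using only the standard library; the function names are hypothetical.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def describe_interpreter() -> dict:
    """Collect the facts that show which environment this Python belongs to."""
    return {
        "executable": sys.executable,  # full path of the running interpreter
        "prefix": sys.prefix,          # root of the active environment
        "prefect_importable": module_available("prefect"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `executable` does not sit inside the Conda environment's directory, the environment is not the one providing `python` on the PATH.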
n
Hm, interesting. That shouldn't impact anything in this case, since this is a series of networked Docker containers.
b
ok
n
I'm at a little bit of a loss. If you're unable to reach the API through your browser by visiting the apollo address (localhost:4200 or 0.0.0.0:4200), I'm not surprised the UI isn't able to connect, because it's trying that same address
b
I'm not aware of anything that would prevent accessing this address.
Is there any way to double-check that the service is actually running, and would that be worth doing?
n
Well, as long as you've not exited the process you started with `prefect server start`, it should be running still
b
Ok
n
you can try to restart it by sigterming the process and starting it again
that might give you some insight if something's going wrong
b
Sure
Looks the same. First encountered error is:
hasura_1 | {"internal":"could not connect to server: Connection refused\n\tIs the server running on host \"postgres\" (172.31.0.2) and accepting\n\tTCP/IP connections on port 5432?\n","path":"$","error":"connection error","code":"postgres-error"}
Might be a race condition? This error appears before the first log output from Postgres.
n
Yeah that one isn't particularly concerning, the various services will retry their connections as other containers start up... hmm 🤔
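The retry behaviour described here (and visible in the Apollo log as "Trying again in 3 seconds...") follows a common pattern: attempt the connection, wait a fixed delay on failure, and try again. The sketch below illustrates that pattern in general terms; it is not the code the Prefect Server containers use, and the `retry` helper is hypothetical.

```python
import time


def retry(operation, attempts=5, delay=3.0, exceptions=(ConnectionError,)):
    """Call operation() until it succeeds or attempts run out.

    Waits `delay` seconds between failed attempts and re-raises the last
    error if every attempt fails. `attempts` must be at least 1.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

With this pattern, an early "connection refused" while the database container is still initializing is expected and harmless, as long as a later attempt succeeds.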
b
I'm in parallel trying another machine.
n
Smart, I wasn't able to replicate this from a fresh pull myself
b
Looks like this is running fine. Not sure what is going on here.
Whatever it is, seems to be a local issue though.
n
Meaning the second machine worked?
b
Yes
Haven't run anything yet, but dashboard shows no errors.
n
Oof 😓 my only recommendation in that case would be to stop the server, remove all the images (for example with docker system prune -a), and then reinstall Prefect in a new env and try again
b
Ok, I'll try that. If I was to stick to the other machine, can I use the server remotely, as in from the first machine?
I meant "stick with", not "stick to" 🙂
n
Totally, you’ll need to set the `graphql_url` variable under server.ui in that machine’s config.toml (pointing it to the machine the server is running on) and restart the server before you’ll be able to use the UI from your machine.
There are a bunch of other threads with info on that, but if you’re still having trouble after looking at those, feel free to ping us!
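As a rough illustration of what that setting could look like, the helper below renders the TOML fragment for a given server address. This is a sketch under assumptions: the `[server.ui]` section name follows the "server.ui variable" wording above, port 4200 and the `/graphql` path come from the URLs earlier in the thread, and the function name and the example address are hypothetical.

```python
def ui_config_snippet(server_host: str, port: int = 4200) -> str:
    """Render a config.toml fragment pointing the UI at a remote API.

    server_host is the address of the machine running Prefect Server,
    as seen from the browser that will load the UI.
    """
    url = f"http://{server_host}:{port}/graphql"
    return f'[server.ui]\ngraphql_url = "{url}"\n'


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder documentation address, not a real server.
    print(ui_config_snippet("192.0.2.10"))
```

The address has to be one the browser can reach, since the UI calls the API from the browser rather than from inside the Docker network.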
b
Ok. Thanks a lot @nicholas and @Zachary Hughes . Apologies that it ended up being a local issue.
z
All good! These things happen. Happy Prefect-ing!