# ask-marvin
Thanks for all the detail, super helpful. Given that:
- The first task (`get_destination_invoices`) completes and the second (`get_source_invoices`) hangs with no logs, and
- Things worked until you restarted Ray on the VM,

the most likely culprit is the Ray worker environment after the restart.

In practice, Oracle client connections (OCI/Autonomous DB) usually depend on local env vars and files (wallet, `TNS_ADMIN`, `LD_LIBRARY_PATH`, Instant Client location, etc.). When Ray is restarted (often via systemd or a different login context), those env vars and file paths are not inherited by the Ray worker processes, even if they exist for your interactive venv. That can leave tasks stalled inside the DB connect without ever emitting logs.

What to check first
- Verify Ray connectivity and resources:
  - Run `ray status` and check the Ray dashboard; ensure there's free CPU and your task isn't stuck pending resources.
  - Look at the logs on the VM under `/tmp/ray/session_latest/logs/` (Ray's default location, unless a custom temp dir was configured) for errors in the worker processes.
- Confirm the environment Ray workers see:
  - Workers inherit the environment of the process that started Ray, not your interactive shell's. If you rely on `TNS_ADMIN`, `LD_LIBRARY_PATH`, `PATH` pointing to Instant Client, wallet files, or CA paths, those must be present in the Ray worker environment.

A quick diagnostic task
Run this under your existing `RayTaskRunner` to confirm what the Ray workers see:
from prefect import task

@task
def debug_env():
    import os
    try:
        import oracledb
        oracle_ok = True
        oracle_ver = getattr(oracledb, "__version__", "unknown")
    except Exception as e:
        oracle_ok = False
        oracle_ver = str(e)
    return {
        "TNS_ADMIN": os.getenv("TNS_ADMIN"),
        "LD_LIBRARY_PATH": os.getenv("LD_LIBRARY_PATH"),
        "PATH": os.getenv("PATH"),
        "ORACLE_IMPORT": oracle_ok,
        "ORACLE_VER_OR_ERROR": oracle_ver,
    }
If `TNS_ADMIN` is empty or the `oracledb` import fails in this task, that's your smoking gun.

How to fix it

Option A: pass env to Ray via Prefect's RayTaskRunner
The `RayTaskRunner` lets you pass Ray's `runtime_env` via `init_kwargs`. Use this to propagate the required env vars and, if needed, install deps.
from prefect import flow
from prefect_ray.task_runners import RayTaskRunner

ray_runner = RayTaskRunner(
    # If connecting to a remote Ray head, include its address; otherwise a local Ray will be started
    # address="ray://<head-node-ip>:10001",
    init_kwargs={
        "runtime_env": {
            "env_vars": {
                "TNS_ADMIN": "/opt/oracle/wallet",
                "LD_LIBRARY_PATH": "/opt/oracle/instantclient_21_11",
                "PATH": "/opt/oracle/instantclient_21_11:$PATH",
                # any other env vars your Oracle client needs
            },
            # If workers may not have the packages, you can also do:
            # "pip": ["oracledb==<version>", "prefect==2.*", "prefect-ray"],
            # "working_dir": "/path/to/your/code"  # if needed
        }
    },
)

@flow(task_runner=ray_runner)
def your_flow():
    ...
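To check that the variables actually reached the workers, the dict returned by `debug_env` can be inspected on the driver. This is a minimal sketch that assumes the dict keys from the diagnostic task above; the helper name `find_env_problems` and its `need_thick_mode` flag are hypothetical and not part of Prefect, Ray, or oracledb.

```python
def find_env_problems(report, need_thick_mode=True):
    """Return a list of human-readable problems found in a debug_env() result.

    `report` is assumed to have the keys produced by the debug_env task:
    TNS_ADMIN, LD_LIBRARY_PATH, PATH, ORACLE_IMPORT, ORACLE_VER_OR_ERROR.
    """
    problems = []

    # The import check comes first: without the driver nothing else matters.
    if not report.get("ORACLE_IMPORT"):
        problems.append(
            f"oracledb import failed: {report.get('ORACLE_VER_OR_ERROR')}"
        )

    # A wallet-based connection needs TNS_ADMIN to locate the wallet files.
    if not report.get("TNS_ADMIN"):
        problems.append("TNS_ADMIN is not set in the worker")

    # Only thick mode (Instant Client) depends on the shared-library path.
    if need_thick_mode and not report.get("LD_LIBRARY_PATH"):
        problems.append("LD_LIBRARY_PATH is not set in the worker")

    return problems
```

Inside a flow, something like `find_env_problems(debug_env.submit().result())` should return an empty list once the environment is propagated correctly; pass `need_thick_mode=False` if you only use thin mode.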
Option B: start Ray with the correct environment
If you manage Ray with a systemd service or a startup script, export the variables there so Ray's processes inherit them after a restart, e.g. in the systemd unit:
- Add `Environment=` lines for `TNS_ADMIN`, `LD_LIBRARY_PATH`, and `PATH`
- Ensure the wallet directory and Instant Client are readable by the user running Ray

Restart Ray after setting these and verify with the `debug_env` task above.

Extra safeguards
- Add timeouts to your DB code so a broken network/client doesn't hang forever: pass `tcp_connect_timeout=30` (seconds) to `oracledb.connect(...)` and/or set `conn.call_timeout = 30000` (milliseconds) before executing queries.
- Try running the flow with the default task runner (no Ray) to confirm the DB code itself is healthy.
- Make sure you're passing `address="ray://<host>:10001"` if you intend to use a remote Ray head node; after a restart, the address/port might differ if the startup command changed.
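The timeout safeguard can be sketched as a small wrapper. It assumes python-oracledb's `tcp_connect_timeout` connect parameter (seconds) and the connection's `call_timeout` attribute (milliseconds); the name `connect_with_timeouts` and its defaults are hypothetical. The connect function is passed in rather than imported, so the wrapper can be exercised without a database.

```python
def connect_with_timeouts(
    connect,
    *,
    connect_timeout_s=30.0,
    call_timeout_ms=30_000,
    **connect_kwargs,
):
    """Open a DB connection that fails fast instead of hanging silently.

    `connect` is a callable such as `oracledb.connect` (an assumption: it must
    accept `tcp_connect_timeout` and return an object with `call_timeout`).
    """
    # Bound the time spent establishing the TCP connection to the DB host.
    conn = connect(tcp_connect_timeout=connect_timeout_s, **connect_kwargs)

    # Bound each subsequent round trip to the database, in milliseconds.
    conn.call_timeout = call_timeout_ms
    return conn
```

In a task this would look like `connect_with_timeouts(oracledb.connect, user=..., password=..., dsn=...)`, so a misconfigured worker raises an error you can see in the logs rather than stalling.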
A couple of targeted questions to confirm the path forward
- How are you starting Ray on the VM (systemd service, manual `ray start`, Docker)? Can you share that command or unit file?
- Are you using Oracle Instant Client thick mode or thin mode with a wallet? Where is the wallet located on disk?
- Do you set `TNS_ADMIN`, `LD_LIBRARY_PATH`, etc. in your shell profile only, or are they guaranteed for the Ray service user too?
- What versions are you on for Prefect 2, `prefect-ray`, `ray`, and `oracledb`?

Docs for reference
- Prefect 2 Ray task runner: RayTaskRunner (Prefect 2 docs)
- Ray runtime environments: Ray runtime_env docs

If you can share the Ray start method and a `debug_env` task result, I can help you pin down the exact env deltas and give a concrete patch.