Vander
09/15/2025, 4:18 PMMarvin
09/15/2025, 4:18 PMMarvin
09/15/2025, 4:23 PM
pip install playwright succeeds.
Recommended approaches:
- Docker work pool with a Playwright base image (best): Build an image from mcr.microsoft.com/playwright/python, add your code/deps, and run in a Docker work pool.
- Kubernetes work pool: Same idea; use a Playwright-enabled image.
- Process/EC2 worker: Run a pull worker on infrastructure where Playwright is already installed.
Example (Prefect 3.x) using a Docker work pool:
- Dockerfile
# Optional: pin a specific version tag instead of latest for stability
FROM mcr.microsoft.com/playwright/python:latest
# Keep the pip-installed playwright version in step with the browsers bundled in the image
RUN pip install prefect playwright
# If you need browsers explicitly (the base image already has them)
# RUN python -m playwright install --with-deps chromium
# Add your code
WORKDIR /app
COPY . /app
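- Deploy your flow to the Docker work pool (Python API)
from prefect import flow

@flow
def my_playwright_flow():
    # your Playwright code here
    pass

if __name__ == "__main__":
    # Assuming you've created a Docker work pool named "docker-pool".
    # from_source() also needs an entrypoint; "flows.py" is a placeholder
    # for the file (relative to the source root) that defines this flow.
    my_playwright_flow.from_source(
        source=".",
        entrypoint="flows.py:my_playwright_flow",
    ).deploy(
        name="playwright-deploy",
        work_pool_name="docker-pool",
        job_variables={"image": "your-registry/your-playwright-image:tag"},
    )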
If you absolutely must try Managed, the only lever you have is pip_packages and hoping Playwright can download a browser at runtime. However, the required system libraries won’t be present, so this is not a supported or reliable path.
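To see whether system libraries really are missing in a given environment, a flow could run a small diagnostic first. The following is a minimal sketch under stated assumptions: the library names in ASSUMED_CHROMIUM_LIBS are an illustrative, non-exhaustive guess at what Chromium needs on Linux, and missing_libraries is a hypothetical helper, not part of Prefect or Playwright. Note that ctypes.util.find_library depends on tools such as ldconfig being available, so a miss is a hint rather than proof.

```python
from ctypes.util import find_library

# Assumption: an illustrative, non-exhaustive sample of shared libraries
# that Chromium typically links against on Linux.
ASSUMED_CHROMIUM_LIBS = ["nss3", "gbm", "asound", "atk-1.0"]


def missing_libraries(names=ASSUMED_CHROMIUM_LIBS):
    """Return the names that the system's library lookup cannot resolve."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing system libraries:", ", ".join(missing))
    else:
        print("All checked libraries were found.")
```

If the list comes back non-empty, a locally launched browser is unlikely to start, which points back to the Docker or Kubernetes options above.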
Docs and references:
- Managed work pools overview and limits: Managed work pools (Prefect 3)
- Work pools (Docker/Kubernetes) management: Manage work pools
- Background on work pools and when to use pull vs. managed: Work pools and workers blog
If you share which environment you prefer (Docker, K8s, or a process worker) and where you host images, I can give you a tailored Dockerfile and deployment steps.
Vander
09/15/2025, 4:26 PMMarvin
09/15/2025, 4:26 PMMarvin
09/15/2025, 4:26 PM
from prefect import flow
from playwright.sync_api import sync_playwright

@flow
def run_playwright():
    with sync_playwright() as p:
        # Browserless or your own Chrome with remote debugging enabled
        browser = p.chromium.connect_over_cdp("wss://your-browser-host.example/ws?token=...")
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto("https://example.com")
        print(page.title())
Deployment job variables for Managed pool could include:
{
  "pip_packages": [
    "playwright"
  ]
}
- Selenium to a cloud/grid
from prefect import flow
from selenium import webdriver
from selenium.webdriver.common.by import By

@flow
def run_selenium():
    options = webdriver.ChromeOptions()
    driver = webdriver.Remote(
        command_executor="https://<user>:<key>@hub.browserstack.com/wd/hub",
        options=options,
    )
    try:
        driver.get("https://example.com")
        print(driver.title)
    finally:
        # Always release the remote session, even if the page load fails
        driver.quit()
Deployment job variables:
{
  "pip_packages": [
    "selenium"
  ]
}
Notes
- Store tokens/keys for remote services in Prefect Secret blocks or environment variables, not in code. Prefect variables are not meant for sensitive values.
- Managed pools support pip packages but not system packages or custom images. If you ever need to run browsers locally, switch to a Docker/Kubernetes work pool.
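As a rough illustration of keeping the token out of code, the remote browser endpoint can be assembled at runtime from an environment variable. This is a minimal sketch, not a Prefect or Playwright API: build_cdp_endpoint and the BROWSER_WS_TOKEN variable name are hypothetical, and it assumes the remote service accepts the token as a "token" query parameter, as in the placeholder URL above.

```python
import os
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_cdp_endpoint(base_url: str, token_env: str = "BROWSER_WS_TOKEN") -> str:
    """Return base_url with the token from the environment as its query string."""
    token = os.environ.get(token_env)
    if not token:
        # Fail early with a clear message instead of connecting unauthenticated.
        raise RuntimeError(f"Set {token_env} before running the flow")
    parts = urlsplit(base_url)
    query = urlencode({"token": token})  # urlencode escapes special characters
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

The result could then be passed to connect_over_cdp in place of a hard-coded URL, for example build_cdp_endpoint("wss://your-browser-host.example/ws").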
Docs on Managed pool capabilities/limits: Managed work pools (Prefect 3)
If you tell me whether you prefer a hosted service or self-hosting a small browser container, I can share a tailored setup and deployment snippet.