prefect-community
  • Hui Zheng

    01/15/2021, 5:44 PM
    Hello, I have a quick question: can I turn off the schedule of a flow using the Prefect CLI?
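    (A hedged sketch, not from the thread: rather than the CLI, the 0.14-era Client can call the backend's set_schedule_inactive GraphQL mutation; the flow ID below is a placeholder.)
    from prefect import Client

    client = Client()
    client.graphql(
        """
        mutation {
          set_schedule_inactive(input: {flow_id: "<your-flow-id>"}) {
            success
          }
        }
        """
    )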
    3 replies · 2 participants
  • Riley Hun

    01/15/2021, 6:13 PM
    Hi everyone, I have a very intriguing question - I have registered my flow on a k8s prefect-server and it's using Dask Gateway as the executor. The flow is actually working just fine, to my surprise... However, I can't see the Dask cluster being created by Dask Gateway in the GCP UI. Therefore, there are no dask workers to execute the job, yet the job is still running fine? Is it even using the Dask executor I specified when I registered the flow?
    8 replies · 2 participants
  • Joseph

    01/15/2021, 8:38 PM
    Is there any way to have a Flow only run a portion of the DAG? And if so, is it possible to have a Schedule that only runs a portion of the DAG?
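    (A hedged sketch of one way to do this in 0.14-era Prefect, not an official recipe: gate the optional branch behind a Parameter with case, so a run, or a clock with parameter_defaults, can enable only part of the graph. All names here are illustrative.)
    from prefect import Flow, Parameter, case, task

    @task
    def extract():
        return [1, 2, 3]

    @task
    def heavy_post_processing(data):
        print(sum(data))

    with Flow("partial-dag") as flow:
        run_full = Parameter("run_full", default=False)
        data = extract()
        with case(run_full, True):  # this branch runs only when run_full is True
            heavy_post_processing(data)

    # flow.run()               # runs only `extract`
    # flow.run(run_full=True)  # runs the whole DAG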
    3 replies · 2 participants
  • Lucas Kjaero-Zhang

    01/15/2021, 8:40 PM
    Hi everyone, has anyone seen this error when trying to run a Kubernetes agent using RBAC?
    agent | Service token file does not exists. Using out of cluster configuration option.
    I've confirmed that the service account, role, and rolebinding all exist on the server. Here's a screenshot of the pod settings; attaching the rest in a thread.
    The issue was that the deployment did not automount the service account token; the deployment was created through Terraform, which defaulted it to false.
    5 replies · 2 participants
  • Billy McMonagle

    01/15/2021, 8:42 PM
    Hi there, I'm wondering if there is a recommended way to include SQL scripts as part of a flow? Perhaps using a custom base image with the SQL files already loaded?
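    (A hedged sketch of the custom-image idea, assuming 0.14-era Docker storage; the registry and paths are placeholders. The files argument copies local files into the image at build time, so the SQL ships with the flow.)
    from prefect.storage import Docker

    # `flow` is the Flow object being registered
    flow.storage = Docker(
        registry_url="registry.example.com/team",                # placeholder
        files={"/local/sql/report.sql": "/opt/sql/report.sql"},  # local -> image path
        env_vars={"SQL_DIR": "/opt/sql"},
    )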
    4 replies · 2 participants
  • jeff n

    01/15/2021, 11:16 PM
    Hello all. I have a flow that takes parameters such as a db table and pulls data from that table generically. How do I register the same flow to run in Prefect Cloud with different parameters? For instance, the flow pulls from projects with one set of runs and from accounts with another, but both sets of runs are the same flow code and need to run separately.
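    (A hedged sketch of one common answer, assuming the 0.14-era API: a single registered flow can carry several clocks, each pinning its own parameter defaults; the cron strings and project name are placeholders.)
    from prefect.schedules import Schedule
    from prefect.schedules.clocks import CronClock

    flow.schedule = Schedule(clocks=[
        CronClock("0 2 * * *", parameter_defaults={"table": "projects"}),
        CronClock("0 3 * * *", parameter_defaults={"table": "accounts"}),
    ])
    flow.register(project_name="my-project")  # placeholder project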
    7 replies · 2 participants
  • BK Lau

    01/16/2021, 12:31 AM
    Q: If I successfully run a flow/task locally and now want to run the same flow/task on a remote cluster, and I "register" the flow/task with the Prefect server, my understanding is that only the flow/task DAG metadata gets serialized to the Prefect server/database. My question is: how do the full flow/task code and its third-party dependency libraries get deployed or loaded on the remote cluster? Am I missing something here? Who deploys the code to the remote cluster?
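    (A hedged illustration of the usual answer: register() sends only metadata to the backend, while the flow code is serialized into the flow's Storage, here a Docker image that the remote agent pulls at run time, and third-party dependencies are baked into that same image. The registry and packages are placeholders.)
    from prefect.storage import Docker

    flow.storage = Docker(
        registry_url="registry.example.com/team",    # placeholder
        python_dependencies=["pandas", "requests"],  # pip-installed into the image
    )
    flow.register(project_name="my-project")  # pushes the image; only metadata goes to the server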
    6 replies · 2 participants
  • Riley Hun

    01/16/2021, 12:40 AM
    Hello all - I feel like I'm super close to getting my flow productionized and integrated with Dask Gateway. Unfortunately, I came across a PermissionError (see reply for the full stacktrace). I tried looking through the thread to see if others have encountered a similar error, but no luck. Any insight on this?
    3 replies · 1 participant
  • Sonny

    01/16/2021, 1:44 AM
    What's the right way to submit a Spark job from a Prefect task?
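    (A hedged sketch of one common pattern rather than an official recipe: shell out to spark-submit from a ShellTask; the master and script path are placeholders.)
    from prefect import Flow
    from prefect.tasks.shell import ShellTask

    spark_submit = ShellTask(name="spark-submit", return_all=True)

    with Flow("spark-job") as flow:
        spark_submit(
            command="spark-submit --master yarn --deploy-mode cluster /jobs/etl.py"
        )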
    2 replies · 2 participants
  • Hui Zheng

    01/16/2021, 3:35 AM
    Hello, Prefect Cloud seems unstable tonight. We have seen runs get stuck or fail to connect to the Prefect GraphQL API, and the Prefect dashboard has been loading very slowly for the past few hours.
    8 replies · 2 participants
  • Amanda Wee

    01/16/2021, 11:56 AM
    If the flow registration idempotency_key matches the one from the previous flow registration, the flow version is not bumped. However, does the serialised flow get uploaded to storage anyway? I'm too new to the codebase (and GraphQL in particular) to make sense of the details of the flow registration code.
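    (For context, a hedged sketch of the usual 0.14-era idempotency pattern: serialized_hash() changes only when the serialized flow changes, so re-registering with an unchanged hash skips the version bump. The project name is a placeholder.)
    flow.register(
        project_name="my-project",               # placeholder
        idempotency_key=flow.serialized_hash(),  # stable while the flow is unchanged
    )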
    7 replies · 2 participants
  • Tadas

    01/16/2021, 3:51 PM
    How can I give mapped tasks specific task_run_names?
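    (A hedged sketch, assuming a 0.14-era Cloud/Server backend, since custom run names only appear there: task_run_name accepts a template over the task's inputs, so each mapped child gets its own name.)
    from prefect import Flow, task

    @task(task_run_name="process-{item}")  # formatted from the task's inputs
    def process(item):
        return item * 2

    with Flow("named-mapped-runs") as flow:
        process.map(item=[1, 2, 3])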
    1 reply · 2 participants
  • Marwan Sarieddine

    01/16/2021, 4:55 PM
    Hi folks, a question about inspecting Prefect function tasks: is there a way to retrieve the arguments that are used to run the task? Please see a simple example in the thread.
    2 replies · 1 participant
  • jack

    01/18/2021, 2:30 AM
    Hey Prefect team, I had a question: when we pass data between flows or fetch data through tasks, is that data ever stored in Prefect's data centers or infrastructure, even for a small amount of time? I guess this is more of a concern for sensitive personal data (for issues like GDPR etc.). Cheers!
    2 replies · 2 participants
  • Aiden Price

    01/18/2021, 7:16 AM
    Hi Prefect people, I get an occasional error where my Kubernetes agent doesn't deploy a flow run. I'm using the Helm-deployed Prefect Server 0.14.3. I can see this error in the agent's logs:
    [2021-01-17 00:00:11,063] ERROR - Prefect-Kubed | Error while managing existing k8s jobs
    Traceback (most recent call last):
      File "/usr/local/.venv/lib/python3.8/site-packages/prefect/agent/kubernetes/agent.py", line 362, in heartbeat
        self.manage_jobs()
      File "/usr/local/.venv/lib/python3.8/site-packages/prefect/agent/kubernetes/agent.py", line 219, in manage_jobs
        event.last_timestamp
    TypeError: '<' not supported between instances of 'NoneType' and 'datetime.datetime'
    6 replies · 3 participants
  • Sven Teresniak

    01/18/2021, 9:21 AM
    Hi, after upgrading from 0.14.1 to 0.14.3 I get the error ValueError: Multiple flows cannot be used with the same resource block. In which direction should I search for a solution? We rely heavily on a ResourceManager being available to multiple flows…
    4 replies · 3 participants
  • Adam Roderick

    01/18/2021, 12:42 PM
    Hi! I am looking for a way in Prefect Cloud to report on successful/failed flow runs over the last week, month, etc. I want to filter by label and date range. Any suggestions? Thanks!
    2 replies · 2 participants
  • Matic Lubej

    01/18/2021, 1:46 PM
    Hello! I have a question about creating complex Prefect flows. Is there a Prefect-native way to achieve a nested tree-like process, where multiple tasks are spawned from single tasks and some of them are later merged again? I don't know how best to describe it, and I'm not sure how detailed my description should be on this channel, but perhaps this sketch helps
    4 replies · 2 participants
  • Matic Lubej

    01/18/2021, 2:05 PM
    And one other, I guess independent, thing: when I tried to run the [flat-mapping](https://docs.prefect.io/core/concepts/mapping.html#flat-mapping) example and visualized the graph with the flow state, I get the following
  • Matic Lubej

    01/18/2021, 2:06 PM
    I'm guessing that the parts on the right should come out of each specific B block?
  • Kieran

    01/18/2021, 2:40 PM
    Hey 👋 , I am new to Prefect Cloud and have hit a task-definition snag which I can't seem to solve. We have an ECS agent deployed on EC2 with the correct API tokens, and it shows up in our Cloud UI. We have an ECS cluster which we declare when launching the agent, but we leave the launch type as the default (Fargate). Our example flow is below:
    default_client = docker.from_env()
    FLOW_NAME = "hello-flow"
    flow_schedule = CronSchedule("0 8 * * *")
    flow_storage = Docker(
        base_url=default_client.api.base_url,
        tls_config=docker.TLSConfig(default_client.api.cert),
        registry_url="_________.dkr.ecr.eu-west-2.amazonaws.com/xxxxx/prefect"
    )
    flow_run_config = ECSRun(
        cpu="512",
        memory="512",
        run_task_kwargs={"requiresCompatibilities": ["FARGATE"], "compatibilities": ["FARGATE"]}
    )
    
    with Flow(
        name=FLOW_NAME,
        schedule=flow_schedule,
        storage=flow_storage,
        run_config=flow_run_config
        ) as flow:
        say_hello()
    
    if is_serializable(flow):
        flow.register(project_name="Test", registry_url=flow_storage)
    else:
        raise TypeError("Flow did not serialise.")
    We are getting the following error from our task logs:
    An error occurred (InvalidParameterException) when calling the RunTask operation: Task definition does not support launch_type FARGATE.
    In an attempt to resolve this issue I tried adding run_task_kwargs as above, but with no luck. Does anyone have any pointers? (I can see from the ECS task definition panel that Compatibilities is set to EC2 and Requires compatibilities is blank, and from this thread that could be the cause...)
    8 replies · 4 participants
  • Jeff Williams

    01/18/2021, 9:51 PM
    Can anyone tell me where to find information about Prefect and TLS support?
    7 replies · 2 participants
  • Sai Srikanth

    01/18/2021, 11:04 PM
    Hey Prefect team, I am trying to connect to AWS. I tried to integrate by passing ACCESS_KEY and SECRET_KEY, but I couldn't get it to work.
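    (A hedged sketch of the usual 0.14-era wiring, with placeholder key values: store the keys as a secret named AWS_CREDENTIALS, JSON with ACCESS_KEY and SECRET_ACCESS_KEY, which Prefect's AWS tasks and S3 storage/results read. Locally it can be supplied through an environment variable.)
    import json
    import os

    # Local/dev only; in Prefect Cloud this would be a Secret named AWS_CREDENTIALS.
    os.environ["PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS"] = json.dumps(
        {"ACCESS_KEY": "<your-access-key>", "SECRET_ACCESS_KEY": "<your-secret-key>"}
    )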
    4 replies · 3 participants
  • Felix Vemmer

    01/18/2021, 11:57 PM
    Hi everyone, I am getting the following error when trying to write a pandas DataFrame to Google Cloud Storage:
    Unexpected error: TypeError("__init__() got an unexpected keyword argument 'client_options'")
    Traceback (most recent call last):
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
        new_state = method(self, state, *args, **kwargs)
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 891, in get_task_run_state
        result = self.result.write(value, **formatting_kwargs)
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 77, in write
        self.gcs_bucket.blob(new.location).upload_from_string(binary_data)
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 41, in gcs_bucket
        client = get_storage_client()
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/utilities/gcp.py", line 53, in get_storage_client
        return get_google_client(storage, credentials=credentials, project=project)
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/prefect/utilities/gcp.py", line 31, in get_google_client
        client = Client(project=project, credentials=credentials)
      File "/Users/felixvemmer/.pyenv/versions/3.8.6/envs/automation_beast/lib/python3.8/site-packages/google/cloud/storage/client.py", line 122, in __init__
        super(Client, self).__init__(
    TypeError: __init__() got an unexpected keyword argument 'client_options'
    I am running a task that returns a pd.DataFrame, which I am trying to store in Google Cloud Storage:
    pandas_serializer = PandasSerializer(
        file_type='csv'
    )
    
    gcs_result = GCSResult(
        bucket='tripliq-data-lake',
        serializer=pandas_serializer,
        location=f'linkedin_top_posts/{datetime.datetime.now().strftime("%Y%m%d-%H%M%S")}_linkedin_post_likes.csv'
    )
    
    like_linkedin_feed = LikeLinkedInFeed(
        result=gcs_result
    )
    I don't understand the source code too well, but I think it's referring to this signature in site-packages/google/cloud/storage/client.py:
    def __init__(
        self,
        project=_marker,
        credentials=None,
        _http=None,
        client_info=None,
        client_options=None,
    ):
    Any help is very much appreciated!
    1 reply · 2 participants
  • Alex Rud

    01/19/2021, 3:42 AM
    Hi… Is there a way to map over a dataframe without producing intermediate objects that bloat memory?
    2 replies · 1 participant
  • Greg Roche

    01/19/2021, 2:45 PM
    Hi folks, does anyone have any experience with this error when running a LocalDaskExecutor flow, using a LocalAgent running inside a Docker image?
    TypeError: start() missing 1 required positional argument: 'self'
    Edit: solved, I wasn't initialising the LocalDaskExecutor:
    flow.executor = LocalDaskExecutor  # wrong
    flow.executor = LocalDaskExecutor()  # this works
    3 replies · 2 participants
  • Joël Luijmes

    01/19/2021, 3:29 PM
    How can I distinguish at run time whether I'm running the flow directly or from the agent/server? Case: when running in production (i.e. from the Kubernetes agent) it should perform a task (actually a resource manager) which deploys a Cloud SQL proxy and connects through that proxy, but while developing I want to connect to a localhost database. In theory it's fine if I deploy the Cloud SQL proxy (resource manager) while developing the flow, but I still need to know at runtime which hostname to use to connect to the database.
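    (A hedged sketch of one common heuristic, not an official API contract: a backend-orchestrated run populates prefect.context with a flow_run_id, so its absence can mark a plain local flow.run(). The hostnames are placeholders.)
    import prefect

    def database_host() -> str:
        # flow_run_id is set when the run is orchestrated by the server/agent
        if prefect.context.get("flow_run_id"):
            return "cloudsql-proxy"  # deployed: go through the proxy
        return "localhost"           # local development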
    3 replies · 2 participants
  • Pedro Machado

    01/19/2021, 4:07 PM
    Hi everyone. I tried to set up the Docker agent with supervisord a while ago and was getting a permission error when trying to access the Docker engine; I never got it to work. I'd like to run the Docker agent inside a container managed by docker-compose. I have a couple of questions about this setup: 1. I am setting the restart policy to always. Would this be enough to restart the agent if it failed, or in case the host restarts? 2. What is the best way to give the agent access to the Docker engine running on the host? Thank you!
    2 replies · 2 participants
  • SK

    01/19/2021, 5:28 PM
    Could someone please share an AWS setup for Prefect? I did "pip install prefect" but am still getting the error below.
  • SK

    01/19/2021, 5:28 PM
    [xxxxxserverxxxxxx]$ python prefect_sample.py
    Traceback (most recent call last):
      File "prefect_sample.py", line 5, in <module>
        import prefect
    ImportError: No module named prefect
    16 replies · 3 participants

Michael Adkins

01/19/2021, 5:35 PM
Hi! Could you share the output of pip --version, python --version, and which prefect?

SK

01/19/2021, 5:41 PM
pip3 --version
pip 9.0.3 from /usr/lib/python3.7/site-packages (python 3.7)
python3 --version
Python 3.7.9
Not decided yet about Prefect Server or Cloud... setting up for the first time in AWS.

Michael Adkins

01/19/2021, 5:47 PM
Could you run the command which prefect? And pip show prefect?

SK

01/19/2021, 5:50 PM
ok checking
This is what I see
pip3 show prefect
Name: prefect
Version: 0.14.3
Summary: The Prefect Core automation and scheduling engine.
Home-page: https://www.github.com/PrefectHQ/prefect
Author: Prefect Technologies, Inc.
Author-email: help@prefect.io
License: Apache License 2.0
Location: /usr/local/lib/python3.7/site-packages
Requires: pyyaml, tabulate, marshmallow-oneofschema, cloudpickle, msgpack, dask, pytz, marshmallow, urllib3, requests, croniter, distributed, docker, python-slugify, toml, click, pendulum, mypy-extensions, python-box, importlib-resources, python-dateutil

Michael Adkins

01/19/2021, 5:53 PM
So here you're referencing pip3 and python3, but in the command above you just used python to run your script. The package looks to be installed just fine within Python 3.

SK

01/19/2021, 5:53 PM
let me try python3
Michael, need some help
prefect server start
Exception caught; killing services (press ctrl-C to force)
Traceback (most recent call last):
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/prefect/cli/server.py", line 347, in start
    ["docker-compose", "pull"], cwd=compose_dir_path, env=env
  File "/usr/lib64/python3.7/subprocess.py", line 358, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib64/python3.7/subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib64/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'docker-compose': 'docker-compose'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/enviroment_name/bin/prefect", line 11, in <module>
    sys.exit(cli())
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ec2-user/enviroment_name/lib/python3.7/site-packages/prefect/cli/server.py", line 385, in start
    ["docker-compose", "down"], cwd=compose_dir_path, env=env
  File "/usr/lib64/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib64/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'docker-compose': 'docker-compose'
Getting this error.

Greg Roche

01/19/2021, 8:54 PM
The error seems pretty clear: docker-compose couldn't be found on the system, so it needs to be installed, or at least be on the system path, so that Prefect can invoke it. FWIW, the "installation" section of the docs does mention that both Docker and Docker Compose must be installed in order to run Prefect Server. https://docs.prefect.io/core/getting_started/installation.html#running-the-local-server-and-ui

Michael Adkins

01/19/2021, 10:03 PM
I would highly recommend checking out Prefect Cloud first — it’s free and running Prefect Server is pretty complicated.