prefect-community
  • s

    Sean Talia

    01/28/2021, 5:30 PM
    when using the ECS runconfig, is one of the 3 methods for attaching the ECS task (task_definition_arn, task_definition_path, task_definition) to the flow considered to be canonical? i'm trying to weigh the pros and cons of having some other process like terraform handle management of the task definition (thereby only leaving the ARN to be supplied to prefect) vs. using prefect to handle that
    j
    • 2
    • 13
  • j

    Josh Greenhalgh

    01/28/2021, 7:09 PM
    Is there any movement towards having a more Airflow-like backfilling feature? It's literally the one thing I miss. I would really love to be able to set a start date for a schedule in the past and have the flows catch up to the present. Currently I think the only way to achieve this is to use parameters to differentiate the past runs and to manually start all the backfill flows. Is there perhaps a better way currently?
    j
    • 2
    • 7
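A minimal sketch of the manual workaround described in the message above, assuming a fixed interval between runs. It only computes the past run times; how each one is submitted as a flow run is left out, and the parameter name `run_date` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterator


def backfill_dates(start: datetime, end: datetime, interval: timedelta) -> Iterator[datetime]:
    """Yield each scheduled time from start (inclusive) up to end (exclusive)."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    current = start
    while current < end:
        yield current
        current += interval


# Each dict could be passed as the parameters of one manually started run.
# "run_date" is a hypothetical parameter name, not something Prefect defines.
runs = [
    {"run_date": moment.isoformat()}
    for moment in backfill_dates(
        datetime(2021, 1, 25), datetime(2021, 1, 28), timedelta(days=1)
    )
]
```

Keeping the date in a parameter means the same flow code serves both the backfill runs and the regular scheduled runs.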
  • r

    raaid

    01/28/2021, 9:39 PM
    Hello! I am creating a proof of concept for my team using Prefect. I am not sure how to implement something, hope this is the right place to ask. I am stringing together 3 tasks in a flow. I want them to run one after the other, but none of them output anything (they interact with external services). Can I do this using the functional API, or do I have to use the Imperative API since there is no data dependency that I pass along?
    j
    • 2
    • 3
  • f

    Felipe Saldana

    01/28/2021, 9:57 PM
    If anyone could provide any guidance that would be great. I am trying to pass an instance of a custom task to a “sub flow”. Just wanted to know if that is even possible. My main flow internally calls StartFlowRun for another flow which can take parameters/results from the main flow. The dependencies and DAG look fine in the UI. If I pass a string for "author_info" my subflow/subtask gets that string. If I try to pass author_instance I get an exception:
    Unexpected error: TypeError("Object of type 'TestTask' is not JSON serializable",)
    TestTask is a custom task type I created and the run method returns a custom return type I created as well.
    f
    • 2
    • 4
  • a

    Adam Brusselback

    01/28/2021, 11:43 PM
    Hey all, hopefully this isn't just noise. I need to integrate workflow management software into my multi-tenant SaaS software, and I'm trying to figure out if Prefect is an okay match for my use case. I have a preset list of flows/tasks I want each tenant to be able to schedule / run independently. Each of these flows needs to be parameterized or use context so I can specify things like "sftp connection info" and "db connection info", which will differ per tenant. Some of these flows must be limited to concurrency=1-per-tenant, just to throw a wrench into things.
    n
    • 2
    • 5
  • j

    Jan Marais

    01/29/2021, 6:05 AM
    I'm trying to sign in to Prefect Cloud with GitHub. Worked yesterday but now getting this screen:
    n
    • 2
    • 4
  • d

    DJ Erraballi

    01/29/2021, 6:42 AM
    is there a way to dynamically assign resources to a given triggered flow
  • t

    Tim Pörtner

    01/29/2021, 9:33 AM
    hi everyone, i thought you might enjoy the automatically generated name of our flow that just happened to fail. we don't know what happened, but my colleague @Greg Roche fixed it by re-registering it
    😄 5
    😂 12
    g
    • 2
    • 1
  • b

    Borut Hafner

    01/29/2021, 1:29 PM
    Hello All, I want to share one sqlalchemy engine throughout the flow, to avoid code duplication. My simple solution was to make a task which returns the sqlalchemy engine object as its result. It works when executing it locally, but returns an error when executing it on the server (cannot pickle local_object). Is there a solution to this problem, or is there a best practice for sharing a connection in a flow? Thanks!
    🙌 1
    👀 2
    a
    • 2
    • 3
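A small sketch of why returning a connection object from a task tends to fail once results have to be pickled, and one common way around it. It uses the standard library's `sqlite3` as a stand-in for the SQLAlchemy engine (an assumption made so the example is self-contained); the helper names are hypothetical.

```python
import pickle
import sqlite3


def is_picklable(obj: object) -> bool:
    """Report whether obj survives pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


def open_connection(database: str) -> sqlite3.Connection:
    """Build the connection inside the task that needs it."""
    return sqlite3.connect(database)


def count_rows(database: str) -> int:
    # Only the plain connection string crosses the task boundary,
    # so nothing unpicklable has to be passed between tasks.
    conn = open_connection(database)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        conn.close()
```

The duplication stays small because every task calls the same `open_connection` helper instead of receiving a live connection as an input.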
  • a

    Alex Furrier

    01/29/2021, 3:57 PM
    Hello, I’m running into a memory issue with a mapped task that computes an array and then writes it to a DB. I think what is happening is that all the arrays (which can be quite large) are being held in memory. Since I’m writing these arrays to a DB I don’t need them to remain in memory after the mapped task has a Success state. I’m running these flows on a kubernetes container, and after X number of large arrays the memory request is exceeded and the pod is evicted. No other downstream tasks are dependent on the completion of the mapped tasks. Is there a way to tell Prefect to dump the task result cache after it has been mapped successfully? Pseudo code:
    from prefect import Flow, task


    @task
    def get_array_input(db_client, query):
        return db_client.query(query)


    @task
    def compute_array_write_to_db(input):
        # Pseudocode: fetch one huge array, then persist it
        array = requests.get_huge_array()
        db_client.write(array)

    with Flow("generate-and-write-arrays") as generate_and_write_arrays:
        lots_of_inputs = get_array_input(db_client, query)

        # map is called on the task, with the upstream result as its argument
        compute_array_write_to_db.map(lots_of_inputs)

    # Crashes after certain amount of arrays
    # due to memory limit
    generate_and_write_arrays.run()
    j
    • 2
    • 2
  • j

    jeff n

    01/29/2021, 4:33 PM
    I am trying to think through a rapid pipeline without the overhead of Kubernetes or Docker images. When I register a flow, how does Prefect know what code to register? Could I do something like:
    import os

    with Flow("Example") as flow:
        ...

    if __name__ == "__main__":
        # os.getenv returns a string (or None), so compare against a string
        if os.getenv("production") == "true":
            flow.schedule = my_scheduler
        flow.register()
    If I register a flow with that in the main, would I be able to control whether it has a schedule?
    n
    • 2
    • 2
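One detail worth spelling out for the environment check in the message above: `os.getenv` returns a string or `None`, so a comparison against the boolean `True` never matches. A minimal sketch of a flag parser follows; the helper name `env_flag` and the set of accepted spellings are assumptions.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = os.getenv(name)
    if raw is None:
        return default
    # Accepted spellings are a choice made for this sketch.
    return raw.strip().lower() in {"1", "true", "yes", "on"}
```

With a helper like this, the schedule would only be attached when the variable is set to one of the accepted values.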
  • l

    Levi Leal

    01/29/2021, 4:45 PM
    I managed to deploy to ECS a task that runs a local agent. The problem is that this agent is not able to run flows stored with Docker. I tried to change the agent on ECS to a Docker agent, but it doesn't run. Is it possible to deploy an agent that runs Docker on ECS? I don't want just to execute the flows on ECS. I need the agent itself to be running on ECS.
    m
    b
    • 3
    • 9
  • j

    Josh

    01/29/2021, 5:06 PM
    Has anyone seen this error before? It’s marked as Critical from the CloudHandler
    Failed to write log with error: HTTPSConnectionPool(host='api.prefect.io', port=443): Max retries exceeded with url: /graphql (Caused by ProtocolError('Connection aborted.', BrokenPipeError(32, 'Broken pipe')))
    In Prefect it says the task failed, but it still seems to be executing in the agent. There were logs before from the agent to the cloud, but no logs afterwards
    n
    • 2
    • 1
  • k

    Kieran

    01/29/2021, 5:24 PM
    Has anyone had an issue with Prefect not doing what the docs describe here?
    "Prefect automatically gathers mapped results into a list if they are needed by a non-mapped task"
    I have two mapped tasks, one feeding the other, then a third non-mapped task which expects a list. The logger context doesn't appear to be getting passed, so I can't get any insight. The UI is showing the child mapped tasks and I have logged out their content, but their results are not being gathered together as suggested in the docs. Any pointers to help would be amazing!
    n
    • 2
    • 13
  • r

    Raphaël Riel

    01/29/2021, 7:25 PM
    Prefect Team! A couple of times now I have been disconnected from Prefect Cloud and sent back to login, where I select Google Sign-In. A) The first time, I land on some Okta page (see screenshot). B) Then I have to click on the “Prefect” “Work” node. C) Back to universal.prefect.io. D) Click on Google again. E) Now I get Google’s prompt to select which account to use (Personal, Work, etc.). F) Then I’m logged in. Any idea? Can this be related to https://prefect-community.slack.com/archives/CL09KU1K7/p1611786977326500 ?
    n
    m
    b
    • 4
    • 26
  • j

    Josh

    01/30/2021, 3:32 AM
    Got this error. Wondering what the largest entity is. I can trim the string down if needed, but just wondering.
    Failed to write log with error: 413 Client Error: Request Entity Too Large for url: https://api.prefect.io/graphql
    c
    • 2
    • 1
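The thread does not state the actual size limit behind the 413 response, so the sketch below takes the limit as a caller-supplied assumption. It shows one way to trim a log string so that its UTF-8 encoding stays under a byte budget; the function name and the marker text are hypothetical.

```python
def truncate_utf8(message: str, max_bytes: int, marker: str = "... [truncated]") -> str:
    """Return message cut so that its UTF-8 encoding fits within max_bytes."""
    encoded = message.encode("utf-8")
    if len(encoded) <= max_bytes:
        return message
    marker_bytes = marker.encode("utf-8")
    if max_bytes <= len(marker_bytes):
        raise ValueError("max_bytes is too small to hold the truncation marker")
    budget = max_bytes - len(marker_bytes)
    # errors="ignore" drops a multi-byte character that the cut split in half.
    head = encoded[:budget].decode("utf-8", errors="ignore")
    return head + marker
```

Measuring bytes rather than characters matters because the request size depends on the encoded payload, not on the character count.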
  • p

    Peter Roelants

    01/30/2021, 8:36 AM
    Hi Prefect, How can I mount a predefined Docker volume when using the Docker Agent? For example, I have a volume named "test-volume", created with
    docker volume create test-volume
    If I try to mount the named Docker volume "test-volume" with
    prefect agent docker start --volume test-volume:$DOCKER_DIR ...
    the named volume is never mounted into the runtime started by the agent; a new volume seems to be created and mounted each time.
    • 1
    • 3
  • j

    jspeis

    01/31/2021, 1:43 AM
    Hi, looking at Prefect vs Airflow. AWS' MWAA has a mechanism where you can upload a DAG to a designated s3 bucket and it automatically registers with Airflow. I've read a little about the file based flow idioms in prefect, but is there any mechanism to point at say github repo and automatically register the flows? I know it's just one step but if I were to have a lot of flows I'd love to be able to just store my flows in git, point prefect at my repo and have it manage the registrations based on my source
    a
    d
    • 3
    • 6
  • r

    Ryan Kelly

    01/31/2021, 3:29 PM
    has anyone else had issues with the sample code hanging from the flow-to-flow documentation? https://docs.prefect.io/core/idioms/flow-to-flow.html
    d
    • 2
    • 34
  • a

    Anatoliy Zhyzhkevych

    01/31/2021, 9:51 PM
    👋 I’m here! What’d I miss?
    a
    • 2
    • 1
  • p

    Peyton Runyan

    01/31/2021, 10:58 PM
    Is there a good way to parameterize an environment variable in a flow? Below is my user config. I'd love to be able to pass mode as a parameter instead of messing with it in the config and re-registering my flow.
    mode = "hard_coded_mode"
    [sql_server]
        server = "server"
        driver = "driver"
        dsn = "MYMSSQL"
        user = "${sql_server.${mode}.user}"
        database = "${sql_server.${mode}.database}"

        [sql_server.dev]
            user = "dev user"
            database = "dev db"

        [sql_server.hub]
            user = "app user"
            database = "app db"

        [sql_server.prod]
            user = "prod user"
            database = "prod db"
    k
    • 2
    • 2
  • y

    Yash Bhandari

    02/01/2021, 1:34 AM
    Sorry if this is a dumb question but what's the simplest way to run an agent in the background? I've just been using a screen session.
    s
    a
    • 3
    • 5
  • m

    Michael Hadorn

    02/01/2021, 8:25 AM
    Hi there. If I pass an object to a task (with the Dask runtime), what exactly will happen to this object? I was thinking it would be serialized and copied for every task, but it seems that some logic is also run. Is this process documented anywhere? I can give more details if needed.
    ✅ 1
    d
    • 2
    • 4
  • a

    Adam

    02/01/2021, 12:15 PM
    Hello friends, our CI process is suddenly failing due to errors in importing GCP / BigQuery / Google Core dependencies. We haven’t changed anything on our side, so I'm curious to know if anyone else is suddenly seeing this issue. Details are in the thread. Any help would be much appreciated!
    m
    • 2
    • 6
  • g

    Giovanni Giacco

    02/01/2021, 1:06 PM
    Hello guys, we create flows to deal with big time series of GeoTIFF rasters (treated as xarrays). What do you suggest for passing massive data between tasks? Is it better to store those files temporarily on S3 or EFS, or to treat them as regular files and return them from functions?
    m
    • 2
    • 2
  • a

    Adrien Boutreau

    02/01/2021, 3:17 PM
    Hello guys! I'm using this example: https://docs.prefect.io/core/examples/functional_docker.html and I would like to know if it's possible to add a retry if the Python job fails. I saw how to do it with a task, but not with the CreateContainer function.
    j
    k
    • 3
    • 19
  • j

    Joël Luijmes

    02/01/2021, 3:27 PM
    I’m using a Context Manager, but how can I mock the behavior of Prefect such that I can only test the setup function? I tried
    from unittest.mock import MagicMock

    import pytest


    @pytest.fixture
    def mock_resource_manager(monkeypatch):
        mock = MagicMock(return_value=None)
        monkeypatch.setattr("prefect.tasks.core.resource_manager.ResourceManager.__call__", mock)
        return mock
    But this gives weird errors down the road, as I also want to mock prefect.config/context for other purposes.
    ✅ 1
    j
    • 2
    • 5
  • j

    Josh Greenhalgh

    02/01/2021, 4:16 PM
    Has anyone got any thoughts on testing the deployment process? I will be running my flows on a k8s cluster but would like devs to be able to test out their flows locally before submitting a PR, which will deploy the changed flow to the server. It's not completely clear to me what the best process is here, so it would be good to hear about other people's approaches.
    ➕ 1
    b
    m
    b
    • 4
    • 15
  • k

    Kilian

    02/01/2021, 4:43 PM
    Hi, I am looking into building pipelines that involve GPU work on such an irregular basis that autoscaling is a must. In my case, Dask is not really an option, as it interferes with the multiprocessing of PyTorch's DataLoader and was otherwise rather unstable for my GPU workload. To make it simpler, I would like to start a flow that I know needs a GPU on its own instance and do the processing there. The nodes would need to spin up or down depending on current demand. Any pointers on how best to achieve this? Currently, I see different possibilities, but I'm not sure which one is best:
    • Spin up a new Agent on a GPU node before the flow is scheduled and then use tags
    • Somehow use KubernetesRun to request a GPU and let Kubernetes handle the up- and downscaling
    • Only use Prefect to trigger an ECS job
    s
    h
    • 3
    • 14
  • m

    Morgan Omind

    02/01/2021, 4:58 PM
    Hi everyone, nice to join your community 🙂 We face an issue which seems to impact different Google Cloud Platform projects, while it should not. In the thread-attached GCP log, we can see that it refers to both the quetzal-omind and the cartographie-cps projects. We absolutely do not understand why! Our flow runs have been pretty unpredictable for some days: most of the time they simply do not start, often they fail in an HTTP request loop, and sometimes they run successfully. We suspect a wrong setup of our Prefect agent, but we are pretty new to Prefect development, so we are not at all sure where to look for our issues. Maybe someone here will have some clues to help us 🤞 Best!
    👋 1
    👀 1
    k
    f
    • 3
    • 17
{
  "insertId": "fl6tfmd7rpcqwqaig",
  "jsonPayload": {
    "_SYSTEMD_UNIT": "kubelet.service",
    "_SYSTEMD_SLICE": "system.slice",
    "_CAP_EFFECTIVE": "3fffffffff",
    "_HOSTNAME": "gke-iguazu-default-pool-f66089ee-s1lr",
    "SYSLOG_FACILITY": "3",
    "_PID": "1087",
    "_SYSTEMD_INVOCATION_ID": "3eebf72f9a3b413681b38725870a2629",
    "_BOOT_ID": "fa0576989dd2496b8c759f2a2e34514a",
    "_GID": "0",
    "_STREAM_ID": "d30fb7ed4a014871a2f8970efe80c5ad",
    "SYSLOG_IDENTIFIER": "kubelet",
    "PRIORITY": "6",
    "_TRANSPORT": "stdout",
    "_COMM": "kubelet",
    "MESSAGE": "E0201 16:20:44.424380    1087 pod_workers.go:191] Error syncing pod 171b3eee-639d-4da7-9deb-a155f1c87c4a (\"prefect-job-a2802066-vjmf9_default(171b3eee-639d-4da7-9deb-a155f1c87c4a)\"), skipping: failed to \"StartContainer\" for \"flow\" with ImagePullBackOff: \"Back-off pulling image \\\"eu.gcr.io/quetzal-omind/iguazu/dev/flows/bilan_vr:2020-09-17t18-14-03-371762-00-00\\\"\"",
    "_MACHINE_ID": "6f4f58b7702e59ad315033846174af48",
    "_SYSTEMD_CGROUP": "/system.slice/kubelet.service",
    "_EXE": "/home/kubernetes/bin/kubelet",
    "_UID": "0",
    "_CMDLINE": "/home/kubernetes/bin/kubelet --v=2 --cloud-provider=gce --experimental-check-node-capabilities-before-mount=true --experimental-mounter-path=/home/kubernetes/containerized_mounter/mounter --cert-dir=/var/lib/kubelet/pki/ --cni-bin-dir=/home/kubernetes/bin --kubeconfig=/var/lib/kubelet/kubeconfig --image-pull-progress-deadline=5m --experimental-kernel-memcg-notification=true --max-pods=110 --non-masquerade-cidr=0.0.0.0/0 --network-plugin=kubenet --node-labels=cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos --volume-plugin-dir=/home/kubernetes/flexvolume --bootstrap-kubeconfig=/var/lib/kubelet/bootstrap-kubeconfig --node-status-max-images=25 --registry-qps=10 --registry-burst=20 --config /home/kubernetes/kubelet-config.yaml --pod-sysctls=net.core.somaxconn=1024,net.ipv4.conf.all.accept_redirects=0,net.ipv4.conf.all.forwarding=1,net.ipv4.conf.all.route_localnet=1,net.ipv4.conf.default.forwarding=1,net.ipv4.ip_forward=1,net.ipv4.tcp_fin_timeout=60,net.ipv4.tcp_keepalive_intvl=75,net.ipv4.tcp_keepalive_probes=9,net.ipv4.tcp_keepalive_time=7200,net.ipv4.tcp_rmem=4096 87380 6291456,net.ipv4.tcp_syn_retries=6,net.ipv4.tcp_tw_reuse=0,net.ipv4.tcp_wmem=4096 16384 4194304,net.ipv4.udp_rmem_min=4096,net.ipv4.udp_wmem_min=4096,net.ipv6.conf.default.accept_ra=0,net.netfilter.nf_conntrack_generic_timeout=600,net.netfilter.nf_conntrack_tcp_timeout_close_wait=3600,net.netfilter.nf_conntrack_tcp_timeout_established=86400"
  },
  "resource": {
    "type": "k8s_node",
    "labels": {
      "cluster_name": "iguazu",
      "project_id": "cartographie-cps",
      "node_name": "gke-iguazu-default-pool-f66089ee-s1lr",
      "location": "europe-west4-a"
    }
  },
  "timestamp": "2021-02-01T16:20:44.424424Z",
  "labels": {
    "gke.googleapis.com/log_type": "system"
  },
  "logName": "projects/cartographie-cps/logs/kubelet",
  "receiveTimestamp": "2021-02-01T16:20:49.212486852Z"
}
more details perhaps! We deploy our prefect flows in docker images on gcp kubernetes clusters.
k

Kyle Moon-Wright

02/01/2021, 8:54 PM
Hey @Morgan Omind, this is a bit difficult to parse, but it looks like your Prefect project is "cartographie-cps", and the only reference to quetzal-omind I see is your storage method: containerized code retrieved from your GCR registry. Your Agent will pull the image from this registry URL at runtime for execution inside its environment. Keep in mind that your code storage method is independent from your Prefect Project directory in your tenant. I’m not sure how best to help here, but happy to answer any questions you may have to clarify further.
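To make the two project names in the log easier to tell apart, here is a minimal sketch that pulls the registry project out of a kubelet message and compares it with the project the cluster reports. The regular expression assumes a Container Registry style reference (`<host>/<project>/<path>`), and the function names are hypothetical.

```python
import re
from typing import Optional

# Assumed shape: an optional regional prefix, then gcr.io/<project>/...
GCR_IMAGE = re.compile(r"(?P<host>(?:[a-z]+\.)?gcr\.io)/(?P<project>[^/\s]+)/")


def image_project(message: str) -> Optional[str]:
    """Return the registry project named in a kubelet message, if any."""
    match = GCR_IMAGE.search(message)
    return match.group("project") if match else None


def pulls_from_other_project(message: str, cluster_project: str) -> bool:
    """True when the image lives in a different project than the cluster."""
    project = image_project(message)
    return project is not None and project != cluster_project
```

Applied to the log above, the image project is quetzal-omind while the cluster's `project_id` label is cartographie-cps, so the node needs pull access to a registry in another project.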
m

Morgan Omind

02/01/2021, 9:09 PM
Thx @Kyle Moon-Wright for your clarification and your time investigating our issue. It confirms what we suspected! So, I guess, my misunderstanding now is how to setup the prefect agent in order to link to the desired gcp registry according to a specific prefect project?
k

Kyle Moon-Wright

02/01/2021, 9:20 PM
Well first off, you should definitely be using a Kubernetes Agent on your cluster and ensure that you’ve added the necessary dependencies to your image for the Agent - in this case, we’ll likely need to have the:
pip install prefect[kubernetes]
and maybe
pip install prefect[google]
as optional dependencies as described here. Otherwise, your Agent doesn’t need access to a specified Prefect Project (which has very little bearing on execution) but it does need access to the GCP registry to pull the image, as well as other RBAC /network permissions on the GCP side to submit Jobs as containers from the image you pulled.
I’m definitely no K8s expert, but it is the most widely used Agent for production purposes, so there are definitely lots of docs and threads here on its usage.
m

Morgan Omind

02/02/2021, 9:51 AM
Ok, thanks for the hints. Sadly, I had already checked the RBAC permissions and followed the instructions to set them up accordingly. But I may have done it wrong, so I'm going to double-check this 😉 And of course we will keep looking for docs and community threads on this topic. However, what seems very weird to me is the randomness of our flow runs. As I explained, almost none of them succeed (sometimes one does, but we don't know why or how it differs from the other runs). Most of the time, either the run stays in a pending state and is relaunched by the Lazarus process until it reaches its maximum number of retries and fails, or it launches but, after the flow starts, enters an infinite loop of HTTP requests and eventually fails. Here is the log of this loop:
15:47:20 DEBUG urllib3.connectionpool Starting new HTTPS connection (1): api.prefect.io:443
15:47:20 DEBUG urllib3.connectionpool https://api.prefect.io:443 "POST /graphql HTTP/1.1" 200 45
This random behavior, and the fact that we cannot identify any pattern in it, makes debugging very frustrating and unpredictable. I guess this randomness is not normal or common behavior, so I wonder what we did wrong to get it?
Moreover, many logs in GCP seem very weird, so we may have configuration issues. One of the weirdest, in my opinion, is the following:
{
  "textPayload": "[2021-02-02 10:44:58,462] DEBUG - agent | No flow runs found\n",
  "insertId": "28isj4g32dpqfj",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "namespace_name": "default",
      "container_name": "k8s-agent",
      "pod_name": "prefect-1612261772-k8s-agent-76968b94dc-vrb48",
      "project_id": "cartographie-cps",
      "cluster_name": "iguazu",
      "location": "europe-west1-c"
    }
  },
  "timestamp": "2021-02-02T10:44:58.463191937Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "gke-iguazu-default-pool-4df57660-prlp",
    "k8s-pod/app": "prefect",
    "k8s-pod/app_kubernetes_io/version": "0.10.7",
    "k8s-pod/app_kubernetes_io/instance": "prefect-1612261772",
    "k8s-pod/pod-template-hash": "76968b94dc",
    "k8s-pod/app_kubernetes_io/component": "k8s-agent",
    "k8s-pod/app_kubernetes_io/managed-by": "Helm",
    "k8s-pod/helm_sh/chart": "prefect-0.1.0",
    "k8s-pod/app_kubernetes_io/name": "prefect"
  },
  "logName": "projects/cartographie-cps/logs/stdout",
  "receiveTimestamp": "2021-02-02T10:45:02.024152388Z"
}
do you have any idea on what could happen here to get such a 'no flow runs found' log?
here is the query log (sorry to post so many verbose logs !!)
{
  "textPayload": "[2021-02-02 10:48:19,205] DEBUG - agent | Querying for flow runs\n",
  "insertId": "jzrqvyg3j63ez5",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "project_id": "cartographie-cps",
      "container_name": "k8s-agent",
      "namespace_name": "default",
      "location": "europe-west1-c",
      "cluster_name": "iguazu",
      "pod_name": "prefect-1612261772-k8s-agent-76968b94dc-vrb48"
    }
  },
  "timestamp": "2021-02-02T10:48:19.208557067Z",
  "severity": "INFO",
  "labels": {
    "k8s-pod/helm_sh/chart": "prefect-0.1.0",
    "k8s-pod/pod-template-hash": "76968b94dc",
    "k8s-pod/app_kubernetes_io/component": "k8s-agent",
    "k8s-pod/app": "prefect",
    "compute.googleapis.com/resource_name": "gke-iguazu-default-pool-4df57660-prlp",
    "k8s-pod/app_kubernetes_io/name": "prefect",
    "k8s-pod/app_kubernetes_io/instance": "prefect-1612261772",
    "k8s-pod/app_kubernetes_io/managed-by": "Helm",
    "k8s-pod/app_kubernetes_io/version": "0.10.7"
  },
  "logName": "projects/cartographie-cps/logs/stdout",
  "receiveTimestamp": "2021-02-02T10:48:22.022280350Z"
}
f

Florent VanDeMoortele

02/02/2021, 5:35 PM
Hi @Kyle Moon-Wright, thank you for your help. I work with @Morgan Omind and am following up with our latest tests. We had 2 k8s agents (one on version 0.10.7 and the other on version 0.12.5). The 0.12.5 agent runs continuously, but the 0.10.7 one pauses. In the GCP logs, we noticed that our flows called the one on version 0.10.7. So we killed 0.10.7 and updated the k8s agent version in Helm's values.yml. After redeployment, we see that the flow finally calls the correct agent (v. 0.12.5). First good news! But now, when I start the flow, it remains pending for several minutes. I finally cancel it and relaunch it immediately. This time the flow runs perfectly. When I restart the flow again, it remains pending. I cancel it to restart it, and so on. This is weird (and seems random), but maybe I misunderstand something about instantiation and startup time... Do you have any idea?
k

Kyle Moon-Wright

02/02/2021, 6:50 PM
Hey @Florent VanDeMoortele, this is definitely a tough one, especially because I’m not too familiar with the Helm chart implementation of Prefect Server and troubleshooting there. However, given your last note, I would definitely recommend making sure all versions of both Prefect and Python are the same across the board, as mismatches can cause a multitude of issues, particularly the Prefect version, which is currently on 0.14.5. Every version increase presents new functionality/configurability, and this is quite important for the Agent, as Agents will not be able to run flows using a higher version of Prefect. This Agent misbehavior could be a result of these mismatches, but it’s hard to tell given it’s a Server implementation. Flows staying in pending definitely sounds networking related, and I recall something similar in earlier versions of Prefect, so an update could be fruitful. If this persists, however, it may be better triaged via a GitHub issue/discussion. Do note that I believe the Helm chart implementation was user-contributed, so our ability to help with very specific issues may be limited. Hope that makes sense!
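The rule mentioned in the message above (an Agent cannot run flows that use a higher Prefect version) can be checked before deploying. This is a rough sketch that assumes plain dotted numeric versions such as the ones quoted in this thread; it is not a Prefect API, and the function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.12.5' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def agent_can_run(agent_version: str, flow_version: str) -> bool:
    """True when the agent is at least as new as the flow's Prefect version."""
    return parse_version(agent_version) >= parse_version(flow_version)
```

Comparing integer tuples instead of raw strings avoids ordering mistakes such as "0.9.9" sorting after "0.10.7".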
f

Florent VanDeMoortele

02/03/2021, 9:56 AM
Thank you very much for the help and for your time @Kyle Moon-Wright. Indeed, it is a good track. We will try to redeploy all the flows and the project with the latest versions. And we'll see. How the agent works remains a mystery to us!