Hey I am using Prefect Cloud on version `0 13 18` Has anyone Prefect Community #ask-community

Kieran

03/05/2021, 2:45 PM

Hey, I am using Prefect Cloud on version `0.13.18`. Has anyone had this before -- all of a sudden a large row of cancelled flows appears in the UI? (These flows run using the CronSchedule.)

nicholas

03/05/2021, 2:56 PM

Hi @Kieran - do you have any more info you could provide? Those flow runs likely have state messages for why they were cancelled.

Kieran

03/05/2021, 3:00 PM

Hi @nicholas, I am a bit lost, as I look into this, none of these flows were triggered by me or the cronSchedule and none of them were cancelled by me either. This seems to have impacted all of our flows. There are no logs either. Here is a screenshot of the last hour (for context there are no scheduled runs in the last hour)

nicholas

03/05/2021, 3:07 PM

Sorry you're running into this @Kieran - do you have any logs associated with those runs that you can share?

nicholas

03/05/2021, 3:08 PM

An ID for one of the flows that was impacted would work as well 🙂

Kieran

03/05/2021, 3:13 PM

@nicholas there are no logs returned but here is the Flow ID

Flow 8dda1956-8cb0-4d7a-ad67-477d5da757ad was archived

nicholas

03/05/2021, 3:14 PM

Aha! When a new version of a flow is registered, the old version is archived and any manually-scheduled flow runs will be put in a `Cancelled` state. Is it possible you or someone on your team registered new versions of a bunch of your flows?

Kieran

03/05/2021, 4:25 PM

Yeah, I believe our CI/CD registers our flows on merging in to master but we merge multiple times a day and will only manually trigger a small number of flows (<5). This behaviour of suddenly spawning loads of runs and immediately cancelling them is new (for context we have had this running for a few weeks now).

nicholas

03/05/2021, 4:27 PM

Gotcha - for reference this is the PR that introduced that behavior: https://github.com/PrefectHQ/server/pull/185

nicholas

03/05/2021, 4:27 PM

Previously, registering a new version of a flow would automatically delete all flow runs in a `Scheduled` state, which is why you probably didn't notice this before

Kieran

03/07/2021, 9:59 AM

I see @nicholas. I guess the way I have put our CI/CD together with Prefect is slightly wrong. On merge into master we register every flow, so as to capture any new flows, using flow.register() and tagging it as prod if it is against master. This is called at the bottom of the flow files inside a function which checks is_serialisable(). I'm guessing from this thread that we should register once and never again? If so, to build and push the image (we use Docker storage and push to ECR), do we just call python foo.py on the flow file without this register step?

Kieran

03/07/2021, 9:59 AM

Also, is it possible to fix the version of Prefect Cloud being used in the scheduler?

nicholas

03/08/2021, 2:48 PM

Hi @Kieran - not quite; you'll want to re-register your flows anytime the metadata of the flow changes, e.g. parameters, tasks, dependencies, schedules (set in code, not the UI), etc.

nicholas

03/08/2021, 2:51 PM

To keep your current setup, you can pass the result of `flow.serialize()` to the `idempotency_key` argument to `flow.register()`, which will create a string representation of your flow. If the serialized representation of a flow matches the existing idempotency key, a new version of the flow won't be registered.

Kieran

03/08/2021, 7:54 PM

Thanks @nicholas I will try adding that as at the minute the UI is pretty unusable with all of these cancelled flows?!

nicholas

03/08/2021, 7:54 PM

@Kieran can you explain a bit more? Is it unusable because you can't find the data you need?

Kieran

03/08/2021, 7:59 PM

Sure @nicholas. At the minute we are continually developing our flows and tasks as we tweak and add new things and build more flows, so there are multiple merges into master. At the moment, every time we merge, the UI is inundated with 10x grey bars reflecting cancelled future flows. A lot of the UI not only looks really nice (and now it doesn't) but also returns some basic stats on the success of runs -- now this data is dirtied with all of these cancelled runs. It's not the behaviour I expected, though your suggestion above may remove all of these cancelled flows. Secondly, I don't want the underlying behaviour to change when Prefect rolls out a new version -- ideally it would be pinned just like every other package we use in the project.

nicholas

03/08/2021, 8:01 PM

Gotcha - are you also scheduling runs manually as part of your CI process?

Kieran

03/08/2021, 8:03 PM

i.e.

nicholas

03/08/2021, 8:03 PM

As to the question about pinning the version of Cloud (sorry, I missed that before) - at this time there's no way to pin the version of Cloud, since Core mechanics are completely managed by the version of Prefect (Core) attached to a given flow.

✅ 1

Kieran

03/08/2021, 8:04 PM

No, the CI is just to register the flows, which builds and publishes the images. The flows (apart from 1 new one which uses parameters) are all controlled by cron schedules.

nicholas

03/08/2021, 8:05 PM

Hm, that sounds like a bug then, let me check with the team to see if there's anything I'm missing; the only runs that should be put in `Cancelled` states are those that aren't created by the Prefect Scheduler

Kieran

03/08/2021, 8:08 PM

Thanks @nicholas

nicholas

03/08/2021, 11:16 PM

Hi @Kieran - whenever you get a chance, could you confirm that your change to the CI script reduces or eliminates the presence of these cancelled runs? I believe we've pushed a fix for this in Cloud in the last few days but I want to be sure this is the same issue.

Kieran

03/09/2021, 9:28 AM

@nicholas would I pass it like this:

flow.register(
    project_name=project_name,
    labels=[label],
    add_default_labels=False,
    idempotency_key=flow.serialize()
)

Kieran

03/09/2021, 9:30 AM

I get this from CI:

prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: https://api.prefect.io/graphql

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value { name: "customers",
        type: "prefect.core.flow.Flow", schedule: { clocks: [Array], or_filters: [],
        adjustments: [], not_filters: [], filters: [], __version__: "0.14.11", type:
        "Schedule" }, parameters: [], tasks: [[Object], [Object], [Object], [Object],
        [Object], [Object], [Object], [Object], [Object], [Object], ... 4 more items],
        edges: [[Object], [Object], [Object], [Object], [Object], [Object], [Object],
        [Object], [Object], [Object], ... 5 more items], reference_tasks: [],
        environment: null, run_config: { task_role_arn: null, run_task_kwargs: null,
        task_definition: null, cpu: "512", execution_role_arn: null, labels: [], image:
        null, task_definition_path: null, env: null, memory: "1024",
        task_definition_arn: null, __version__: "0.14.11", type: "ECSRun" },
        __version__: "0.14.11", storage: { stored_as_script: false, image_tag: null,
        path: null, prefect_version: "0.14.11", registry_url: "************", secrets: [], flows: {}, image_name: null,
        __version__: "0.14.11", type: "Docker" } } at "input.idempotency_key"; Expected
        type String. String cannot represent a non string value: { name: "customers",
        type: "prefect.core.flow.Flow", schedule: { clocks: [Array], or_filters: [],
        adjustments: [], not_filters: [], filters: [], __version__: "0.14.11", type:
        "Schedule" }, parameters: [], tasks: [[Object], [Object], [Object], [Object],
        [Object], [Object], [Object], [Object], [Object], [Object], ... 4 more items],
        edges: [[Object], [Object], [Object], [Object], [Object], [Object], [Object],
        [Object], [Object], [Object], ... 5 more items], reference_tasks: [],
        environment: null, run_config: { task_role_arn: null, run_task_kwargs: null,
        task_definition: null, cpu: "512", execution_role_arn: null, labels: [], image:
        null, task_definition_path: null, env: null, memory: "1024",
        task_definition_arn: null, __version__: "0.14.11", type: "ECSRun" },
        __version__: "0.14.11", storage: { stored_as_script: false, image_tag: null,
        path: null, prefect_version: "0.14.11", registry_url: "************", secrets: [], flows: {}, image_name: null,
        __version__: "0.14.11", type: "Docker" } }

The GraphQL query was:

    mutation($input: create_flow_from_compressed_string_input!) {
            create_flow_from_compressed_string(input: $input) {
                id
        }
    }

nicholas

03/09/2021, 4:16 PM

Ah I'm sorry @Kieran, that should be `flow.serialized_hash()` passed to the idempotency key
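The error above comes from `flow.serialize()` returning a dictionary where the GraphQL server expects a String, whereas `flow.serialized_hash()` returns a string. The sketch below illustrates the general idea of an idempotency key using only the standard library. `stable_hash`, `FakeRegistry`, and the flow dictionaries are hypothetical stand-ins for illustration, not Prefect's implementation.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def stable_hash(flow_metadata: dict) -> str:
    """Hash a JSON-serialisable description of a flow into a hex string."""
    # sort_keys makes the hash independent of dictionary key order
    payload = json.dumps(flow_metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class FakeRegistry:
    """Hypothetical in-memory registry: an unchanged key does not bump the version."""

    def __init__(self) -> None:
        # flow name -> (current version, last idempotency key)
        self._versions: Dict[str, Tuple[int, Optional[str]]] = {}

    def register(self, name: str, idempotency_key: str) -> int:
        if not isinstance(idempotency_key, str):
            # Mirrors the "Expected type String" failure seen in the CI log
            raise TypeError("idempotency_key must be a string")
        version, last_key = self._versions.get(name, (0, None))
        if idempotency_key != last_key:
            version += 1
            self._versions[name] = (version, idempotency_key)
        return version


registry = FakeRegistry()
flow_v1 = {"name": "customers", "tasks": ["extract", "load"]}
flow_v2 = {"name": "customers", "tasks": ["extract", "transform", "load"]}

first = registry.register("customers", stable_hash(flow_v1))    # new flow
repeat = registry.register("customers", stable_hash(flow_v1))   # same key, no new version
changed = registry.register("customers", stable_hash(flow_v2))  # key changed, new version
```

In the CI script from this thread, the equivalent change is to pass `idempotency_key=flow.serialized_hash()` to `flow.register()`.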

Kieran

03/09/2021, 6:04 PM

No worries @nicholas, yes, adding that variable avoids the re-registering and stops the cancelled flows from appearing -- thanks for that!

nicholas

03/09/2021, 6:06 PM

Awesome! Thanks for getting back to me 🙂

Kieran

03/11/2021, 9:57 AM

Hey @nicholas, I wanted to flag something around the cancelled flows issue. The UI Run History visual is swamped by cancelled flows and isn't showing recent successful ones. For example, the cancelled flows in this visual are from yesterday at 5pm, when a bunch of new flows were registered with actual changes to their tasks. Since then there have been a bunch of successful flows, which I would expect to see here. This visual is now misleading ... any thoughts?

nicholas

03/11/2021, 3:47 PM

Hm yes - I believe those are ordered by when they were intended to start, which means runs that were intended to run in the future but were cancelled end up taking precedence over more recent runs; since it sounds like those cancelled runs are in no way useful to you, I'd suggest using the Python GraphQL client to fetch all those runs whose states are "Cancelled" and delete them with the `delete_flow_run` mutation
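A minimal sketch of that clean-up, under stated assumptions: the query filter, the `delete_flow_run_input` type, the `flow_run_id` field, and the `success` payload are guesses at the schema and should be checked against your API before use. The function takes any callable that executes GraphQL and returns a dict-like result, so it can be tried without a live backend; `delete_cancelled_runs` is a hypothetical helper name.

```python
from typing import Any, Callable, Dict, List

# Assumed query shape (Hasura-style filter); verify against your API schema.
CANCELLED_RUNS_QUERY = """
query {
  flow_run(where: {state: {_eq: "Cancelled"}}) {
    id
  }
}
"""

# Assumed input and payload shape for the delete_flow_run mutation.
DELETE_RUN_MUTATION = """
mutation($input: delete_flow_run_input!) {
  delete_flow_run(input: $input) {
    success
  }
}
"""

GraphQLCall = Callable[..., Dict[str, Any]]


def delete_cancelled_runs(graphql: GraphQLCall) -> List[str]:
    """Fetch the IDs of all Cancelled flow runs, delete each one, and return the IDs."""
    result = graphql(CANCELLED_RUNS_QUERY)
    run_ids = [run["id"] for run in result["data"]["flow_run"]]
    for run_id in run_ids:
        # One mutation per run keeps a failure isolated to a single ID
        graphql(DELETE_RUN_MUTATION, variables={"input": {"flow_run_id": run_id}})
    return run_ids
```

With Prefect installed and authenticated, passing `prefect.Client().graphql` as the callable should fit this signature, assuming its result supports dictionary-style access.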

Kieran

03/11/2021, 5:20 PM

Is this an issue I should raise on GitHub @nicholas? Because creating a Flow to periodically make that GraphQL request to clear cancelled Flows seems a bit cyclical..

nicholas

03/11/2021, 5:28 PM

I agree, it's not particularly ideal; there's a sort of interesting problem here where there are times you want to see that data because it's definitely relevant. We have plans to make this graph more of a time-window (for example the last day) instead of the last 100 runs, which I think would solve your use case
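To illustrate the difference between the two selections, here is a small sketch with made-up data. `Run`, `last_n_runs`, and `runs_in_window` are hypothetical names, and the sketch assumes (as suggested earlier in the thread) that runs are ordered by their intended start time; it is not the UI's actual query.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Run(NamedTuple):
    state: str
    scheduled_start: datetime


def last_n_runs(runs: List[Run], n: int) -> List[Run]:
    """Keep the n runs with the latest scheduled start time."""
    return sorted(runs, key=lambda r: r.scheduled_start)[-n:]


def runs_in_window(runs: List[Run], now: datetime, window: timedelta) -> List[Run]:
    """Keep only runs whose scheduled start falls within [now - window, now]."""
    return [r for r in runs if now - window <= r.scheduled_start <= now]


now = datetime(2021, 3, 11, 12, 0)
# Three runs that succeeded over the past few hours
recent = [Run("Success", now - timedelta(hours=h)) for h in (1, 2, 3)]
# Five runs that were scheduled for the future and then cancelled
cancelled = [Run("Cancelled", now + timedelta(hours=h)) for h in range(1, 6)]
all_runs = recent + cancelled

by_count = last_n_runs(all_runs, n=5)  # the future-dated cancelled runs fill every slot
by_window = runs_in_window(all_runs, now, timedelta(days=1))  # only runs from the last day
```

Because the cancelled runs carry future start times, a "last N" selection fills up with them, while a window that ends at the current time leaves them out.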

Kieran

03/11/2021, 5:52 PM

That would solve it. Or at least allow the UI to filter out certain types. Thanks, I will raise a github issue tomorrow about that.

nicholas

03/11/2021, 5:56 PM

Thank you @Kieran 🙂
