Hey, I am using Prefect Cloud on version `0.13.18...
# ask-community
k
Hey, I am using Prefect Cloud on version `0.13.18`. Has anyone had this before -- all of a sudden a large row of cancelled flows appears in the UI? (These flows run using the CronSchedule.)
n
Hi @Kieran - do you have any more info you could provide? Those flow runs likely have state messages for why they were cancelled.
k
Hi @nicholas, I am a bit lost. As I look into this, none of these flows were triggered by me or the CronSchedule, and none of them were cancelled by me either. This seems to have impacted all of our flows. There are no logs either. Here is a screenshot of the last hour (for context, there are no scheduled runs in the last hour).
n
Sorry you're running into this @Kieran - do you have any logs associated with those runs that you can share?
An ID for one of the flows that was impacted would work as well 🙂
k
@nicholas there are no logs returned but here is the Flow ID
Flow 8dda1956-8cb0-4d7a-ad67-477d5da757ad was archived
n
Aha! When a new version of a flow is registered, the old version is archived and any manually-scheduled flow runs will be put in a `Cancelled` state. Is it possible you or someone on your team registered new versions of a bunch of your flows?
k
Yeah, I believe our CI/CD registers our flows on merging into master, but we merge multiple times a day and only manually trigger a small number of flows (<5). This behaviour of suddenly spawning loads of runs and immediately cancelling them is new (for context, we have had this running for a few weeks now).
n
Gotcha - for reference this is the PR that introduced that behavior: https://github.com/PrefectHQ/server/pull/185
Previously, registering a new version of a flow would automatically delete all flow runs in a `Scheduled` state, which is why you probably didn't notice this before
k
I see @nicholas. I guess the way I have put our CI/CD together with Prefect is slightly wrong. On merge into master we register every flow, so as to capture any new flows, using `flow.register()` and tagging it as prod if it is against master. This is called at the bottom of the flow files inside a function which checks is_serialisable(). I'm guessing from this thread that we should register once and never again? If so, to build and push the image (we use Docker storage and push to ECR), do we just call `python foo.py` on the flow file without this register step?
Also, is it possible to fix the version of Prefect Cloud being used in the scheduler?
n
Hi @Kieran - not quite; you'll want to re-register your flows anytime the metadata of the flow changes, e.g. parameters, tasks, dependencies, schedules (set in code, not the UI), etc.
To keep your current setup, you can pass the result of `flow.serialize()` to the `idempotency_key` argument to `flow.register()`, which will create a string representation of your flow. If the serialized representation of a flow matches the existing idempotency key, a new version of the flow won't be registered.
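The idea behind an idempotency key can be sketched with a toy registry. This is only an illustration of the mechanism described above, not Prefect's implementation: `ToyRegistry` and `definition_key` are hypothetical names, and the flow definition is a plain dictionary standing in for a real flow.

```python
import hashlib
import json


class ToyRegistry:
    """Toy stand-in for a flow registry; NOT Prefect's implementation."""

    def __init__(self):
        # flow name -> (current version, idempotency key of that version)
        self._flows = {}

    def register(self, name, idempotency_key):
        version, last_key = self._flows.get(name, (0, None))
        if idempotency_key == last_key:
            # Same key as the current version: registration is a no-op.
            return version
        # First registration, or the key changed: create a new version.
        self._flows[name] = (version + 1, idempotency_key)
        return version + 1


def definition_key(definition):
    """Hash a JSON-serializable flow description into a stable string key."""
    payload = json.dumps(definition, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


registry = ToyRegistry()
flow_def = {"name": "customers", "tasks": ["extract", "load"]}

first = registry.register("customers", definition_key(flow_def))
second = registry.register("customers", definition_key(flow_def))  # unchanged

flow_def["tasks"].append("report")  # the definition changes
third = registry.register("customers", definition_key(flow_def))
```

In this toy model, re-registering an unchanged definition leaves the version alone, so nothing would be archived; only a real change produces a new version.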
k
Thanks @nicholas, I will try adding that, as at the minute the UI is pretty unusable with all of these cancelled flows!
n
@Kieran can you explain a bit more? Is it unusable because you can't find the data you need?
k
Sure @nicholas. At the minute we are continually developing our flows and tasks as we tweak and add new things and build more flows, so there are multiple merges into master. At the moment, every time we merge, the UI is inundated with 10x grey bars that reflect cancelled future flows. A lot of the UI not only looks really nice (and now it doesn't) but also returns some basic stats on the success of runs, and that data is now dirtied by all of these cancelled runs. This is not the behaviour I expected, though your suggestion above may remove all of these cancelled flows. Secondly, I don't want the underlying behaviour to change when Prefect rolls out a new version; ideally it would be pinned just like every other package we use in the project.
n
Gotcha - are you also scheduling runs manually as part of your CI process?
k
i.e.
n
As to the question about pinning the version of Cloud (sorry, I missed that before) - at this time there's no way to pin the version of Cloud, since Core mechanics are completely managed by the version of Prefect (Core) attached to a given flow.
k
No, the CI just registers the flows, which builds and publishes the images. The flows (apart from 1 new one which uses parameters) are all controlled by cron schedules.
n
Hm, that sounds like a bug then, let me check with the team to see if there's anything I'm missing; the only runs that should be put in `Cancelled` states are those that aren't created by the Prefect Scheduler
k
Thanks @nicholas
n
Hi @Kieran - whenever you get a chance, could you confirm that your change to the CI script reduces or eliminates the presence of these cancelled runs? I believe we've pushed a fix for this in Cloud in the last few days but I want to be sure this is the same issue.
k
@nicholas would I pass it like this:
    flow.register(
        project_name=project_name,
        labels=[label],
        add_default_labels=False,
        idempotency_key=flow.serialize()
    )
I get this from CI:
prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: <https://api.prefect.io/graphql>

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value { name: "customers",
        type: "prefect.core.flow.Flow", schedule: { clocks: [Array], or_filters: [],
        adjustments: [], not_filters: [], filters: [], __version__: "0.14.11", type:
        "Schedule" }, parameters: [], tasks: [[Object], [Object], [Object], [Object],
        [Object], [Object], [Object], [Object], [Object], [Object], ... 4 more items],
        edges: [[Object], [Object], [Object], [Object], [Object], [Object], [Object],
        [Object], [Object], [Object], ... 5 more items], reference_tasks: [],
        environment: null, run_config: { task_role_arn: null, run_task_kwargs: null,
        task_definition: null, cpu: "512", execution_role_arn: null, labels: [], image:
        null, task_definition_path: null, env: null, memory: "1024",
        task_definition_arn: null, __version__: "0.14.11", type: "ECSRun" },
        __version__: "0.14.11", storage: { stored_as_script: false, image_tag: null,
        path: null, prefect_version: "0.14.11", registry_url: "************", secrets: [], flows: {}, image_name: null,
        __version__: "0.14.11", type: "Docker" } } at "input.idempotency_key"; Expected
        type String. String cannot represent a non string value: { name: "customers",
        type: "prefect.core.flow.Flow", schedule: { clocks: [Array], or_filters: [],
        adjustments: [], not_filters: [], filters: [], __version__: "0.14.11", type:
        "Schedule" }, parameters: [], tasks: [[Object], [Object], [Object], [Object],
        [Object], [Object], [Object], [Object], [Object], [Object], ... 4 more items],
        edges: [[Object], [Object], [Object], [Object], [Object], [Object], [Object],
        [Object], [Object], [Object], ... 5 more items], reference_tasks: [],
        environment: null, run_config: { task_role_arn: null, run_task_kwargs: null,
        task_definition: null, cpu: "512", execution_role_arn: null, labels: [], image:
        null, task_definition_path: null, env: null, memory: "1024",
        task_definition_arn: null, __version__: "0.14.11", type: "ECSRun" },
        __version__: "0.14.11", storage: { stored_as_script: false, image_tag: null,
        path: null, prefect_version: "0.14.11", registry_url: "************", secrets: [], flows: {}, image_name: null,
        __version__: "0.14.11", type: "Docker" } }

The GraphQL query was:

    mutation($input: create_flow_from_compressed_string_input!) {
            create_flow_from_compressed_string(input: $input) {
                id
        }
    }
n
Ah I'm sorry @Kieran, that should be `flow.serialized_hash()` passed to the idempotency key
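Put together with the arguments from the CI snippet earlier in the thread, the corrected call might look like the sketch below. The helper name `register_if_changed` is hypothetical; `flow` is assumed to be a `prefect.Flow` from a Prefect version that provides `serialized_hash()`.

```python
def register_if_changed(flow, project_name, labels):
    """Register `flow`, letting the backend skip unchanged definitions.

    Hypothetical helper: `flow` is assumed to be a prefect.Flow exposing
    register() and serialized_hash(), as discussed in the thread.
    """
    return flow.register(
        project_name=project_name,
        labels=labels,
        add_default_labels=False,
        # serialized_hash() returns a string, which is the type the
        # idempotency_key field expects (flow.serialize() returns a dict).
        idempotency_key=flow.serialized_hash(),
    )
```

Calling this at the bottom of each flow file, in place of the bare `flow.register(...)`, keeps the "register everything on merge" CI setup while leaving unchanged flows alone.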
k
No worries @nicholas, yes, adding that argument stops the re-registering and the cancelled flows from appearing -- thanks for that!
n
Awesome! Thanks for getting back to me 🙂
k
Hey @nicholas, I wanted to flag something around the cancelled flows issue. The UI Run History visual is swamped by cancelled flows and isn't showing recent successful ones. For example: the cancelled flows in this visual are from yesterday 5pm, when a bunch of new flows were registered with actual changes to their tasks. Since yesterday 5pm there have been a bunch of successful flows which I would expect to see here. This visual is now misleading ... any thoughts?
n
Hm yes - I believe those are ordered by when they were intended to start, which means runs that were intended to run in the future but were cancelled end up taking precedence over more recent runs; since it sounds like those cancelled runs are in no way useful to you, I'd suggest using the Python GraphQL client to fetch all those runs whose states are "Cancelled" and delete them with the `delete_flow_run` mutation
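A rough sketch of that cleanup follows. Several details are assumptions rather than something stated in the thread: the query filter shape, the `delete_flow_run_input` type name, and the `success` field should be checked against the GraphQL schema in the Interactive API before use. `client` is assumed to behave like `prefect.Client`, whose `graphql()` method accepts a query string and a `variables` mapping and returns a mapping with a `data` key.

```python
# Assumed query shape: flow runs filtered by their current state.
CANCELLED_RUNS_QUERY = """
query {
  flow_run(where: {state: {_eq: "Cancelled"}}) {
    id
  }
}
"""

# Assumed input type name and response field for the mutation.
DELETE_RUN_MUTATION = """
mutation($input: delete_flow_run_input!) {
  delete_flow_run(input: $input) {
    success
  }
}
"""


def delete_cancelled_runs(client):
    """Delete every flow run returned as "Cancelled"; return the deleted ids."""
    result = client.graphql(CANCELLED_RUNS_QUERY)
    deleted = []
    for run in result["data"]["flow_run"]:
        response = client.graphql(
            DELETE_RUN_MUTATION,
            variables={"input": {"flow_run_id": run["id"]}},
        )
        # Only count runs the server reports as actually deleted.
        if response["data"]["delete_flow_run"]["success"]:
            deleted.append(run["id"])
    return deleted
```

With Prefect installed, this would be called as `delete_cancelled_runs(prefect.Client())`. Note that it removes every cancelled run it finds, including any that were cancelled on purpose, and the API may cap how many rows a single query returns, so it might need to be run more than once.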
k
Is this an issue I should raise on GitHub @nicholas? Because creating a Flow to periodically make that GraphQL request to clear cancelled Flows seems a bit cyclical..
n
I agree, it's not particularly ideal; there's a sort of interesting problem here where there are times you want to see that data because it's definitely relevant. We have plans to make this graph more of a time-window (for example the last day) instead of the last 100 runs, which I think would solve your use case
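A toy model shows why the two selection rules differ. It is not the UI's code: the run records, the timestamps, and both function names are made up for illustration.

```python
from datetime import datetime, timedelta


def last_n_by_scheduled_start(runs, n):
    """Pick the n runs with the latest scheduled start time."""
    return sorted(runs, key=lambda r: r["scheduled_start"], reverse=True)[:n]


def within_window(runs, now, window):
    """Pick runs whose scheduled start falls in [now - window, now]."""
    return [r for r in runs if now - window <= r["scheduled_start"] <= now]


now = datetime(2021, 3, 2, 12, 0)  # arbitrary reference time

# Three runs that already succeeded, one per hour before `now`.
recent = [
    {"state": "Success", "scheduled_start": now - timedelta(hours=h)}
    for h in (1, 2, 3)
]
# Three future runs, cancelled when a new flow version was registered.
cancelled = [
    {"state": "Cancelled", "scheduled_start": now + timedelta(hours=h)}
    for h in (1, 2, 3)
]
runs = recent + cancelled

by_count = [r["state"] for r in last_n_by_scheduled_start(runs, 3)]
by_window = [r["state"] for r in within_window(runs, now, timedelta(days=1))]
```

In this model the "last N by scheduled start" view is filled entirely by the future cancelled runs, while the one-day window ending now shows only the runs that actually happened.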
k
That would solve it. Or at least allow the UI to filter out certain types. Thanks, I will raise a GitHub issue tomorrow about that.
n
Thank you @Kieran 🙂