I am working on standing up Prefect 2.0 in a production environment, for internal data pipeline and reverse ETL uses, so no fire hazards on my end in adopting 2.0 early here.
Is there a general preference on YAML vs. code for the deployment specification? I noticed you can configure a flow deployment with YAML, but I can't find any information on the schema of that document. For example:
Assuming `interval` is in seconds? Can I specify another grain? Can `schedule` take a dict? If it takes cron, does that take a dict?
Honestly schedule is the primary question point. Everything else is straightforward enough.
Kevin Kho
04/14/2022, 2:33 AM
Hi @Alexander Butler, I'd need to check with the team tomorrow about this and get back to you.
Anna Geller
04/14/2022, 9:43 AM
Good choice starting with 2.0 directly!
I'm more biased towards defining it in Python, but YAML is also supported. Here is one example of using YAML:
The Python definition is much cleaner and easier to understand/change, but YAML is also fine.
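As a rough illustration of the kind of spec being discussed, a deployment YAML might look something like the sketch below. The field names here are assumptions for illustration, not the verified Prefect 2.0 schema:

```yaml
# Illustrative sketch only -- field names are assumptions, not the verified schema.
name: my-deployment
flow_location: ./my_flow.py
schedule:
  interval: 3600   # presumably seconds, per the discussion later in this thread
```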
Alexander Butler
04/14/2022, 2:51 PM
I like Python too. I think the ambiguous bit is whether `schedule` supports cron or different kwargs for interval?
Alexander Butler
04/14/2022, 2:51 PM
or a different time grain
Alexander Butler
04/14/2022, 2:52 PM
in yaml
Zanie
04/14/2022, 3:12 PM
The YAML is loaded using Pydantic models, which infer the type based on the keys
Zanie
04/14/2022, 3:13 PM
So if you did `cron: string-here` instead of `interval: integer`, it'd be loaded as a cron schedule
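The key-based inference Zanie describes can be sketched in plain Python as follows. The class names below are illustrative stand-ins, not Prefect's actual schedule models:

```python
from datetime import timedelta

# Stand-in classes for illustration only; Prefect's real schedule models
# are more featureful (timezones, anchor dates, etc.).
class IntervalSchedule:
    def __init__(self, interval: timedelta):
        self.interval = interval

class CronSchedule:
    def __init__(self, cron: str):
        self.cron = cron

def infer_schedule(spec: dict):
    """Pick a schedule type from which keys are present, mimicking how a
    union of Pydantic models would resolve the YAML mapping."""
    if "cron" in spec:
        return CronSchedule(cron=spec["cron"])
    if "interval" in spec:
        # A bare number is interpreted as seconds.
        return IntervalSchedule(interval=timedelta(seconds=spec["interval"]))
    raise ValueError(f"unrecognized schedule keys: {sorted(spec)}")

print(type(infer_schedule({"cron": "0 0 * * *"})).__name__)  # CronSchedule
print(type(infer_schedule({"interval": 3600})).__name__)     # IntervalSchedule
```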
Zanie
04/14/2022, 3:16 PM
From the Pydantic documentation, you can provide richer strings for intervals other than seconds
Zanie
04/14/2022, 3:16 PM
```
timedelta fields can be:
  timedelta, an existing timedelta object
  int or float, assumed as seconds
  str, following formats work:
    [-][DD ][HH:MM]SS[.ffffff]
    [±]P[DD]DT[HH]H[MM]M[SS]S (ISO 8601 format for timedelta)
```
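To make the first of those string forms concrete, here is a toy parser for just the `[-][DD ][HH:MM]SS[.ffffff]` shape. This is an illustration of the format, not Pydantic's actual implementation (which also handles the ISO 8601 `P...T...` form):

```python
import re
from datetime import timedelta

# Illustrative-only parser for the "[-][DD ][HH:MM]SS[.ffffff]" shape.
_TD_RE = re.compile(
    r"^(?P<sign>-)?"                         # optional leading minus
    r"(?:(?P<days>\d+) )?"                   # optional "DD " prefix
    r"(?:(?P<hours>\d+):(?P<minutes>\d+):)?" # optional "HH:MM:" prefix
    r"(?P<seconds>\d+(?:\.\d{1,6})?)$"       # seconds, up to 6 fraction digits
)

def parse_timedelta(value: str) -> timedelta:
    m = _TD_RE.match(value)
    if m is None:
        raise ValueError(f"not a timedelta string: {value!r}")
    td = timedelta(
        days=int(m["days"] or 0),
        hours=int(m["hours"] or 0),
        minutes=int(m["minutes"] or 0),
        seconds=float(m["seconds"]),
    )
    return -td if m["sign"] else td

print(parse_timedelta("30"))          # a bare number is seconds: 0:00:30
print(parse_timedelta("1 02:30:00"))  # 1 day, 2:30:00
```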