jake lee
10/12/2021, 2:19 AM
log_stdout=True, name="<task name>", state_handlers=[state_handler])
and wanted to have common parameters set up so I can just call them. Is there any sample where such a thing was done? Or is there any way I can set up common parameters to be used across my company's flows?
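A minimal sketch of one way to share common task keyword arguments across flows, assuming Prefect 1.x; the module name, shared_task_kwargs, and my_state_handler are hypothetical:

from functools import partial

from prefect import task

# Hypothetical shared module, e.g. company_defaults.py, imported by every flow.
def my_state_handler(task_obj, old_state, new_state):
    return new_state

# Common keyword arguments applied to every task.
shared_task_kwargs = dict(
    log_stdout=True,
    state_handlers=[my_state_handler],
)

# A reusable decorator that bakes in the shared defaults.
company_task = partial(task, **shared_task_kwargs)

@company_task(name="my-task")
def my_task():
    print("runs with the shared defaults")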
Ryan Sattler
10/12/2021, 5:35 AM

Darren Fleetwood
10/12/2021, 7:26 AM

Thomas Furmston
10/12/2021, 8:16 AM

Thomas Furmston
10/12/2021, 8:17 AM

Thomas Furmston
10/12/2021, 8:17 AM
DockerRun class and using LocalStorage. I had this running on a DockerAgent just fine.

Thomas Furmston
10/12/2021, 8:17 AM
DockerStorage, which again I got to work. I was just wondering, though: does the use of the DockerRun class do anything when I am using DockerStorage? At first glance it would seem not, as the use of DockerStorage would imply the run would happen on Docker anyway. Am I missing something?
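For reference, a minimal sketch of the two pieces side by side, assuming a recent Prefect 1.x layout where Docker storage is the prefect.storage.Docker class (registry/image names are made up). Storage bakes the flow into an image; the run config tells the agent how to launch the container:

from prefect import Flow
from prefect.run_configs import DockerRun
from prefect.storage import Docker

with Flow("docker-example") as flow:
    ...

# Storage: the flow itself is packaged into an image at registration time.
flow.storage = Docker(registry_url="registry.example.com", image_name="my-flow")

# Run config: options for the container the Docker agent starts, e.g. env vars.
flow.run_config = DockerRun(env={"LOG_LEVEL": "DEBUG"})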
Thomas Furmston
10/12/2021, 8:23 AM

Thomas Furmston
10/12/2021, 8:23 AM

张强
10/12/2021, 11:16 AM
subflows in the parent flow?
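If the question is how to trigger subflows from a parent flow, a minimal sketch using Prefect 1.x's StartFlowRun task; the flow and project names here are placeholders:

from prefect import Flow
from prefect.tasks.prefect import StartFlowRun

# Each StartFlowRun task kicks off an already-registered flow as a subflow.
start_child = StartFlowRun(flow_name="child-flow",
                           project_name="my-project",
                           wait=True)  # wait=True blocks until the subflow finishes

with Flow("parent-flow") as parent_flow:
    child_run = start_child()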
Ruslan Aliev
10/12/2021, 1:25 PM

Tony Yun
10/12/2021, 4:08 PM

chicago-joe
10/12/2021, 4:28 PM
ValueError: Flow could not be deserialized successfully. Error was: ValidationError({'schedule': {'clocks': {0: {'parameter_defaults': defaultdict(<class 'dict'>, {'accnt': {'value': ['Field may not be null.']}, ....
std_inputs = {
    'report': False,
    'accnt': None,                  # string - delimited list
    'strategy': None,               # string - delimited list
    'institutionalAdvisor': None,   # string - delimited list
    'endDate': None,                # string - 'YYYY-MM-DD'
    'shadow': None,                 # SPY, treasury
    'benchmark': None,              # SPY, PUT, BXM
    'fees': False,                  # Bool
    'output': 'daily'               # monthly, daily
}
with Flow('Update Performance tables',
          schedule=Schedule(
              clocks=[
                  CronClock("7 6 * * 1-5",
                            start_date=pendulum.now("America/Chicago"),
                            parameter_defaults=std_inputs)]...
Any help would be greatly appreciated!
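The ValidationError points at the None values in parameter_defaults failing schema validation when the flow is deserialized. A hedged workaround sketch: drop the null entries from the clock and let the flow's Parameter defaults cover the rest:

import pendulum
from prefect.schedules import Schedule
from prefect.schedules.clocks import CronClock

# Keep only non-null defaults; nulls appear to trip the
# 'Field may not be null' validation on the clock's schema.
clock_defaults = {k: v for k, v in std_inputs.items() if v is not None}

schedule = Schedule(clocks=[
    CronClock("7 6 * * 1-5",
              start_date=pendulum.now("America/Chicago"),
              parameter_defaults=clock_defaults)])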
Nick Hart
10/12/2021, 5:40 PM

Hugo Slepicka
10/12/2021, 6:32 PM
flow.run(); instead of running on the registered server with the local agent, it runs on my notebook.
I could not find in the docs how to trigger the execution of the registered flow using the local server. Could you point me to an example or docs (which I likely missed)?
What I am looking for is a way to execute the registered flow on my server via Python.
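A minimal sketch of triggering a registered flow on the server from Python, assuming Prefect 1.x; the flow name here is a placeholder:

from prefect import Client

client = Client()  # picks up the API endpoint from your Prefect config

# Look up the latest registered version of the flow by name.
result = client.graphql(
    """
    query {
      flow(where: {name: {_eq: "my-flow"}}, order_by: {version: desc}, limit: 1) {
        id
      }
    }
    """
)
flow_id = result.data.flow[0].id

# Create a run; the agent polling the server will pick it up and execute it.
run_id = client.create_flow_run(flow_id=flow_id)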
Adam Brusselback
10/12/2021, 6:45 PM

Dominic Pham
10/13/2021, 1:00 AM
KeyError: 'Spider not found: scraper'
when I try to debug my program. However, when I call the function the task is calling manually, it works as intended.
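'Spider not found' inside a task but not on a direct call often points at the Scrapy project settings not being discoverable at runtime (wrong working directory). A hedged sketch that makes the settings explicit; the spider name matches the error, everything else is an assumption:

from prefect import task
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

@task
def run_scraper():
    # get_project_settings() relies on scrapy.cfg being discoverable from the
    # current working directory (or on SCRAPY_SETTINGS_MODULE being set).
    process = CrawlerProcess(get_project_settings())
    process.crawl("scraper")  # spider name as registered in the project
    process.start()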
Sean Talia
10/13/2021, 1:51 AM
numpy, pandas, snowflake-connector-python). Then our users will go and write a handful of their own custom Python classes and modules that they need for their flow. In order to make these custom modules available for use in their flows, people have been creating slight variations of the same Docker image that have that same set of Python packages installed, and then just COPY their project's code into the image – at that point, their RunConfig image has everything they need to run their flow.
One of the issues I'm foreseeing with this approach is that it's going to lead to a lot of image bloat in terms of the number of images we'll have in use across our flows – images whose Dockerfiles might be found across several different repositories – so we'll be maintaining a lot of images that hardly differ from one another save for a handful of custom Python modules that people copy into them. I'm trying to see if there's an approach that avoids this, or at least avoids it in a way that has a favorable tradeoff. Maybe instead of these custom modules needing to be available at registration/build time, they could simply be retrieved at runtime from S3, for example? If that were possible, the management overhead would move to S3 rather than our image repository, but I think that's easier to deal with; plus, many of our users who need/want to build these flows don't necessarily want to be in the business of building and managing Docker images.
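A rough sketch of that runtime-retrieval idea, assuming boto3 and an S3 layout you'd control (the bucket and key names are made up): pull the user's module archive at the start of the flow, unpack it, and put it on sys.path before anything imports it.

import sys
import tarfile
import tempfile

import boto3
from prefect import task

@task
def load_custom_modules():
    # Download the project's module bundle (bucket/key are hypothetical).
    tmpdir = tempfile.mkdtemp()
    archive = f"{tmpdir}/modules.tar.gz"
    boto3.client("s3").download_file("my-flow-modules", "team-x/modules.tar.gz", archive)

    # Unpack and make the modules importable for the rest of the flow run.
    with tarfile.open(archive) as tar:
        tar.extractall(tmpdir)
    sys.path.insert(0, tmpdir)

Note this task would need to run on every container that imports the custom code, so it fits a single shared base image plus per-team S3 bundles.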
Martin Durkac
10/13/2021, 11:54 AM

Madison Schott
10/13/2021, 2:54 PM

Madison Schott
10/13/2021, 3:53 PM
Error during execution of task: HTTPError('404 Client Error: Not Found for url: <https://api.fivetran.com/v1/connectors/%0Alethal_conservation>')
Lon Nix
10/13/2021, 4:31 PM

Greg Adams
10/13/2021, 4:33 PM

ek
10/13/2021, 5:36 PM
NAMESPACE=prefect-server
VERSION=2021.09.02
kubectl create namespace $NAMESPACE
helm install -n $NAMESPACE $NAMESPACE \
--version $VERSION \
--values ./helm/values.yaml \
prefecthq/prefect-server
Here is my helm values.yaml:
postgresql:
  postgresqlDatabase: postgres
  postgresqlUsername: postgres
  existingSecret: [secret]
  servicePort: 5432
  externalHostname: "x.x.x.x"
  useSubChart: false
Is there anything else I'm missing to set in my values.yaml? Much appreciated!

Sergey Shamsuyarov
10/13/2021, 7:30 PM
import pendulum
from prefect import Flow, Parameter, unmapped
from prefect.schedules.clocks import CronClock

def gen_prefect_cronclock(config, report_conf_path):
    clocks = []
    for report, cron in get_all_reports_cron_dict(report_conf_path).items():
        for tz, ppk_name_list in get_tz_ppk_dict(config).items():
            clocks.append(
                CronClock(cron,
                          start_date=pendulum.datetime(1970, 1, 1, tz=tz),
                          parameter_defaults={'report_path': report,
                                              'ppk_name_list': ppk_name_list}))
    return clocks
Run the flow:
with Flow("main-report-flow", main_schedule) as main_flow:
conf = Parameter('config', default=config)
report_config_path = Parameter('report_config_path',
default=REPORT_CONF_PATH)
report_path = Parameter('report_path', default="")
ppk_name_list = Parameter('ppk_name_list', default=[])
tasck_execute_send_report.map(unmapped(conf),
unmapped(report_config_path),
unmapped(report_path),
ppk_name_list)
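The flow references main_schedule without showing how it was built; presumably it comes from the generator above, along the lines of this sketch:

from prefect.schedules import Schedule

main_schedule = Schedule(clocks=gen_prefect_cronclock(config, REPORT_CONF_PATH))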
Kyle McChesney
10/13/2021, 8:23 PM
from prefect import task, Flow
from prefect.tasks.aws.batch import BatchSubmit
from prefect.tasks.aws.client_waiter import AWSClientWait

from settings import JOB_QUEUE, JOB_DEF
@task
def trigger_batch_job(input):
    ...
    return BatchSubmit(
        job_name=job_name,
        job_definition=JOB_DEF,
        job_queue=JOB_QUEUE,
        batch_kwargs=...,
    ).run()

with Flow(...):
    inputs = some_task()
    wait_res = AWSClientWait(
        client='batch',
        waiter_name='JobComplete',
    )(
        waiter_kwargs={
            'jobs': [
                trigger_batch_job(inputs),
            ],
            'WaiterConfig': {
                'Delay': 10,
                'MaxAttempts': 10000,
            },
        },
    )
And a settings.py file that looks like:
# settings.py
import os

JOB_DEF = os.environ.get(
    'PREFECT_FLOWS_JOB_DEF',
    'job-latest',
)
JOB_QUEUE = os.environ.get(
    'PREFECT_FLOWS_JOB_QUEUE',
    'job-gpu-dev',
)
The idea here is to have dynamic values for these settings based on which environment this is running in (we have dev/test/prod AWS accounts).
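One hedged caveat: with pickle-based storage, module-level os.environ.get values can get baked in when the flow is registered rather than when it runs. A sketch of resolving them at run time instead, inside the task body (same variable names as above):

import os
from prefect import task

@task
def trigger_batch_job(input):
    # Looked up when the task executes on the agent, so each AWS
    # account/environment can supply its own values at run time.
    job_def = os.environ.get('PREFECT_FLOWS_JOB_DEF', 'job-latest')
    job_queue = os.environ.get('PREFECT_FLOWS_JOB_QUEUE', 'job-gpu-dev')
    ...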
Kyle McChesney
10/13/2021, 8:24 PM

Kyle McChesney
10/13/2021, 8:25 PM

Hugo Slepicka
10/13/2021, 9:28 PM

Owen McMahon
10/13/2021, 9:30 PM
host_config arg of DockerRun. However, looking at the Docker Python SDK docs, it appears the ports for forwarding additionally have to be passed into the create_container() function (https://docker-py.readthedocs.io/en/stable/api.html#docker.api.container.ContainerApiMixin.create_container), which, looking at the Prefect source code, is not an additional arg I can pass into the DockerRun run config to then be sent into create_container(). So it doesn't seem possible.
This feels like something pretty common and doable, and I'm guessing I'm just looking at it from the wrong direction. Has anyone accomplished this before and have any suggestions? Thanks!
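For reference, a sketch of the attempt as described; the port numbers are made up, and whether the agent also passes matching ports into create_container() is exactly the gap being asked about:

from prefect.run_configs import DockerRun

run_config = DockerRun(
    host_config={
        # docker-py create_host_config()-style option
        "port_bindings": {8888: 8888},
    },
)

Per the docker-py docs linked above, port_bindings in the host config only take effect if the container is also created with a corresponding ports argument, which is the argument DockerRun doesn't expose here.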