Chris Leber
11/04/2021, 7:13 PM
batch = BatchSubmit(
    job_name="test",
    job_definition="run_test",
    job_queue="test-queue",
    boto_kwargs={},
)
with Flow(
    "test Batch submit",
    storage=S3(bucket="prefect-storage"),
    run_config=ECSRun(task_definition=task_definition),
) as flow:
    s3_path = Parameter("s3_path", default="s3://data/TEST")
    file_suffix = Parameter("file_suffix", default=".txt")
    array_size = Parameter("array_size", default=10)
    # Calling the task (rather than batch.run) registers it in the flow.
    batch(
        batch_kwargs={
            "arrayProperties": {"size": array_size},
            "parameters": {"query_s3_path": s3_path, "file_suffix": file_suffix},
        }
    )
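An editorial sketch, not from the thread: BatchSubmit returns the AWS Batch job ID, and one common follow-up is to block on completion with the AWSClientWait task; the "JobComplete" waiter name assumes Prefect's bundled custom Batch waiters, and the job names are illustrative.
from prefect import Flow
from prefect.tasks.aws.batch import BatchSubmit
from prefect.tasks.aws.client_waiter import AWSClientWait

batch = BatchSubmit(
    job_name="test", job_definition="run_test", job_queue="test-queue"
)
# AWSClientWait wraps boto3-style waiters; "JobComplete" polls until
# the submitted job reaches a terminal state.
wait_for_job = AWSClientWait(client="batch", waiter_name="JobComplete")

with Flow("batch-submit-and-wait") as flow:
    job_id = batch(batch_kwargs={"arrayProperties": {"size": 10}})
    wait_for_job(waiter_kwargs={"jobs": [job_id]})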
Arun Giridharan
11/04/2021, 8:13 PM
… upstream_tasks to differentiate them?
Fina Silva-Santisteban
11/04/2021, 8:28 PM
… Create API Key option in Prefect Cloud’s Service Account page. When I use that for flow registration I get this warning:
/opt/hostedtoolcache/Python/3.9.7/x64/lib/python3.9/site-packages/prefect/client/client.py:175: UserWarning: Client was created with an API token configured for authentication. API tokens are deprecated, please use API keys instead.
I set the auth token/key in a Python file like this:
client = prefect.Client(
    api_token='secret value'
)
client.save_api_token()
The docs only seem to show how to set auth_api_key when using the Prefect CLI or a GraphQL query. Does this mean that when using the Python API you’re still using the deprecated token? (I’ve peeked into Prefect’s implementation of the Client class and everything still says api_token, but that might be a red herring!)
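An editorial aside, not from the thread: on recent 1.x clients the key is passed as api_key rather than api_token and persisted with save_auth_to_disk(). A minimal sketch, assuming Prefect 0.15+:
import prefect

# A sketch, assuming Prefect 0.15+: authenticate with an API key
# (api_token is the deprecated path that triggers the warning above).
client = prefect.Client(api_key="secret value")
client.save_auth_to_disk()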
Derek Heyman
11/05/2021, 5:07 AM
prefect.exceptions.AuthorizationError: [{'path': ['create_flow_run'], 'message': 'AuthenticationError: Forbidden', 'extensions': {'code': 'UNAUTHENTICATED'}}]
Igor Adamski
11/05/2021, 10:11 AM
data_downloader = DataDownloader(db_config=db_config)
downloaded_data = data_downloader.map(
    tickers_and_dates=grouped_list,
    columns=unmapped(columns),
)
save_data = SaveData(db_config=db_config, override_exists_files=True)
save_data.bind(data=downloaded_data, table_name='tmp_events')
I’m getting very weird behaviour where some children of the mapped task start running again after completion. As a result, the SaveData task starts running before all the mapped children of the DataDownloader task have finished. See the screenshots attached in the thread; it would be great if someone could offer any guidance on this. In the second screenshot, the top bar is the DataDownloader bar, and you can see that the SaveData bar (just below) starts before the DataDownloader one is finished.
Klemen Strojan
11/05/2021, 11:25 AM
Adam Everington
11/05/2021, 1:49 PM
Martin Goldman
11/05/2021, 2:49 PM
Vamsi Reddy
11/05/2021, 2:51 PM
Failed to load and execute Flow's environment: FlowStorageError('An error occurred while unpickling the flow:\n ModuleNotFoundError("No module named \'utils\'")\nThis may be due to a missing Python module in your current environment. Please ensure you have all required flow dependencies installed.')
I am able to run the flow locally but I get the error when I trigger it from Prefect Cloud.
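An editorial aside, not from the thread: this FlowStorageError usually means the flow's helper module isn't importable in the execution environment. With Docker storage, one documented pattern is to copy the module into the image and extend PYTHONPATH; the paths below are illustrative.
from prefect.storage import Docker

# A sketch: ship a local utils.py into the image and make it importable
# by the unpickled flow at runtime.
storage = Docker(
    files={"/local/path/to/utils.py": "/modules/utils.py"},
    env_vars={"PYTHONPATH": "$PYTHONPATH:/modules"},
)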
Dominik Prester
11/05/2021, 5:56 PM
Marwan Sarieddine
11/05/2021, 6:49 PM
Vamsi Reddy
11/05/2021, 9:12 PM
itay livni
11/06/2021, 3:03 PM
… orion questions/bugs? (2) Is there any flow visualization in orion? I could not find anything in the docs.
Thanks.
Jack Sundberg
11/06/2021, 4:15 PM
nruns = FlowRuns.objects.filter(flow__name="MyExampleFlow", state__in=["Running", "Submitted"]).count()
But for graphql... I'm kinda lost. And how would I format this in Python for the client.query() method?
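An editorial sketch, not from the thread, of one way that count might be expressed against the Prefect 1.x (Hasura-style) GraphQL schema and sent with Client.graphql(); the flow_run_aggregate filter below is an assumption from that schema.
import prefect

# A sketch, assuming the Prefect 1.x (Hasura-style) schema.
query = """
query {
  flow_run_aggregate(
    where: {
      flow: { name: { _eq: "MyExampleFlow" } }
      state: { _in: ["Running", "Submitted"] }
    }
  ) {
    aggregate { count }
  }
}
"""

client = prefect.Client()
# Client.graphql returns a dict-like GraphQLResult.
result = client.graphql(query)
nruns = result.data.flow_run_aggregate.aggregate.count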
itay livni
11/07/2021, 3:21 AM
MyClass(Task)...
Greg Adams
11/07/2021, 6:10 PM
Bastian Röhrig
11/08/2021, 10:26 AM
ale
11/08/2021, 11:13 AM
… create_flow_run and wait_for_flow_run to replace the (legacy?) StartFlowRun.
The FlowRun doc page warns that this feature is experimental and may be subject to change in the future.
Would it be possible to know which parts of the module can be considered stable and which are under review/refactoring?
This would be super helpful for us to understand how to approach future upgrades of our Prefect Core deployment.
cc @Kevin Kho @Zanie @grandimk
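For context, a minimal sketch of the pattern under discussion, assuming the Prefect 1.x tasks exported from prefect.tasks.prefect; the flow and project names are illustrative.
from prefect import Flow
from prefect.tasks.prefect import create_flow_run, wait_for_flow_run

with Flow("parent-flow") as flow:
    # Kick off a run of the child flow, then block until it finishes,
    # streaming its logs into the parent run.
    child_run_id = create_flow_run(
        flow_name="child-flow", project_name="my-project"
    )
    wait_for_flow_run(child_run_id, stream_logs=True)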
Aqib Fayyaz
11/08/2021, 11:38 AM
Chris Arderne
11/08/2021, 12:29 PM
upstream_tasks is a bit unclear (to me). The following code:
@task()
def task_a():
    ...

@task()
def task_b(param):
    ...

@task()
def task_c():
    ...

with Flow("example") as flow:
    a = task_a()
    task_b(a)
    task_c(upstream_tasks=[task_b])
produces a DAG as follows:
task_a -> task_b
task_b -> task_c
But I want task_a -> task_b -> task_c, i.e. task_b must run only once, doesn't return anything, and must complete before task_c starts. To achieve this I've just assigned a fake return variable and passed that to upstream_tasks as follows:
with Flow("example") as flow:
    a = task_a()
    b = task_b(a)
    task_c(upstream_tasks=[b])
Is this the best way to do this?
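An editorial note, not from the thread: passing the called task's return value, as above, is the usual fix, because in Prefect 1.x's functional API each call to a task adds a copy to the flow, so upstream_tasks=[task_b] registers a second, unbound copy of task_b. A sketch of the same wiring via the imperative API, which works on task instances directly:
from prefect import Flow, task

@task
def task_a():
    ...

@task
def task_b(param):
    ...

@task
def task_c():
    ...

# Imperative wiring: the decorated objects are Task instances, so no
# implicit copies are created when declaring dependencies.
flow = Flow("example")
flow.set_dependencies(
    task=task_b, upstream_tasks=[task_a], keyword_tasks={"param": task_a}
)
flow.set_dependencies(task=task_c, upstream_tasks=[task_b])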
Mike Lev
11/08/2021, 12:39 PM
Filip Lindvall
11/08/2021, 1:32 PM
Marko Herkaliuk
11/08/2021, 4:58 PM
Dotan Asselmann
11/08/2021, 5:59 PM
Braun Reyes
11/08/2021, 7:29 PM
Braun Reyes
11/08/2021, 7:30 PM
KhTan
11/08/2021, 7:53 PM
Dotan Asselmann
11/08/2021, 8:42 PM
charles
11/08/2021, 8:46 PM
Belal Aboabdo
11/08/2021, 9:32 PM