Kevin Kho
Anna Geller
Sash Stasyk
04/27/2022, 1:42 PM
map call?
chicago-joe
04/27/2022, 7:13 PM
Alvaro Durán Tovar
04/29/2022, 9:19 AM
for flow in flows: flow.storage = docker
) and register the flows. This makes it easy to reuse the same image, dependencies, etc.
Now I've started a new project integrating Feast, and I want some flows to materialize the views. I guess that means having one flow per thing I want to materialize, then following the same pattern as above but in a different project. It is starting to be a bit too much repetition (although it works well, tbh).
Are other people doing something similar? How do you deal with distributing the code for the flows? The only option I can find is Docker images, because the flows often depend on other Python modules.
Bernardo Galvao
04/29/2022, 10:02 AM
John Jacoby
05/03/2022, 11:48 PM
Edmondo Porcu
05/04/2022, 3:03 AM
def my_task(param1, param2):
    return NewTaskSomething(param1, param2)
and then in the flow...
my_task_instance = my_task(param1, param2)
my_task_instance(param3)
However, this fails, saying that param1 and param2 are not specified. Maybe the my_task function should not be decorated with the @mytask decorator?
Edmondo Porcu
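A plain-Python sketch of the factory pattern described in the message above, with no Prefect import. NewTaskSomething here is a hypothetical stand-in class, not the real task type from the message, and the sketch does not confirm whether the decorator is the cause of the error. It only shows an undecorated factory returning a configured callable, which is then called with the remaining argument.

```python
class NewTaskSomething:
    """Hypothetical stand-in for the task class named in the message."""

    def __init__(self, param1, param2):
        # Configuration is fixed when the instance is built.
        self.param1 = param1
        self.param2 = param2

    def __call__(self, param3):
        # The runtime argument arrives only when the instance is called.
        return (self.param1, self.param2, param3)


def my_task(param1, param2):
    # Plain, undecorated factory: it only builds and returns the instance.
    return NewTaskSomething(param1, param2)


my_task_instance = my_task("a", "b")
result = my_task_instance("c")
print(result)
```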
05/04/2022, 5:49 PM
William Jamir
05/04/2022, 7:53 PM
Edmondo Porcu
05/05/2022, 1:18 AM
Ramzi A
05/06/2022, 12:55 AM
Tyler Matteson
05/06/2022, 7:27 PM
subprocess
). What should I know? What am I not asking? Tasks are a mix of polling, on-demand ETL, and scheduled ETL. I think most data professionals would describe the load as "not much", so efficiency is going to take a back seat to maintainability and ease of use.
Edmondo Porcu
05/07/2022, 1:51 AM
with Flow(''') as flow:
in the body of the script. However, since some parameters are set from environment variables in the main block, like so, that was causing an exception:
if __name__ == '__main__':
    flow = build_flow()
    flow.executor = LocalDaskExecutor()
    project_name = os.environ['PROJECT_NAME']
    spark_version = os.environ['SPARK_VERSION']
    github_repo = os.environ['GITHUB_REPO']
    git_ref = os.environ['GIT_REF']
    flow.run(
        project_name=project_name,
        spark_version=spark_version,
        github_repo=github_repo,
        git_ref=git_ref
    )
and now I have wrapped my flow definition in a function. Is that a reasonable thing to do?
Bernardo Galvao
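A minimal sketch of how the environment lookups in the entry point above could be collected in one place, so that a missing variable produces a single clear error rather than a bare KeyError. The helper name read_required_env and the example values are hypothetical; only the four variable names come from the message.

```python
import os

# Variable names taken from the message above.
REQUIRED = ["PROJECT_NAME", "SPARK_VERSION", "GITHUB_REPO", "GIT_REF"]


def read_required_env(names, environ=None):
    """Return {lowercased name: value}; raise one error listing every missing name."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if name not in environ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name.lower(): environ[name] for name in names}


# Hypothetical values, passed explicitly so the sketch runs anywhere.
example_env = {
    "PROJECT_NAME": "demo",
    "SPARK_VERSION": "3.2",
    "GITHUB_REPO": "org/repo",
    "GIT_REF": "main",
}
params = read_required_env(REQUIRED, example_env)
print(params)
```

In the script from the message, the resulting dictionary could then be unpacked into the existing call, as in flow.run(**params), since its keys match the keyword names used there.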
05/09/2022, 11:46 AM
Linh Nguyen
05/11/2022, 9:40 AM
Jason White
05/11/2022, 6:23 PM
Chris Hatton
05/11/2022, 9:40 PM
Marco Pérez
05/12/2022, 12:10 AM
Yang Ruan
05/13/2022, 9:52 PM
Sander
05/16/2022, 9:24 AM
Davide at Evo
05/16/2022, 2:26 PM
Yang Ruan
05/16/2022, 5:19 PM
Stéphan Taljaard
05/17/2022, 3:03 PM
Linh Nguyen
05/18/2022, 7:38 AM
ExternalTaskSensor
task to let a master flow wait for another master flow? Or do we need to create another master flow on top of the previous flows?
• I imagine these master flows will expand and become more complex. Child flows inside them might also overlap, e.g. one extract flow is needed in two master flows. What would you recommend regarding this?
Thanks
John Kang
05/18/2022, 3:25 PM
Validation failed! Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or
explicitly create credentials and re-run the application. For more information, please see
<https://cloud.google.com/docs/authentication/getting-started>
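A small sketch for checking the variable named in the error above before running anything else. The function name check_google_credentials is hypothetical. It only verifies that GOOGLE_APPLICATION_CREDENTIALS is set and points at an existing file; it does not validate the key itself, and Google client libraries can also pick up credentials by other means.

```python
import os
from pathlib import Path


def check_google_credentials(environ=None):
    """Return (ok, message) describing the GOOGLE_APPLICATION_CREDENTIALS setting."""
    environ = os.environ if environ is None else environ
    value = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not Path(value).is_file():
        return False, f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {value}"
    return True, f"Using credentials file: {value}"


ok, message = check_google_credentials()
print(ok, message)
```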
John Kang
05/18/2022, 6:10 PM
prefect deployment ls
but when I try to prefect deployment inspect 'leonardo_dicapriflow/leonardo-deployment'
it does not show up. Also, when I try to run the deployment locally, it does not work either. FYI, my remote storage is Google Cloud.
(Capacity_venv) C:\Users\JKANG1\PycharmProjects\Manheim_Capacity\main_python_files\cockroachdb_write_after_etl>prefect deployment ls
C:\ProgramData\Anaconda3\envs\Capacity_venv\lib\site-packages\pkg_resources\__init__.py:122: PkgResourcesDeprecationWarning: winpty is an invalid version and will not be supported in a future release
warnings.warn(
Deployments
┌──────────────────────────────────────────┬──────────────────────────────────────┐
│ Name                                     │ ID                                   │
├──────────────────────────────────────────┼──────────────────────────────────────┤
│ leonardo_dicapriflow/leonardo-deployment │ 19aacccb-d89e-406e-bd1a-0ba4bf2dedb5 │
└──────────────────────────────────────────┴──────────────────────────────────────┘
(Capacity_venv) C:\Users\JKANG1\PycharmProjects\Manheim_Capacity\main_python_files\cockroachdb_write_after_etl>prefect deployment inspect 'leonardo_dicapriflow/leonardo-deployment'
C:\ProgramData\Anaconda3\envs\Capacity_venv\lib\site-packages\pkg_resources\__init__.py:122: PkgResourcesDeprecationWarning: winpty is an invalid version and will not be supported in a future release
warnings.warn(
Deployment "'leonardo_dicapriflow/leonardo-deployment'" not found!
jedi
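The error above prints the deployment name with the single quotes still attached, which suggests the quotes reached the program as part of the argument. The prompt looks like cmd.exe, which does not treat single quotes as quoting characters. The sketch below uses Python's shlex module to illustrate the difference; its non-POSIX mode resembles this behavior but is not an exact model of cmd.exe.

```python
import shlex

command_line = "prefect deployment inspect 'leonardo_dicapriflow/leonardo-deployment'"

# POSIX shell rules: single quotes delimit the argument and are removed.
posix_args = shlex.split(command_line, posix=True)

# Non-POSIX mode keeps the quote characters inside the token.
literal_args = shlex.split(command_line, posix=False)

print(posix_args[-1])
print(literal_args[-1])
```

Under that reading, running the same command with double quotes, or with no quotes since the name contains no spaces, would be worth trying.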
05/18/2022, 6:57 PM
jedi
05/18/2022, 6:59 PM
Aaron Goebel
05/18/2022, 8:08 PM