Sash Stasyk 04/27/2022, 1:42 PM
Alvaro Durán Tovar 04/29/2022, 9:19 AM
) and register the flows. This makes it easy to reuse the same image, dependencies, etc. Now I've started a new project integrating Feast and I want to have some flows to materialize the views. That means, I guess, having one flow per thing I want to materialize, then following the same pattern as above but in a different project. It's starting to be a bit too much repetition (although it works well, tbh). Are other people doing something similar? How do you deal with distributing the code for the flows? I can only find examples using Docker images, because the flows often have dependencies on other Python modules.
for flow in flows:
    flow.storage = docker
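One way to cut the repetition is to generate the per-view flows from a plain list and apply the shared storage in the same loop. A minimal stand-alone sketch of that shape (no Prefect imports; `make_materialize_flow`, `DockerStorage`, and the view names are illustrative stand-ins, not real APIs):

```python
class DockerStorage:
    """Placeholder for a shared Docker storage object."""
    def __init__(self, image):
        self.image = image

def make_materialize_flow(view_name):
    """Build a flow-like object that would materialize one feature view."""
    return {"name": f"materialize-{view_name}", "storage": None}

# One flow per thing to materialize, driven by data instead of copy-paste
views = ["driver_stats", "customer_profile", "orders"]
docker = DockerStorage(image="my-registry/feast-flows:latest")

flows = [make_materialize_flow(v) for v in views]
for flow in flows:
    flow["storage"] = docker  # same pattern as above: shared storage for every flow
```

Adding a new view then only means appending one entry to the list.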
Bernardo Galvao 04/29/2022, 10:02 AM
John Jacoby 05/03/2022, 11:48 PM
Edmondo Porcu 05/04/2022, 3:03 AM
however this fails, saying that param1 and param2 are not specified. Maybe in reality the my_task function should not be decorated with the @task decorator:
def my_task(param1, param2):
    return NewTaskSomething(param1, param2)

and then in the flow...

my_task_instance = my_task(param1, param2)
my_task_instance(param3)
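That two-step pattern can be sketched in plain Python (no Prefect imports; `NewTaskSomething` here is a stand-in class, not the real Task subclass): the undecorated factory binds param1/param2 at build time, and the returned instance is called with param3 at run time.

```python
class NewTaskSomething:
    """Stand-in for a Task subclass: configured at build time, called at run time."""
    def __init__(self, param1, param2):
        self.param1 = param1
        self.param2 = param2

    def __call__(self, param3):
        # run-time behaviour combines build-time and run-time parameters
        return (self.param1, self.param2, param3)

# Plain function, NOT decorated: it only constructs the task instance.
def my_task(param1, param2):
    return NewTaskSomething(param1, param2)

# "in the flow": configure first, then call with the run-time argument
my_task_instance = my_task("a", "b")
result = my_task_instance("c")  # ("a", "b", "c")
```

The key point is that only the `__call__` step belongs in the flow's run phase; the factory call is ordinary Python executed while the flow is being built.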
Edmondo Porcu 05/04/2022, 5:49 PM
William Jamir 05/04/2022, 7:53 PM
Edmondo Porcu 05/05/2022, 1:18 AM
Ramzi A 05/06/2022, 12:55 AM
Tyler Matteson 05/06/2022, 7:27 PM
). What should I know? What am I not asking? Tasks are a mix of polling, on-demand ETL, and scheduled ETL. I think most data professionals would describe the load as "not much", so efficiency is going to take a back seat to maintainability and ease of use.
Edmondo Porcu 05/07/2022, 1:51 AM
in the body of the script. However, since there are parameters that are set from environment variables in the main, like so, that was causing an exception
with Flow('...') as flow:
and now I have wrapped my flow definition in a function. Is that a reasonable thing to do?
if __name__ == '__main__':
    flow = build_flow()
    flow.executor = LocalDaskExecutor()
    project_name = os.environ['PROJECT_NAME']
    spark_version = os.environ['SPARK_VERSION']
    github_repo = os.environ['GITHUB_REPO']
    git_ref = os.environ['GIT_REF']
    flow.run(
        project_name=project_name,
        spark_version=spark_version,
        github_repo=github_repo,
        git_ref=git_ref
    )
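Wrapping the flow definition in a function is a common way to keep environment lookups out of module import time. A minimal stand-alone sketch of the same shape (dummy `build_flow`, no Prefect imports, and illustrative env-var names with defaults so the script is import-safe):

```python
import os

def build_flow():
    """Stand-in flow factory: nothing touches the environment at import time."""
    def run(**params):
        return params  # echo the parameters the real flow.run() would receive
    return run

def main():
    flow = build_flow()
    # env lookups happen here, not at module import, so importing this file
    # can never raise a KeyError for a missing variable
    return flow(
        project_name=os.environ.get("PROJECT_NAME", "demo-project"),
        spark_version=os.environ.get("SPARK_VERSION", "3.2.1"),
    )

if __name__ == "__main__":
    main()
```

So yes, the function-wrapping approach is reasonable: importing the module stays side-effect-free, and the `__main__` guard is the only place that reads the environment.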
Bernardo Galvao 05/09/2022, 11:46 AM
Linh Nguyen 05/11/2022, 9:40 AM
Jason White 05/11/2022, 6:23 PM
Chris Hatton 05/11/2022, 9:40 PM
Marco Pérez 05/12/2022, 12:10 AM
Yang Ruan 05/13/2022, 9:52 PM
Sander 05/16/2022, 9:24 AM
Davide at Evo 05/16/2022, 2:26 PM
Yang Ruan 05/16/2022, 5:19 PM
Stéphan Taljaard 05/17/2022, 3:03 PM
Linh Nguyen 05/18/2022, 7:38 AM
task to let a master flow wait for another master flow? Or do we need to create another master flow on top of the previous flows?
• I would imagine these master flows will expand and become more complex. Also, child flows inside them might overlap, e.g. one extract flow is needed in two master flows. What would you recommend regarding this? Thanks
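In Prefect 1.x the usual answer is a flow-of-flows: an orchestrator flow that triggers child flows with `create_flow_run` and blocks on them with `wait_for_flow_run` (both from `prefect.tasks.prefect`). The overlap concern can be sketched independently of Prefect with a toy orchestrator (all names here are illustrative, not Prefect APIs): a shared cache makes a child that two masters both need run only once per pass.

```python
def run_child(name, cache, runs):
    """Run a child flow unless this orchestration pass already ran it."""
    if name not in cache:
        runs.append(name)  # pretend this kicks off the child flow
        cache[name] = f"result-of-{name}"
    return cache[name]

def master_a(cache, runs):
    run_child("extract", cache, runs)       # shared dependency
    return run_child("transform_a", cache, runs)

def master_b(cache, runs):
    run_child("extract", cache, runs)       # same shared dependency
    return run_child("transform_b", cache, runs)

cache, runs = {}, []
master_a(cache, runs)
master_b(cache, runs)
# "extract" appears once in `runs` even though both masters needed it
```

With real Prefect, the analogous trick is idempotency keys or caching on the shared child's run, so two master flows asking for the same extract don't execute it twice.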
John Kang 05/18/2022, 3:25 PM
Validation failed! Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or
explicitly create credentials and re-run the application. For more information, please see
John Kang 05/18/2022, 6:10 PM
but when I try to
prefect deployment ls
it does not show up. Also, when I try to run the deployment locally it does not work either. FYI, my remote storage is through Google Cloud.
prefect deployment inspect 'leonardo_dicapriflow/leonardo-deployment'
C:\ProgramData\Anaconda3\envs\Capacity_venv\lib\site-packages\pkg_resources\__init__.py:122: PkgResourcesDeprecationWarning: winpty is an invalid version and will not be supported in a future release
  warnings.warn(
                                Deployments
┌──────────────────────────────────────────┬──────────────────────────────────────┐
│ Name                                     │ ID                                   │
├──────────────────────────────────────────┼──────────────────────────────────────┤
│ leonardo_dicapriflow/leonardo-deployment │ 19aacccb-d89e-406e-bd1a-0ba4bf2dedb5 │
└──────────────────────────────────────────┴──────────────────────────────────────┘
(Capacity_venv) C:\Users\JKANG1\PycharmProjects\Manheim_Capacity\main_python_files\cockroachdb_write_after_etl>prefect deployment ls
C:\ProgramData\Anaconda3\envs\Capacity_venv\lib\site-packages\pkg_resources\__init__.py:122: PkgResourcesDeprecationWarning: winpty is an invalid version and will not be supported in a future release
  warnings.warn(
Deployment "'leonardo_dicapriflow/leonardo-deployment'" not found!
(Capacity_venv) C:\Users\JKANG1\PycharmProjects\Manheim_Capacity\main_python_files\cockroachdb_write_after_etl>prefect deployment inspect 'leonardo_dicapriflow/leonardo-deployment'
jedi 05/18/2022, 6:57 PM
jedi 05/18/2022, 6:59 PM
Aaron Goebel 05/18/2022, 8:08 PM