# prefect-getting-started
b
Hello everyone, I am trying to determine if Prefect is a good choice for a workflow problem I have, namely simulation pre- and post-processing (and to a certain extent, the actual running of it). Currently, I typically use Python scripts and a rudimentary application I wrote to take in a YAML file containing information about where the data is, how I want it to be transformed, which graphs I want to be made, and perhaps even compiling everything into a PowerPoint presentation. Over a few iterations, I have improved the application's flexibility, but to a certain extent, I feel I am building an application that is not 'core' business. So I started looking online (with some AI help šŸ˜‰) and arrived here. However, I am not certain about the cloud aspects. Is it possible to run Prefect without any remote/cloud interaction? Ideally, I'd like to define tasks and a flow, and run it from the command line, preferably on Windows. Is this a good use case for Prefect? Thanks in advance.
n
hi @Bevan Jones

> I feel I am building an application that is not 'core' business

sounds like the AI help isn't leading you astray šŸ™‚, this is a great reason to use Prefect!

> Is it possible to run Prefect without any remote/cloud interaction?

yes! depending on what you mean by that, you can:
• host a Prefect server (i.e. run `prefect server start` someplace) and run your work against this server without depending on our cloud or any remote environment
• use Prefect Cloud for the server, which means that all your work can still happen locally; it's just that metadata on its execution is reported to the managed server (this is what we call the hybrid model), so you don't need to worry about managing the server / DB

does that help?
šŸ™Œ 1
b
Yes it does, thank you. Perhaps I can take more of your time. I want to create a solution usable by others in my team, but some execution details are still unclear.

Consider a directory containing several simulation sub-directories. Each simulation may come from a different commercial simulation package. I'll need to:
1. Convert everything into a common representation to avoid confusion.
2. Perform the aforementioned data processing, potentially including custom operations.
3. Place all of this into some output.

Currently, my approach is to create a Python 'application' project directory with further Python subdirectories acting as packages for the above tasks. The user runs 'main.py' via the CLI with a target YAML file specifying the exact post-processing steps (typically located in the project directory containing the simulations). This is advantageous as it separates the 'program' and the 'user', allowing post-processing to be represented in a YAML file and making it easily transferable.

Converting this to Prefect:
1. Would the best approach be to turn these 'subdirectory packages' into full pip-installable Python packages to import inside the Prefect flow files in simulation project directories?
2. How do I keep 'user' code to a minimum to avoid multiple incompatible workflow versions in the office? Currently, there is a YAML file, which must conform to a schema.
3. What directory structure would you recommend? Perhaps:
|- simulation directories
|- post-processed data?
|- workflow
|-- main_flow.py
? (other subdirectories?)

I apologize for the broad question. I hope it makes sense, and thank you for your time.
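One common pattern for keeping 'user' code minimal, as described in point 2 above, is a step registry: the installed package owns all the Python callables, and the user's YAML file only names the steps to run, so everyone in the office shares one code version. A minimal sketch, assuming PyYAML is installed; the step names (`normalize`, `total`) and the `run_spec` helper are hypothetical, not part of Prefect:

```python
import yaml  # PyYAML, assumed installed alongside the package

# Hypothetical registry: maps step names (as they appear in the
# user's YAML file) to callables shipped in the installed package.
STEP_REGISTRY = {
    "normalize": lambda data: [x / max(data) for x in data],
    "total": lambda data: sum(data),
}

# Example of what a user's YAML spec might look like.
SPEC = """
steps:
  - normalize
  - total
"""


def run_spec(spec_text: str, data):
    """Apply each named step from the YAML spec to the data, in order."""
    spec = yaml.safe_load(spec_text)
    for name in spec["steps"]:
        if name not in STEP_REGISTRY:
            # Reject steps outside the schema instead of guessing.
            raise ValueError(f"unknown step: {name}")
        data = STEP_REGISTRY[name](data)
    return data


print(run_spec(SPEC, [1.0, 2.0, 4.0]))  # prints 1.75
```

The body of `run_spec` could equally live inside a Prefect flow, with each registry entry wrapped as a task; either way, the YAML stays a declarative description and the version-controlled package stays the single source of behavior.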