Sorry got one more question slightly smiling face looking at Prefect Community #ask-community

Jacques

04/22/2020, 9:05 PM

Sorry, got one more question 🙂 - looking at the ETL examples where you are doing something like extracting a list of values, mapping them to a transform (a map-reduce type operation), and then finally loading the reduced transform result into e.g. a database. Is there a way to have this fan out instead, in other words to skip the reduce step after the map rather than ending with one task? Not sure if that makes sense, so as an example: would it be possible for 1 extract task to produce 10 results, which kick off 10 parallel transforms, each producing one output that is then passed to one of 10 parallel load tasks?

nicholas

04/22/2020, 9:12 PM

Hi @Jacques definitely! The reduce step on `.map()` is completely optional; if I understand you correctly, you could map again over the results of the transform `.map()` to make 10 separate load tasks.

Jacques

04/22/2020, 9:22 PM

Ok, I think I missed an important piece here. If some mapped transforms take 1s and others 5 mins, and I map over the result of the map, would the loads need to wait for the last transform to complete before starting?

Alex Cano

04/22/2020, 9:24 PM

@Jacques Currently, yes. However, they're working on implementing depth-first execution (DFE), which would let the load for the 1-second transform start without waiting for the 5-minute transform to complete.

nicholas

04/22/2020, 9:28 PM

@Jacques as @Alex Cano says, yes, that's something we're actively working on (there's an issue here: https://github.com/PrefectHQ/prefect/issues/2041)
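To illustrate the scheduling difference being discussed, here is a plain-Python sketch using only the standard library. It is not Prefect's implementation; the function names and sleep-based durations are made up for illustration. The breadth-first variant finishes every transform before any load starts, while the depth-first variant lets each item run its own transform and load as one chain.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transform(x):
    # Hypothetical transform with uneven duration: larger inputs take longer.
    time.sleep(0.01 * x)
    return x * 10


def load(y):
    # Hypothetical load step; returns a string instead of writing to a database.
    return f"loaded {y}"


def pipeline(x):
    # One item's full chain: its load starts as soon as its own transform ends.
    return load(transform(x))


def run_breadth_first(items, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Collecting the list acts as a barrier: it waits for the slowest transform.
        transformed = list(pool.map(transform, items))
        return list(pool.map(load, transformed))


def run_depth_first(items, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker carries one item through transform and load independently.
        return list(pool.map(pipeline, items))


if __name__ == "__main__":
    print(run_breadth_first([1, 2, 3]))
    print(run_depth_first([1, 2, 3]))
```

Both variants return the same results in the same order; they differ only in when each load is allowed to start.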

Jacques

04/22/2020, 9:46 PM

Epic, thanks, I'll watch that issue!
