Alex de Geofroy
11/16/2023, 5:49 PM
flow.from_source. The child flow consists of a sequence of tasks. The first task is unmapped and outputs a list, which is fed into a sequence of mapped tasks. The mapped tasks all converge on one final unmapped task which collects the upstream results and performs a single action. The flow works perfectly when I run it directly, but when I run it using flow.from_source, it runs only the first and last unmapped tasks. I was under the impression that it should execute exactly the same way regardless of how it's invoked. Really hoping someone can help me out on this one because I'm totally stumped. Thanks!
Nate
11/16/2023, 6:04 PM
Alex de Geofroy
11/16/2023, 8:53 PM
Nate
11/16/2023, 8:54 PM
Marvin
11/16/2023, 8:55 PM
# Import necessary libraries
import pandas as pd

# Create a simple dataframe
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 15, 10, 20, 15]
})

# Define a function to calculate average
def calculate_average(column):
    return column.mean()

# Apply the function to the 'B' column
average = calculate_average(df['B'])
print(average)
This MRE creates a simple dataframe and calculates the average of one of its columns. It's a complete, self-contained piece of code that can be run independently to reproduce the issue (in this case, calculating the average).
Nate
11/16/2023, 8:55 PM
Alex de Geofroy
11/17/2023, 4:14 AM
os.path.dirname(__file__) as a reference point for reading some files. Simply changing to use os.getcwd() fixed the problem. It would be nice if the documentation was explicit about the behavior of flow.from_source and that it clones to a temp directory outside of the CWD.
Nate
11/17/2023, 6:35 AM
Nate
11/17/2023, 6:36 AM
Alex de Geofroy
11/18/2023, 2:56 AM
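For anyone who lands here with the same symptom: the path difference described earlier in the thread can be reproduced without Prefect. The sketch below is not code from the thread. It assumes only what was reported above, namely that the flow file ends up in a temp directory outside of the CWD, and imitates that by writing a small module into a temp directory and importing it from there. The module name child_flow and both helper functions are hypothetical.

```python
import importlib.util
import os
import tempfile

# Hypothetical stand-in for a flow file that has been cloned elsewhere.
MODULE_SOURCE = (
    "import os\n"
    "\n"
    "def file_based_dir():\n"
    "    # Directory containing this module's source file\n"
    "    return os.path.dirname(os.path.abspath(__file__))\n"
    "\n"
    "def cwd_based_dir():\n"
    "    # Directory the Python process was started in\n"
    "    return os.getcwd()\n"
)


def load_from_temp_clone():
    """Write the module into a fresh temp directory and import it from there."""
    clone_dir = tempfile.mkdtemp(prefix="clone_")
    module_path = os.path.join(clone_dir, "child_flow.py")
    with open(module_path, "w") as f:
        f.write(MODULE_SOURCE)
    spec = importlib.util.spec_from_file_location("child_flow", module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return clone_dir, module


clone_dir, child = load_from_temp_clone()
file_dir = child.file_based_dir()
cwd_dir = child.cwd_based_dir()

# The two directories differ once the module lives outside the CWD.
print("__file__-based directory:", file_dir)
print("os.getcwd():             ", cwd_dir)
```

Data files that sit next to the original source are only found through the __file__-based path if they were copied along with the module, whereas os.getcwd() keeps pointing at the directory the process was launched from. Which of the two is appropriate depends on where the files actually live at run time.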