Carl
02/25/2021, 10:15 AM
upstream_task will fix this because the load(clean) task needs to run after the DataFrame() bit.
clean = impute.map(data_cols, replacement_dict=unmapped({np.nan: 0})) # task
clean = DataFrame(clean, column_names) # function, not a task
load(clean) # task
Carl
02/25/2021, 10:18 AM
ale
02/25/2021, 10:43 AM
The first clean is the result of a Task, while the second is not. Since load depends on clean, which is no longer a task result after the second assignment, you don’t get a dependency between impute and load.
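To illustrate the point about rebinding clean, here is a minimal toy sketch. It does not use Prefect itself: Node and toy_task are hypothetical stand-ins that assume a tracker which only records a dependency when an argument is itself a task result. Under that assumption, wrapping the result in a plain object hides it from the tracker.

```python
class Node:
    """Toy stand-in for a task result: it remembers which nodes it came from."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)


def toy_task(fn):
    """Hypothetical decorator: a call records an edge only for Node arguments."""

    def wrapper(*args):
        deps = [a for a in args if isinstance(a, Node)]
        return Node(fn.__name__, deps)

    return wrapper


class DataFrame:
    """Plain class, as in the thread: it holds task results but is not one."""

    def __init__(self, cols, colnames):
        self.cols = cols
        self.colnames = colnames


@toy_task
def impute(data):
    pass


@toy_task
def load(clean):
    pass


data_cols = Node("data_cols")
column_names = Node("column_names")

clean = impute(data_cols)               # a Node, so the tracker can see it
direct = load(clean)                    # edge impute -> load is recorded

clean = DataFrame(clean, column_names)  # rebinding: clean is now a plain object
wrapped = load(clean)                   # no Node argument, so no edge is recorded

direct_deps = [n.name for n in direct.upstream]
wrapped_deps = [n.name for n in wrapped.upstream]
print(direct_deps, wrapped_deps)
```

In this toy model the second load call has no upstream at all, which mirrors the missing impute to load dependency described above.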
Carl
02/25/2021, 10:49 AM
DataFrame is a custom Task class, so it should be picked up, yeah?
class DataFrame:
    """A utility class to provide convenient syntax for grabbing columns as a Task."""

    def __init__(self, cols: "Task", colnames: "Task"):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key: str):
        return get(self.cols, self.colnames, key)
ale
02/25/2021, 10:55 AM
Task I guess
ale
02/25/2021, 10:56 AM
Carl
02/25/2021, 11:03 AM
Amanda Wee
02/25/2021, 11:23 AM
__getitem__ method means that, say, clean["H"] is a task
Carl
02/25/2021, 11:32 AM
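The __getitem__ point above can be sketched with a small toy model. This is not Prefect's API: Node and toy_task are hypothetical stand-ins, and get here is assumed to be a task as the DataFrame class implies. The sketch only shows that indexing the wrapper returns a tracked result again, so a task that consumes clean["H"] stays linked back to impute.

```python
class Node:
    """Toy stand-in for a task result: it remembers which nodes it came from."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)


def toy_task(fn):
    """Hypothetical decorator: a call records an edge only for Node arguments."""

    def wrapper(*args):
        deps = [a for a in args if isinstance(a, Node)]
        return Node(fn.__name__, deps)

    return wrapper


@toy_task
def impute(data):
    pass


@toy_task
def get(cols, colnames, key):
    pass


@toy_task
def load(column):
    pass


class DataFrame:
    """Same shape as the class in the thread: indexing delegates to the get task."""

    def __init__(self, cols, colnames):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key):
        return get(self.cols, self.colnames, key)


clean = DataFrame(impute(Node("data_cols")), Node("column_names"))
column = clean["H"]    # result of the get task, so it is tracked
loaded = load(column)  # edge get -> load is recorded

load_deps = [n.name for n in loaded.upstream]
get_deps = [n.name for n in column.upstream]
print(load_deps, get_deps)
```

In this model load depends on get, and get depends on impute, so the chain is intact whenever a column is passed instead of the wrapper object.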