Carl
02/25/2021, 10:15 AM
upstream_task will fix this because the load(clean) task needs to run after the DataFrame() bit.
clean = impute.map(data_cols, replacement_dict=unmapped({np.nan: 0}))  # task
clean = DataFrame(clean, column_names)  # function, not a task
load(clean)  # task
Carl
02/25/2021, 10:18 AM
ale
02/25/2021, 10:43 AM
The first clean is the result of a Task, while the second is not.
load depends on clean, which is no longer a task result after the second assignment; that is why you don't get a dependency between impute and load.
Carl
02/25/2021, 10:49 AM
DataFrame is a custom Task class, so it should be picked up, yeah?
class DataFrame:
    """A utility class to provide convenient syntax for grabbing columns as a Task."""

    def __init__(self, cols: "Task", colnames: "Task"):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key: str):
        return get(self.cols, self.colnames, key)
ale
02/25/2021, 10:55 AM
Task I guess
ale
02/25/2021, 10:56 AM
Carl
02/25/2021, 11:03 AM
Amanda Wee
02/25/2021, 11:23 AM
__getitem__ method means that, say, clean["H"] is a task
Carl
02/25/2021, 11:32 AM
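The point made in the thread (a plain wrapper object hides the tasks it holds, while indexing it yields a task again) can be sketched with a toy dependency tracker. This is only an illustration in plain Python, not Prefect's actual implementation: Node, call, Wrapper and ancestors are hypothetical names invented for the sketch, and the edge-detection rule is an assumption that mirrors the behaviour described above.

```python
class Node:
    """Hypothetical stand-in for a task result in a graph (not Prefect's Task)."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)


def call(name, *args):
    """Simulate calling a task: only arguments that are Nodes become edges."""
    return Node(name, [a for a in args if isinstance(a, Node)])


class Wrapper:
    """Plain helper that holds nodes, shaped like the DataFrame class above."""

    def __init__(self, cols, colnames):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key):
        # Indexing creates a new node that depends on the wrapped nodes.
        return call(f"get[{key}]", self.cols, self.colnames, key)


def ancestors(node):
    """Return the names of all nodes reachable through upstream edges."""
    seen = set()
    stack = list(node.upstream)
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.add(current.name)
            stack.extend(current.upstream)
    return seen


data_cols = Node("data_cols")
column_names = Node("column_names")
impute = call("impute", data_cols)

clean = Wrapper(impute, column_names)   # plain object, not a Node

load_whole = call("load", clean)        # wrapper is opaque: no edge recorded
load_column = call("load", clean["H"])  # indexing returns a Node: edges kept

print(sorted(ancestors(load_whole)))
print(sorted(ancestors(load_column)))
```

In this toy model, load_whole ends up with no upstream nodes at all, whereas load_column can be traced back to impute through the node created by __getitem__.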