# ask-community
c
Thanks for all the help so far. Got another question! 😄 Hopefully I can explain this okay … I’m using the feature_engineering.py example as a template. In the flow, there are a number of functions (not tasks) and a class is instantiated too. The last call is a task, but it’s not being attached after the class instantiation. I’m expecting the following three to run in order. Not sure that `upstream_task` will fix this, because the `load(clean)` task needs to run after the `DataFrame()` bit.
clean = impute.map(data_cols, replacement_dict=unmapped({np.nan: 0})) # task
clean = DataFrame(clean, column_names) # function, not a task
load(clean) # task
Here’s what the flow looks like, and the red shows how I want it to work :)
a
I think the problem is that the first `clean` is the result of a Task, while the second is not. Since `load` depends on `clean`, which is no longer a task result after the second assignment, you don’t get a dependency between `impute` and `load`.
c
Yes, that’s correct. However, `DataFrame` is a custom Task class, so it should be picked up yeah?
class DataFrame:
    """A utility class to provide convenient syntax for grabbing columns as a Task."""

    def __init__(self, cols: "Task", colnames: "Task"):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key: str):
        return get(self.cols, self.colnames, key)
a
If you want DataFrame to be a custom Task, then you have to extend `Task`, I guess. Otherwise Prefect does not know that DataFrame is a task.
c
Hmmm, that makes sense. My question then is: how is it working in the example (line 274)?
a
`DataFrame` is not a task, but its `__getitem__` method means that, say, `clean["H"]` is a task.
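To make that concrete, here is a rough toy model in plain Python. It is not Prefect's actual code: `EDGES`, `Result`, and `make_task` are made-up names that only mimic the idea that a dependency gets recorded when a task is called with another task's result. Passing the plain `DataFrame` wrapper to `load` records nothing, while passing `frame["H"]` does, because `__getitem__` calls the `get` task.

```python
# Toy model only: these are NOT Prefect classes, just an illustration.
EDGES = set()  # recorded (upstream, downstream) pairs


class Result:
    """Stands in for what calling a task returns while a flow is being built."""

    def __init__(self, name):
        self.name = name


def make_task(name):
    """Return a callable that records an edge for every Result argument."""

    def call(*args):
        for arg in args:
            if isinstance(arg, Result):
                EDGES.add((arg.name, name))
        return Result(name)

    return call


impute = make_task("impute")
get = make_task("get")
load = make_task("load")


class DataFrame:
    """Plain wrapper like the one quoted above: not a task itself."""

    def __init__(self, cols, colnames):
        self.cols = cols
        self.colnames = colnames

    def __getitem__(self, key):
        # Calling the `get` task here is what creates the dependency.
        return get(self.cols, self.colnames, key)


clean = impute("data_cols")
frame = DataFrame(clean, "column_names")

load(frame)  # plain object: no edge from impute to load is recorded
edges_after_plain_load = set(EDGES)

load(frame["H"])  # task result: impute -> get -> load is recorded
```

So in this sketch the ordering only shows up once `load` receives something that came out of a task call, which matches what happens with `clean["H"]` in the example.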
c
@Amanda Wee - ahh that’s starting to make sense now. Thank you for pointing that out. Sorry for all my basic questions 🙂