I'm struggling to get local+docker 'as script' flo...
# ask-community
w
I'm struggling to get local+docker 'as script' flows to work, and I can't find a working value for 'script path'
prefect-job-0435a2e4-97dqj › flow
prefect-job-0435a2e4-97dqj flow No module named 'addemart/src/addemart/__main__'
prefect-job-0435a2e4-97dqj flow Traceback (most recent call last):
prefect-job-0435a2e4-97dqj flow   File "/usr/local/bin/prefect", line 8, in <module>
prefect-job-0435a2e4-97dqj flow     sys.exit(cli())
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
prefect-job-0435a2e4-97dqj flow     return self.main(*args, **kwargs)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
prefect-job-0435a2e4-97dqj flow     rv = self.invoke(ctx)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
prefect-job-0435a2e4-97dqj flow     return _process_result(sub_ctx.command.invoke(sub_ctx))
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
prefect-job-0435a2e4-97dqj flow     return _process_result(sub_ctx.command.invoke(sub_ctx))
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
prefect-job-0435a2e4-97dqj flow     return ctx.invoke(self.callback, **ctx.params)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
prefect-job-0435a2e4-97dqj flow     return callback(*args, **kwargs)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/prefect/cli/execute.py", line 96, in flow_run
prefect-job-0435a2e4-97dqj flow     raise exc
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/prefect/cli/execute.py", line 73, in flow_run
prefect-job-0435a2e4-97dqj flow     flow = storage.get_flow(flow_data.name)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/prefect/storage/local.py", line 103, in get_flow
prefect-job-0435a2e4-97dqj flow     module_str=flow_location, flow_name=flow_name
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/site-packages/prefect/utilities/storage.py", line 126, in extract_flow_from_module
prefect-job-0435a2e4-97dqj flow     module = importlib.import_module(mod_name)
prefect-job-0435a2e4-97dqj flow   File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
prefect-job-0435a2e4-97dqj flow     return _bootstrap._gcd_import(name[level:], package, level)
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
prefect-job-0435a2e4-97dqj flow   File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
prefect-job-0435a2e4-97dqj flow ModuleNotFoundError: No module named 'addemart/src/addemart/__main__'
This results from:
return Local(
    add_default_labels=False,
    stored_as_script=True,
    path="addemart/src/addemart/__main__.py",
)
If I change it to the absolute path to the installed binstub, I get:
ModuleNotFoundError: No module named '/usr/local/bin/addemart'
return Local(
    add_default_labels=False,
    stored_as_script=True,
    path="/usr/local/bin/addemart",
)
k
What is the abspath of the file in the Docker container?
w
/usr/local/bin/addemart
though it gets installed from a directory under
/opt
That second path is versioned, like
/opt/addemart-0.2.0.1.dev67-g39cbba83/
But it's like it is treating
path=
as a python module name?
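That reading matches the traceback above: `get_flow` passes the `path` string as `module_str` to `extract_flow_from_module`, which calls `importlib.import_module` on it. Here is a minimal sketch of what `importlib` does with a path-like string. It uses only the standard library, not Prefect, and the helper name `try_import` is made up for illustration.

```python
import importlib


def try_import(name: str) -> str:
    """Return the module name importlib reports as missing, or '' on success."""
    try:
        importlib.import_module(name)
    except ModuleNotFoundError as exc:
        # exc.name holds the name the import system could not find
        return exc.name or ""
    return ""


# No dots: the whole string is looked up as a single top-level module name.
print(try_import("/usr/local/bin/addemart"))

# The trailing ".py" is read as a submodule called "py", so the import system
# first tries to import the parent, which is everything before that last dot.
print(try_import("addemart/src/addemart/__main__.py"))
```

Both calls report the same names as the two `ModuleNotFoundError` messages in this thread.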
k
This is still with KubernetesRun, right? I'm wondering whether any of these paths would work with a DockerRun: I'm sure Local storage + DockerRun works, but I'm less sure about Local storage + KubernetesRun, though I can't see why it wouldn't work if the file is in the container. The error names the path as a module because Prefect tries to import the flow from that location. You normally don't see this log unless finding the flow failed.
I think I've shown you my repo with Local storage + DockerRun, right? Just making sure.
Not sure what you mean by the second path being versioned. Are you copying a wheel in and then installing it?
w
Yeah, I saw that repo and was just studying it again to see if it made me understand this better
Versioned in the sense that here's a snippet of the Dockerfile for it:
WORKDIR /opt/addemart-${ADDEMART_VERSION}
COPY .  /opt/addemart-${ADDEMART_VERSION}/
RUN ADDEMART_VERSION=${ADDEMART_VERSION} pip install --use-feature=in-tree-build .
DockerRun will create another container out of my working directory, right? I'm trying to avoid that because the image already has a full copy
k
Yes, it will. Maybe check your wheel file, just to see if everything is getting copied as expected?
w
Am I always building a wheel when I pip install a git clone? Hmm.
Aah, found it
Looks like it has everything it needs; all the Prefect stuff happens in
__main__.py
Archive:  /root/.cache/pip/wheels/21/b0/e2/4c506faf687e17077ccb14e9eac1fcfa7ea1a2f2bbda856e3f/addemart-0.2.0.1.dev67_g39cbba83-py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
       90  2021-05-19 20:52   addemart/__init__.py
    12012  2021-08-31 03:15   addemart/__main__.py
     8006  2021-08-30 19:55   addemart/cli.py
      737  2021-08-30 19:55   addemart/constants.py
    11466  2021-08-30 19:55   addemart/dateint.py
     6019  2021-08-31 02:54   addemart/db.py
    26635  2021-08-31 02:54   addemart/numpylib.py
      542  2021-08-30 19:52   addemart/numpytypes.py
        0  2021-05-19 20:52   addemart/py.typed
    12422  2021-08-31 02:54   addemart/util.py
       15  2021-05-19 20:52   addemart/dag/__init__.py
    26032  2021-08-30 19:55   addemart/dag/graph.py
      974  2021-05-19 20:52   addemart/dag/log.py
    22771  2021-08-30 19:55   addemart/dag/parse.py
        0  2021-06-17 16:30   addemart/transformlib/__init__.py
     2510  2021-08-30 19:55   addemart/transformlib/holding_interval.py
    42915  2021-08-30 19:55   addemart/transformlib/valuation.py
      580  2021-08-31 16:07   addemart-0.2.0.1.dev67_g39cbba83.dist-info/METADATA
       92  2021-08-31 16:07   addemart-0.2.0.1.dev67_g39cbba83.dist-info/WHEEL
       53  2021-08-31 16:07   addemart-0.2.0.1.dev67_g39cbba83.dist-info/entry_points.txt
        9  2021-08-31 16:07   addemart-0.2.0.1.dev67_g39cbba83.dist-info/top_level.txt
     1826  2021-08-31 16:07   addemart-0.2.0.1.dev67_g39cbba83.dist-info/RECORD
---------                     -------
   175706                     22 files
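The same check can be scripted. Below is a rough sketch using the standard library's `zipfile`, since a wheel is a zip archive. The in-memory archive is only a stand-in for the real `.whl`, and the helper name `wheel_contains` is made up.

```python
import io
import zipfile


def wheel_contains(wheel, member: str) -> bool:
    """Return True if the wheel (a path or file-like object) contains `member`."""
    with zipfile.ZipFile(wheel) as zf:
        return member in zf.namelist()


# Build a tiny stand-in archive instead of opening the real wheel.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("addemart/__init__.py", "")
    zf.writestr("addemart/__main__.py", "")

print(wheel_contains(buf, "addemart/__main__.py"))
```

Against the real file, the first argument would be the path to the `.whl` shown in the listing.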
Everything works fine when I just
docker run
it
This doesn't happen during the
KubernetesRun
phase, does it? Because I'm back to using
prefecthq/prefect
for that
k
I assumed you tried the full thing like
"/usr/local/bin/addemart/src/addemart/__main__.py"
?
w
That's not an actual path that exists though
/usr/local/bin/addemart
is the stub that
pip install
created
k
I see. What would the actual abspath to the file be in the container? Or is it inside that wheel?
w
(I get a different error if I try to configure
KubernetesRun
with my image.. We could try to debug that instead I guess)
# cat /usr/local/bin/addemart 
#!/usr/local/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'addemart===0.2.0.1.dev67-g39cbba83','console_scripts','addemart'
import re
import sys

# for compatibility with easy_install; see #2198
__requires__ = 'addemart===0.2.0.1.dev67-g39cbba83'

try:
    from importlib.metadata import distribution
except ImportError:
    try:
        from importlib_metadata import distribution
    except ImportError:
        from pkg_resources import load_entry_point


def importlib_load_entry_point(spec, group, name):
    dist_name, _, _ = spec.partition('==')
    matches = (
        entry_point
        for entry_point in distribution(dist_name).entry_points
        if entry_point.group == group and entry_point.name == name
    )
    return next(matches).load()


globals().setdefault('load_entry_point', importlib_load_entry_point)


if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(load_entry_point('addemart===0.2.0.1.dev67-g39cbba83', 'console_scripts', 'addemart')())
So I guess the way it works is that the lib gets installed to the Python library path
and then the script loads it in a versioned way from there
So
/usr/local/lib/python3.9/site-packages/addemart
is where the real stuff is installed
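One way to confirm that location without reading the stub is to ask the import system where the package lives. A small sketch, assuming the package is importable in the environment being checked. `unittest` stands in for `addemart` here because it ships with Python and also has a `__main__.py`, and `package_main_path` is a hypothetical helper.

```python
import importlib.util
from pathlib import Path


def package_main_path(package: str) -> Path:
    """Return the path where <package>/__main__.py would live, without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package!r} is not an importable package")
    # For a regular package, the first search location is the package directory.
    package_dir = Path(next(iter(spec.submodule_search_locations)))
    return package_dir / "__main__.py"


# "unittest" is a stand-in; in the container this would be "addemart".
print(package_main_path("unittest"))
```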
Oh interesting.. I tried path= that +
__main__.py
And now I get
No module named '/usr/local/lib/python3'
prefect-job-c786e787-64j4d flow No module named '/usr/local/lib/python3'
prefect-job-c786e787-64j4d flow Traceback (most recent call last):
prefect-job-c786e787-64j4d flow   File "/usr/local/bin/prefect", line 8, in <module>
prefect-job-c786e787-64j4d flow     sys.exit(cli())
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
prefect-job-c786e787-64j4d flow     return self.main(*args, **kwargs)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
prefect-job-c786e787-64j4d flow     rv = self.invoke(ctx)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
prefect-job-c786e787-64j4d flow     return _process_result(sub_ctx.command.invoke(sub_ctx))
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
prefect-job-c786e787-64j4d flow     return _process_result(sub_ctx.command.invoke(sub_ctx))
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
prefect-job-c786e787-64j4d flow     return ctx.invoke(self.callback, **ctx.params)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
prefect-job-c786e787-64j4d flow     return callback(*args, **kwargs)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/prefect/cli/execute.py", line 96, in flow_run
prefect-job-c786e787-64j4d flow     raise exc
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/prefect/cli/execute.py", line 73, in flow_run
prefect-job-c786e787-64j4d flow     flow = storage.get_flow(flow_data.name)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/prefect/storage/local.py", line 103, in get_flow
prefect-job-c786e787-64j4d flow     module_str=flow_location, flow_name=flow_name
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/site-packages/prefect/utilities/storage.py", line 126, in extract_flow_from_module
prefect-job-c786e787-64j4d flow     module = importlib.import_module(mod_name)
prefect-job-c786e787-64j4d flow   File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
prefect-job-c786e787-64j4d flow     return _bootstrap._gcd_import(name[level:], package, level)
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
prefect-job-c786e787-64j4d flow   File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
prefect-job-c786e787-64j4d flow ModuleNotFoundError: No module named '/usr/local/lib/python3'
k
That looks like it's getting cut at the
.
because Python treats whatever comes after it as a submodule, right?
I am wondering if you can rip out the
__main__.py
and place it somewhere Prefect can reach it easily and point local storage to that? So that it’s not buried inside the python packages?
w
Let me try specifying the path to the source explicitly instead of using the installed version
That gives me
ModuleNotFoundError: No module named '/opt/addemart-0'
k
Oh then it’s the version cutting it off. Your version is probably like 0.1 or 0.2. The
.
is being misinterpreted by python
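A minimal sketch of why the reported name stops at the first dot: the import system imports parent packages first, and it finds each parent by cutting the name at its last `.`. The helper `outermost_parent` is hypothetical and only mirrors that splitting; it is not Prefect or `importlib` code. The first path is an assumed example built from the versioned `/opt` directory mentioned earlier, since the exact value passed to `path=` isn't shown.

```python
def outermost_parent(name: str) -> str:
    """Strip the last dotted component repeatedly, the way parent packages are
    resolved during import, and return the outermost name that gets looked up."""
    while "." in name:
        name = name.rpartition(".")[0]
    return name


# Assumed source path under the versioned /opt directory.
print(outermost_parent("/opt/addemart-0.2.0.1.dev67-g39cbba83/src/addemart/__main__.py"))

# The installed location under site-packages.
print(outermost_parent("/usr/local/lib/python3.9/site-packages/addemart/__main__.py"))
```

The two results, `/opt/addemart-0` and `/usr/local/lib/python3`, are the names in the `ModuleNotFoundError` messages above.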
w
Ok maybe I can come at it from a different angle. Is it possible that the whole time, this hasn't been running my image, but rather the prefect default one?
I'm still really hazy on which step is image-configured by KubernetesRun
Which container ends up actually needing to process this flow? The final worker, right?
k
Hard to tell, but I highly doubt it as long as you have
KubernetesRun(image=…)
, and I believe you went into the pod and looked and saw the expected files right?
w
No, the pod exits too quickly for me to get into it.. I've been `docker run`'ing the image I've got configured
So KubernetesRun needs to be my image, not just the Executor?
I'm unclear on why my app's files are needed in both places
When I use my image for the KubernetesRun config as well as the Executor, I get this error:
prefect-job-9fbfd07d-pc7gz flow usage: addemart [-h] [-V] [-v] [-q] {register-flow,run-flow,run-stage} ...
prefect-job-9fbfd07d-pc7gz flow addemart: error: argument command: invalid choice: 'prefect' (choose from 'register-flow', 'run-flow', 'run-stage')
Which indicates that it's trying to pass
prefect blah blah
as an argument to the entry point of the container
Which should work fine, because my container has:
ENTRYPOINT ["/bin/sh", "-c"]
CMD ["addemart"]
k
I think this will work if you copy
__main__.py
over to an easily accessible abspath that does not contain a
.
and point Local storage to that. Yes, I believe you don't need it on the executor because the flow is already loaded by then. Of course, you also wouldn't need to fiddle with paths if you used GitHub storage or S3 storage. Prefect just needs to be able to pull the file from somewhere. In the case of Local storage, that takes the form of an
import …
where the target is the location of your file; Prefect then pulls the flow from the file. The problem is that the paths where the file lives contain a
.
, which messes with the Python import. So importing from
usr/python3.9
gets broken up by Python's import machinery into the submodule 9 under the module
usr/python3
. It doesn't treat it as a full path. So to use Local storage, I think you can copy the main file to
usr/something_simple
and point Local storage to that, and it'll be able to pull it.
So it's not that the
app
necessarily needs to be in 2 places; it just needs to live at an abspath Python can access properly (nothing with
.
)
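A quick way to vet a candidate location before pointing `Local(path=...)` at it is to list the directory components that contain a dot. This is only a sketch of the rule of thumb above: `dotted_directories` is a made-up helper, and `/opt/flows/addemart_flow.py` is a hypothetical destination, not a path from this thread.

```python
from pathlib import PurePosixPath
from typing import List


def dotted_directories(path: str) -> List[str]:
    """Return the directory components of `path` that contain a dot.

    The file name itself is ignored, so a trailing ".py" does not count.
    """
    return [part for part in PurePosixPath(path).parent.parts if "." in part]


print(dotted_directories("/usr/local/lib/python3.9/site-packages/addemart/__main__.py"))
print(dotted_directories("/opt/addemart-0.2.0.1.dev67-g39cbba83/src/addemart/__main__.py"))
print(dotted_directories("/opt/flows/addemart_flow.py"))  # hypothetical dot-free target
```

This only checks the dotted-directory theory. It is also worth confirming that the file exists in the container that actually runs the job, since the tracebacks show `python3.7` paths while the package was found under `python3.9`.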
Did see that error from earlier today. Let me follow up on that.