# prefect-server
k
Hello, I have reached this point in the prefect documentation and I was wondering where I should be looking to be able to host my dashboard in the cloud instead of from my local server? https://docs.prefect.io/orchestration/server/deploy-local.html#ui-configuration
👋 1
k
Hey @Kurt Rhee, as in you want to migrate to Prefect Cloud from Prefect Server? Or do you want to host the UI somewhere like an EC2 instance?
k
I currently have my Prefect dashboard hosted on localhost 80:80, is it simple to host it at cloud.prefect.io?
k
We don’t host UIs in cloud.prefect.io. You would create something like an EC2 instance, start Server there, and then the UI would be at `<VM-IP>:8080`.
k
ah gotcha
k
Just wanna clarify, Prefect Server is just the open source version of Prefect Cloud. With Cloud, we host it all for you, whereas Prefect Server would be hosted on your own infrastructure.
k
understood
Hey Kevin, I've set up a flow and I've also deployed the localhost server, but I'm not sure how to connect the dots between them
is this something you can help me with?
I tried using `prefect agent local start`, but I get a prefect.exceptions.AuthorizationError: no agent API token provided
k
Did you register the flow? And did you do `prefect backend server`?
k
Yes I did prefect backend server, this fixed the authentication problem
I'm not sure what registering the flow is
ah i found the documentation
k
There’s a bunch so just wanna make sure you have the right one
k
awesome thank you!
Kevin you are the most helpful person
k
No problem it’s my job lol
k
welp thanks for being good at your job!
🙏 1
can you help me delete a project?
swear last question for the day
k
Haha no worries, go to Team -> Projects in the UI and you can delete the projects there
k
awesome thanks again!
have a nice rest of your day
👍 1
I was wondering do I need to register a flow each time I edit it?
k
For the most part yes, unless you use a script-based storage like S3 or GitHub, where there is no serialization. With those, if you don't edit the DAG significantly, it will work without re-registering, but if you add a new task or change a task name, then you need to re-register.
k
awesome thank you!
Have you seen this error before?
My flow seems to work fine in python console
but the UI seems to think that it failed
k
Yeah what storage do you use? Do you have one defined?
Or is botany something you import?
k
botany is the top level directory that I am using for my flows
I'm using parquet to store data from the flows, but that is it
k
Are you running this on a different machine than where you registered from?
k
nope same machine
same virtual environment too
k
I see, so when you don’t choose the `Storage` class for Prefect, it automatically serializes your file and puts it in the `.prefect` folder. So something with the way you are registering is ruining that path. I think it might be a bit weird that your code is in the `__init__.py`, not sure why you would do that. What is happening is that the path to the flow in the `.prefect` folder is not working well for you. So here are a couple of things you can do:
1. Use `Local` storage explicitly. There are two arguments you can use. First is the `directory`, so you can control where the pickle is saved. Second, you can point it to a script by using `stored_as_script=True` and then supplying a `path_to_file`, which points to the Python file.
2. After this, you can also use the `LocalRun` to specify a `working_dir`, so that if your flow has other imports in the same directory, you can choose the directory to start the flow from.
How did you register this? Using the `flow.register` in the `__init__.py`?
I think you can try not doing it in an init file and that might work for you?
Local storage docs and local run docs
k
Interesting, yes I registered the path here:
k
That is not the path. That is the project name. The default `Local()` storage will get this, serialize it, and then save it under your `.prefect` folder. When you run this, can you find the file under `flows` under `.prefect` in the home directory?
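A minimal sketch of why a pickled flow can fail to load later, assuming the flow references code from a separate module. It uses the standard library `pickle` rather than Prefect itself (Prefect's own serialization may differ in detail), and `helper_demo` is a hypothetical stand-in for project code such as `botany`. The point is that an importable function is pickled by reference, so the process that loads the pickle must be able to import the same module.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical helper module standing in for project code such as `botany`.
root = tempfile.mkdtemp()
with open(os.path.join(root, "helper_demo.py"), "w") as f:
    f.write("def transform(x):\n    return x * 2\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
import helper_demo

# An importable function is pickled by reference (module name + function name),
# so the function body itself is not stored in the payload.
payload = pickle.dumps(helper_demo.transform)

# Simulate a process started from somewhere the module cannot be found.
sys.path.remove(root)
del sys.modules["helper_demo"]
importlib.invalidate_caches()

try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = exc

print(type(load_error).__name__)
```

If the loading process can import the module (because of its working directory, its path settings, or an installed package), the same `pickle.loads` call succeeds.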
k
I tried moving outside of init and got the same issue
I can't find my .prefect folder
k
It’s a hidden folder, so maybe it’s just not showing in the UI? How do you register? Do you do `python /path/to/file/__init__.py`, or are you in the same folder as the init and it's just `python __init__.py`?
k
Tried adding this on top of flow.register and it also failed
flow.run_config = LocalRun()
looking for hidden folders now
i looked inside of my project directory and virtual environment
should i be looking someplace else?
k
`.prefect` is in the home directory of your machine
k
as for register, it is in the same file as the flow, just under an `if __name__ == '__main__'` statement
I can see the flows inside of .prefect/flows
k
One sec will run some tests
k
okay ty
k
Can you post your whole script?
k
import os
import json
import warnings
import pendulum

from datetime import datetime, timedelta

from prefect import Flow, Parameter
from prefect.schedules import IntervalSchedule


class ValentineEtl:
    """
    Valentine ETL Class
    """

    def __init__(self):

        # --- original data period ---
        self.start = '2019-09-11 00:00'
        self.end = datetime.now().strftime('%Y-%m-%d %H:%M')

        # --- get localized time ---
        self.site_code = 'VLT1'
        dirname = os.path.dirname(__file__)
        filename = os.path.join(dirname, fr'../intake/config/{self.site_code}.json')
        try:
            self.config = json.load(open(filename))
        except FileNotFoundError:
            self.config = warnings.warn(
                f"Could not find {filename} "
                "Please utilize botany.intake.geocode "
                "to create a config.json file for this project")



    # --- ghi ---
    from botany.etl.generic_ghi import (
        extract_ghi, extract_ghi_tilt,
        transform_ghi,
        load_ghi, load_ghi_tilt
        )


# --- Schedule ---
schedule = IntervalSchedule(
    start_date=datetime.utcnow() + timedelta(seconds=1),
    interval=timedelta(days=1)
    )

# --- Flow ---
with Flow(
        'Valentine_GHI',
        schedule=schedule) as flow:

    # --- Configurations ---
    dag = ValentineEtl()

    # --- Extracts ---
    # tags
    ghi_tags = [
        'VLT1SMET001_Pyranometer_1_SR30_Irradiance', 'VLT1SMET002_Pyranometer_1_SR30_Irradiance',
        'VLT1SMET003_Pyranometer_1_SR30_Irradiance', 'VLT1SMET004_Pyranometer_1_SR30_Irradiance']
    ghi_tilt_tags = [
        'VLT1SMET001_MVTiltAngle', 'VLT1SMET002_MVTiltAngle',
        'VLT1SMET003_MVTiltAngle', 'VLT1SMET004_MVTiltAngle']

    # extractions
    ghi = dag.extract_ghi(dag, ghi_tags)
    ghi_tilt = dag.extract_ghi_tilt(dag, ghi_tilt_tags)

    # --- Transforms ---
    ghi = dag.transform_ghi(dag, ghi, ghi_tilt)

    # --- Loads ---
    dag.load_ghi(dag, ghi)
    dag.load_ghi_tilt(dag, ghi_tilt)


if __name__ == '__main__':
    flow.register(project_name="APPA")
    flow.run()
    flow.visualize()
    pass
k
This is your error I think:
# --- ghi ---
    from botany.etl.generic_ghi import (
        extract_ghi, extract_ghi_tilt,
        transform_ghi,
        load_ghi, load_ghi_tilt
        )
It doesn’t know where to import these from. I suggest you try `LocalRun(working_dir="dir_above_botany")` so that these imports can work. Is botany a Python module or just a collection of scripts?
k
oo nice i'll give that a shot
i actually don't know the difference between a module and a collection of scripts
I think it is a module, there are basically a lot of different functions / classes in there that we use for work
k
A module is actually like `pip install`ed, so that it can be used in other projects on the same machine that are in different directories
k
ah gotcha
then no it is not a module
k
If you start your local agent in the directory above `botany`, that import will work also. It’s just where the Python process starts, or you could install it as a module to make it available no matter where you run the script. The `pip install` adds the project to the path so it can be imported
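A small sketch of the import behaviour described above, using only the standard library. The package name `botany_demo` and its contents are hypothetical; the temporary directory plays the role of "the directory above `botany`". The import only succeeds once that parent directory is on Python's search path, which is roughly what starting the process there or installing the package achieves.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway project: <root>/botany_demo/etl.py (hypothetical names).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "botany_demo")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "etl.py"), "w") as f:
    f.write("def extract():\n    return 'extracted'\n")


def can_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


# The directory above the package is not on sys.path yet, so the import fails.
before = can_import("botany_demo.etl")

# Put the directory above the package on the search path and try again.
sys.path.insert(0, root)
after = can_import("botany_demo.etl")

print(before, after)
```

The same check can be run from inside a flow's process to see whether `botany` is importable there.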
k
if __name__ == '__main__':
    flow.run_config = LocalRun(working_dir=r'../..')
    flow.register(project_name="APPA")
    flow.run()
    flow.visualize()
    pass
Tried this one, it didn't work
k
I think that's because the "../.." is applied relative to where the agent is running from, not where the flow is registered
k
ohh
if __name__ == '__main__':
    dirname = os.path.dirname(__file__)
    filename = os.path.join(dirname, r'../..')
    flow.run_config = LocalRun(working_dir=filename)
    flow.register(project_name="APPA")
    flow.run()
    flow.visualize()
tried this one that failed too
same module not found error
k
Actually I was wrong. It should be evaluated and then saved, so it’s relative to the file path. What does `os.path.dirname(__file__)` give you though when you print it? It gives me nothing
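A short sketch of why `os.path.dirname(__file__)` can print nothing, and one way to make the result independent of the working directory. The helper name `project_root` and the example path are hypothetical. As far as I know, on Python versions before 3.9 a script launched as `python flow.py` sees a relative `__file__`, which is what the bare file name below imitates.

```python
import os


def project_root(file_path, levels_up=2):
    """Absolute path `levels_up` directories above the one holding file_path."""
    here = os.path.dirname(os.path.abspath(file_path))
    ups = [os.pardir] * levels_up
    return os.path.abspath(os.path.join(here, *ups))


# A bare file name has no directory part, so dirname returns an empty string.
print(repr(os.path.dirname("flow.py")))

# Joining onto that empty string leaves a path that still depends on the
# working directory of whichever process uses it later.
print(os.path.join(os.path.dirname("flow.py"), "../.."))

# Resolving to an absolute path first removes that dependency.
root = project_root(os.path.join(os.sep, "srv", "proj", "botany", "etl", "flow.py"))
print(root)
```

Passing the absolute result to something like `working_dir` avoids any ambiguity about which process the relative path is resolved in.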
k
os.path.dirname gives me the path to the file
I was able to change the location where i start the agent to the top directory
and now i am getting this error, which seems like progress
k
That is progress. I can’t see the error immediately. Does Flow.run() work for you?
I’ll be out of office today, but if you leave messages I can get back to you when I’m on
k
Nice sounds good man, thanks so much for the help so far
flow.run does work within the console
k
Ah if you can post your traceback here I can look
k
Flow URL: <http://localhost:8080/default/flow/cea14a36-26b1-4058-85bc-d9b792aafb08>
 └── ID: f88652b7-44c5-4487-b66c-a4f16826c1d8
 └── Project: APPA
 └── Labels: ['sdhqragsolpy02']
[2021-08-06 08:19:06-0700] INFO - prefect.Pendleton_GHI | Waiting for next scheduled run at 2021-08-06T15:19:07.129768+00:00
[2021-08-06 08:19:07-0700] INFO - prefect.FlowRunner | Beginning Flow run for 'Pendleton_GHI'
[2021-08-06 08:19:07-0700] INFO - prefect.TaskRunner | Task 'extract_ghi': Starting task run...
Current Tag:  PND1SMET001_GHI
Current Tag:  PND1SMET002_GHI
[2021-08-06 08:19:08-0700] INFO - prefect.TaskRunner | Task 'extract_ghi': Finished task run for task with final state: 'Success'
[2021-08-06 08:19:09-0700] INFO - prefect.TaskRunner | Task 'extract_ghi_tilt': Starting task run...
Current Tag:  PND1SMET001_GHI_TILT
Current Tag:  PND1SMET002_GHI_TILT
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'extract_ghi_tilt': Finished task run for task with final state: 'Success'
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'load_ghi_tilt': Starting task run...
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'load_ghi_tilt': Finished task run for task with final state: 'Success'
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'transform_ghi': Starting task run...
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'transform_ghi': Finished task run for task with final state: 'Success'
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'load_ghi': Starting task run...
[2021-08-06 08:19:10-0700] INFO - prefect.TaskRunner | Task 'load_ghi': Finished task run for task with final state: 'Success'
[2021-08-06 08:19:10-0700] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded
[2021-08-06 08:19:10-0700] INFO - prefect.Pendleton_GHI | Waiting for next scheduled run at 2021-08-07T15:19:07.129768+00:00
k
That looks like a success right? Thought you had the error with the write
k
yes it works in console, fails in ui
Here is the UI failed message
k
Oh I think I know. Can you try removing `flow.run()` when you register? You might be running into issues with that. When you register, try having only the registration as the last line of your code.
k
Same error
actually i think i may have got it
amazing
thanks for all of your help Kevin, I was just dumb and was using self as an argument to pass into a function
k
then how did it work with flow.run?
k
no idea