    Wei Mei

    4 months ago
    Hi! I have a project that started with flow.py, now I need to create another flow, is it ok that I just create the new flow (db-flow.py) in this same folder?
    Kevin Kho

    4 months ago
    I don’t think the structure of the directory necessarily matters. It would depend on your storage. What is your storage?
    Wei Mei

    4 months ago
    Github
    Kevin Kho

    4 months ago
    For GitHub, yeah, this won’t affect anything. Just note GitHub Storage can’t do imports between files, like if db-flow.py imports from flow.py.
    Wei Mei

    4 months ago
    awesome
    thanks!
    feel better.
    Kevin Kho

    4 months ago
    Oh lol thanks!
    Guilherme Petris

    4 months ago
    Is there any other way to keep the import functionality? It would be a great case for modularisation of the code.
    Kevin Kho

    4 months ago
    You would need to install it either in the execution environment, or use Docker storage to hold dependencies
    Guilherme Petris

    4 months ago
    What do you mean by installing it in the execution environment? 😁
    Does this mean I can reference a module with utility functions that I use? I wouldn’t like to maintain packages in this case because the code is tailored to each case/flow that I build.
    Kevin Kho

    4 months ago
    Like if you use a local agent, and the agent has access to the files in the PYTHONPATH, it will be able to resolve the imports. For example, the LocalRun takes in a working_dir. But if you are using something Docker based, then they need to go inside the image.
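The point about PYTHONPATH can be illustrated without Prefect at all. Below is a minimal sketch, assuming a hypothetical helper module named flow_utils.py that sits next to the flow file: an import only resolves once the directory holding the module is on the interpreter's search path, which is roughly what a local agent started from (or pointed at) the project directory would give you. The module name, the clean function, and the temporary directory are illustrative assumptions, not part of the thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_dir(module_name: str, directory: str):
    """Import module_name after making directory visible on sys.path."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical layout: a utility module stored alongside the flow script.
project_dir = tempfile.mkdtemp()
Path(project_dir, "flow_utils.py").write_text(
    "def clean(value):\n    return value.strip().lower()\n"
)

# Without project_dir on sys.path this import would raise ModuleNotFoundError.
flow_utils = import_from_dir("flow_utils", project_dir)
print(flow_utils.clean("  Zendesk "))
```

With GitHub storage only the single flow file is fetched at run time, so a sibling module like this still has to exist on the machine where the agent runs the flow.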
    Guilherme Petris

    4 months ago
    Apparently my flows don’t run when I try to use GitHub as my flow.storage. I don’t know exactly what’s missing here:
    ❯ prefect diagnostics
    {
      "config_overrides": {
        "context": {
          "secrets": false
        }
      },
      "env_vars": [],
      "system_information": {
        "platform": "macOS-12.3.1-x86_64-i386-64bit",
        "prefect_backend": "cloud",
        "prefect_version": "1.2.1",
        "python_version": "3.9.10"
      }
    }
    Code:
    with Flow("zendesk_tickets_incremental") as flow:
    
        latest_date = latest_date_unix()
        incremental = incremental_call(latest_date)
        incremental_df = create_df(incremental)
        upload = upload_to_snowflake(incremental_df)
    
    flow.storage = GitHub(
        repo='X/prefect',
        path='/zendesk/scripts/{FLOW_NAME}.py',
        access_token_secret= Secret('GITHUB_ACCESS_TOKEN').get()  # required with private repositories
    )
    flow.run_config = LocalRun()
    flow.register("zendesk_project_test")
    Kevin Kho

    4 months ago
    As in stuck in scheduled? It is most likely a label issue between Flow and agent. You can find more info here
    Guilherme Petris

    4 months ago
    I’m not using labels in either the agent or the flow
    Kevin Kho

    4 months ago
    Is it stuck in scheduled or submitted?
    Wait, the local agent has a default label. Are you sure you have no labels?
    Guilherme Petris

    4 months ago
    Uhm, I didn’t know that the label was actually mandatory to make it run. Let me try to add them here
    Kevin Kho

    4 months ago
    No, more like you need to remove the default label:
    prefect agent local start --no-hostname-label
    Guilherme Petris

    4 months ago
    Ok, that seems to work, but now apparently GitHub is throwing errors back at me even though I already set up the access
    State Message:
    Failed to load and execute flow run: UnknownObjectException(404, {'message': 'Not Found', 'documentation_url': 'https://docs.github.com/rest/reference/repos#get-a-repository'}
    Kevin Kho

    4 months ago
    Your access token secret should be the name of the secret itself. Not the value. And then it is pulled during runtime
    Guilherme Petris

    4 months ago
    Yeah, I noticed that and corrected the script as follows:
    access_token_secret= 'GITHUB_ACCESS_TOKEN'
    but that is the error I’m getting thrown back
    Kevin Kho

    4 months ago
    I wouldn’t know then. It looks right. I have a public example here and the path and repo look right to me. Only suggestion I have is maybe add the branch to be explicit. And then the other question is if you host your own Github?
    But yes that error is really not finding the Flow. It also might be faster for you to debug if you do:
    flow.storage = GitHub(
        repo='X/prefect',
        path='/zendesk/scripts/{FLOW_NAME}.py',
        access_token_secret='GITHUB_ACCESS_TOKEN'  # name of the secret, not its value; required with private repositories
    )
    flow.storage.add_flow(flow)
    flow.storage.get_flow(flow.name)  # pass the flow name so storage knows which flow to load
    to see if you can pull. You can avoid re-registration this way and test a lot faster I think
    Guilherme Petris

    4 months ago
    Yeah, now a simpler version of the error:
    github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/reference/repos#get-repository-content"}
    It’s a private repo inside an organisation account
    Kevin Kho

    4 months ago
    Yeah, this is really not finding it, but I don’t think there’s a lot more I can do to debug that. If you are using the normal github.com, this looks good from what I can tell without peeking in
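One way to narrow down a 404 like this outside of Prefect is to ask the GitHub REST API for the same file directly. The sketch below only builds the request for the "get repository content" endpoint named in the error message; it does not send it. The helper name contents_request, the file name my_flow.py, the branch main, and the placeholder token are all assumptions for illustration, since the thread does not show the real values.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request


def contents_request(repo: str, path: str, token: str,
                     ref: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for one file in a repository."""
    # The contents endpoint expects a repo-relative path without a leading slash.
    clean_path = quote(path.lstrip("/"))
    url = f"https://api.github.com/repos/{repo}/contents/{clean_path}"
    if ref:
        # Pin the branch explicitly instead of relying on the default branch.
        url += "?" + urlencode({"ref": ref})
    return Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })


# Hypothetical values: file name, branch, and token are placeholders.
req = contents_request("X/prefect", "/zendesk/scripts/my_flow.py",
                       "<token>", ref="main")
print(req.full_url)
```

Passing the request to urllib.request.urlopen with a real token would show whether GitHub itself answers 404. GitHub also returns 404 rather than 403 for a private repository the token cannot see, so a failure here would suggest checking the token's access to the organisation's repository as well as the path and branch.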