# ask-community
d
Hi all, does anyone have experience importing and calling scrapy API scripts in Prefect? I am running into a
KeyError: 'Spider not found: 'scraper'
when I try to debug my program. However, when I manually define the function that the task calls, it works as intended.
n
Hi @Dominic Pham - can you give a little more info about how you’re attempting to run your flow? For example, are you running this in docker?
d
Hi @nicholas, I am running the flow locally via pressing F5 in a .py file in VSCode. My current project structure looks like this:
-Project
  |--outputs
  |--scraper
     |--spiders
  |--flow.py
  |--scrapy_api_script.py
  |--scrapy.cfg
Here is a snippet of the flow.py code that I have written
from scraper_api import instantiate_web_scraper
@task
def scrapy_api_call(job_list:list):
    # Call the scrapy API file for the scraper. Using job titles from the db output as inputs for the API call.
    for title in job_list[0]:
        return scraper_api.instantiate_web_scraper(title)
    

def prefect_flow():
    with Flow("Pipeline") as flow:
        db_pull = connection()
        run_scraper = scrapy_api_call(db_pull)
        flow.run()
        
if __name__=="__main__":
    prefect_flow()
n
Can you try moving flow.run() outside of the flow context? i.e.:
def prefect_flow():
  with Flow("Pipeline") as flow:
    # .. 

  flow.run()
d
Still encountering issues. The error seems to occur in the API script:
KeyError 'Spider not found: scraper'
File "C:\Users\x\x\Documents\Prefect Projects\Prefect-Projects\scraper\scraper_api.py", line 6, in instantiate_web_scraper
    process.crawl("indeed_scraper", title=job_title)
  File "C:\Users\x\x\Documents\Prefect Projects\Prefect-Projects\Cscraper\01_instantiate_scraper.py", line 52, in scrapy_api_call (Current frame)
    return instantiate_web_scraper(title)
  File "C:\Users\x\x\Documents\Prefect Projects\Prefect-Projects\scraper\01_instantiate_scraper.py", line 60, in prefect_flow
    flow.run()
  File "C:\Users\x\x\Documents\Prefect Projects\Prefect-Projects\scraper\01_instantiate_scraper.py", line 63, in <module>
    prefect_flow()
n
Hm, you’ve got an import issue somewhere; one thing I noticed is that you’re not importing scraper_api, you’re importing instantiate_web_scraper but still calling scraper_api.instantiate_web_scraper
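A minimal sketch of the mismatch described here, in case it helps to see it in isolation. It assumes a stand-in scraper_api module built in memory (its contents are hypothetical, not the real script), so it runs without the project files. The point is only that a from-import binds the function name, not the module name, so the module-qualified call would raise a NameError rather than the KeyError reported above.

```python
import sys
import types

# Stand-in for the real scraper_api script (hypothetical contents),
# registered in sys.modules so the imports below can find it.
fake_module = types.ModuleType("scraper_api")


def instantiate_web_scraper(title):
    # Placeholder body; the real function starts a Scrapy crawl.
    return f"crawling for {title}"


fake_module.instantiate_web_scraper = instantiate_web_scraper
sys.modules["scraper_api"] = fake_module


def call_with_from_import():
    # Binds only the function name in this scope.
    from scraper_api import instantiate_web_scraper
    return instantiate_web_scraper("data engineer")


def call_with_module_import():
    # Binds the module name, so the qualified call works.
    import scraper_api
    return scraper_api.instantiate_web_scraper("data engineer")


def call_with_mismatch():
    # Same combination as in the flow.py snippet: from-import,
    # then a module-qualified call. The name scraper_api was never
    # bound in this scope, so this raises NameError.
    from scraper_api import instantiate_web_scraper  # noqa: F401
    return scraper_api.instantiate_web_scraper("data engineer")
```

Either of the first two forms is fine as long as the import and the call agree. Since the traceback shows the task calling instantiate_web_scraper(title) directly, the code that actually ran may already differ from the pasted snippet.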
d
Ah yes, I've tried importing just scraper_api, as it only has one function (it is a script that I wrote), but during execution VSCode will open the script file and direct me to the line where the error is occurring.
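One possible cause that this thread does not confirm: when process.crawl is given a spider name as a string, Scrapy looks the name up through the project settings, and get_project_settings locates the project by searching for scrapy.cfg from the current working directory upward. If the debugger starts in a different directory than a manual run does, that lookup can fail with "Spider not found". The sketch below is a small stdlib-only diagnostic that mimics that upward search; find_scrapy_cfg is a hypothetical helper written for illustration, not part of Scrapy.

```python
import os
from pathlib import Path


def find_scrapy_cfg(start="."):
    """Return the nearest scrapy.cfg at or above `start`, or None.

    Hypothetical helper that imitates how Scrapy searches upward
    from the working directory for the project config file.
    """
    current = Path(start).resolve()
    for folder in (current, *current.parents):
        candidate = folder / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Run this the same way the flow is run (e.g. F5 in VSCode) and
    # compare the output against a run from a terminal.
    print("working directory:", os.getcwd())
    print("scrapy.cfg found at:", find_scrapy_cfg())
```

If the two runs report different working directories, or no scrapy.cfg is found under the debugger, that would point at the launch configuration rather than at the import statements.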