Rasmus Lindqvist
08/02/2022, 7:43 AM
scrapy. I have configured scrapy to log to stdout, and configured my Prefect task to log_stdout=True. Should not Prefect then pick up the logs from scrapy, or am I missing something?

Rob Freedy
08/02/2022, 2:59 PM

Rasmus Lindqvist
08/02/2022, 3:14 PM
@task(
    name=<TASK_NAME>,
    log_stdout=True,
    target=get_prefect_location(
        flow_name="{flow_name}", pipeline_name="{task_name}", interval="daily"
    )
    + "/output.csv",
)
def fetch(output_dir: str) -> pd.DataFrame:

Rasmus Lindqvist
08/02/2022, 3:15 PM
print statements and they are visible in the logs in the UI

Rob Freedy
08/02/2022, 5:19 PM

Kevin Grismore
08/02/2022, 5:45 PM

Kevin Grismore
08/02/2022, 5:46 PM

Kevin Grismore
08/02/2022, 5:48 PM
from contextlib import redirect_stderr
...
@task(name='run-scrapy')
def run_scrapy(spider: Dict):
    logger = get_run_logger()
    def write(msg: str):
        DEBUG = ' DEBUG: '
        INFO = ' INFO: '
        WARNING = ' WARNING: '
        ERROR = ' ERROR: '
        if msg != '\n':
            if DEBUG in msg:
                logger.debug(msg.split(DEBUG)[1])
            elif INFO in msg:
                logger.info(msg.split(INFO)[1])
            elif WARNING in msg:
                logger.warning(msg.split(WARNING)[1])
            elif ERROR in msg:
                logger.error(msg.split(ERROR)[1])
        else:
            return None
    # redirect_stderr only needs an object with a write() method, so attach one to the logger
    logger.write = write
    with redirect_stderr(logger):
        configure_logging({'LOG_FORMAT': '%(levelname)s: %(message)s'})
        process = CrawlerRunner(settings=spider['settings'])
        d = process.crawl(spider['name'])
        d.addBoth(lambda _: reactor.stop())
        reactor.run()

Kevin Grismore
08/02/2022, 5:50 PM
class MySpider(scrapy.Spider):
    name = 'demo'
    start_urls = ['https://some.website']
    def __init__(self, *args, **kwargs):
        logger = logging.getLogger('scrapy')
        logger.setLevel(logging.INFO)
        super().__init__(*args, **kwargs)

Rasmus Lindqvist
08/03/2022, 7:41 AM

Dominic Pham
08/12/2022, 10:23 PM

Rasmus Lindqvist
08/15/2022, 7:14 AM