python - Why is HttpCacheMiddleware disabled in scrapyd?
Why does HttpCacheMiddleware need scrapy.cfg, and how can I work around this issue?
I use scrapyd-deploy to build an egg and deploy the project to scrapyd. When the job runs, I see in the log output that HttpCacheMiddleware is disabled because scrapy.cfg was not found:
    2014-06-08 18:55:51-0700 [scrapy] WARNING: Disabled HttpCacheMiddleware: Unable to find scrapy.cfg file to infer project data dir
I checked the egg file and scrapy.cfg is indeed not there, because the egg only contains the files inside the project directory. I may be wrong, but I think the egg is built correctly:
    foo/
    |- project/
    |  |- __init__.py
    |  |- settings.py
    |  |- spiders/
    |  |- ...
    |- scrapy.cfg
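To double-check, a quick listing of the egg's contents confirms that scrapy.cfg was not packaged. This is just a minimal sketch; the egg filename is a placeholder for whatever scrapyd-deploy produced:

    # Minimal sketch: list what actually got packaged into the egg.
    # The egg filename is hypothetical; substitute the one scrapyd-deploy built.
    import zipfile

    with zipfile.ZipFile('project-1.0-py2.7.egg') as egg:
        for name in egg.namelist():
            print(name)   # scrapy.cfg does not show up in this listing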
Digging into the code some more, I think one of the three if-conditions is somehow failing in MiddlewareManager:
    try:
        mwcls = load_object(clspath)
        if crawler and hasattr(mwcls, 'from_crawler'):
            mw = mwcls.from_crawler(crawler)
        elif hasattr(mwcls, 'from_settings'):
            mw = mwcls.from_settings(settings)
        else:
            mw = mwcls()
        middlewares.append(mw)
    except NotConfigured, e:
        if e.args:
            clsname = clspath.split('.')[-1]
            log.msg(format="Disabled %(clsname)s: %(eargs)s",
                    level=log.WARNING, clsname=clsname, eargs=e.args[0])
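My best guess is that the warning actually comes from the except NotConfigured branch rather than from a failing if-condition, and that the exception is raised while inferring the project data dir. A rough paraphrase of that lookup (not the exact Scrapy source) would be: walk upward from the current working directory looking for the closest scrapy.cfg, and give up with NotConfigured at the filesystem root.

    # Rough paraphrase (not the exact Scrapy source) of the project data dir lookup
    # that raises the NotConfigured caught in the block above.
    import os
    from scrapy.exceptions import NotConfigured

    def closest_scrapy_cfg(path='.'):
        path = os.path.abspath(path)
        cfgfile = os.path.join(path, 'scrapy.cfg')
        if os.path.exists(cfgfile):
            return cfgfile
        parent = os.path.dirname(path)
        if parent == path:           # reached the filesystem root, nothing found
            return ''
        return closest_scrapy_cfg(parent)

    def project_data_dir():
        if not closest_scrapy_cfg():
            raise NotConfigured("Unable to find scrapy.cfg file to infer project data dir")
        # ... the real implementation derives a .scrapy/ data dir next to scrapy.cfg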
Place an empty scrapy.cfg under the working directory that scrapyd runs in. As the source code shows, project_data_dir tries to find the closest scrapy.cfg and uses it to infer the project data dir.
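A minimal sketch of that workaround, assuming a hypothetical scrapyd working directory; point it at wherever your scrapyd daemon actually runs:

    # Create an empty scrapy.cfg in scrapyd's working directory so the
    # closest-scrapy.cfg lookup succeeds.
    import os

    scrapyd_workdir = '/var/lib/scrapyd'            # assumption, not a scrapyd default
    cfg_path = os.path.join(scrapyd_workdir, 'scrapy.cfg')
    if not os.path.exists(cfg_path):
        open(cfg_path, 'a').close()                 # an empty file satisfies the lookup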