python - Scrapy: override file_path from FilesPipeline
I want to change the output folder of downloaded files. Based on the source code of the files pipeline, file_path can be overridden. I tried the code below, but it doesn't seem to work. By the way, I'm new to Python and Scrapy.
pipelines.py

    import hashlib
    import os

    from scrapy.http import Request
    from scrapy.pipelines.files import FilesPipeline
    from scrapy.utils.python import to_bytes


    class SecFilesPipeline(FilesPipeline):

        def file_path(self, request, response=None, info=None):
            ## start of deprecation warning block (can be removed in the future)
            def _warn():
                from scrapy.exceptions import ScrapyDeprecationWarning
                import warnings
                warnings.warn('FilesPipeline.file_key(url) method is deprecated, please use '
                              'file_path(request, response=None, info=None) instead',
                              category=ScrapyDeprecationWarning, stacklevel=1)

            # check if called from file_key with url as first argument
            if not isinstance(request, Request):
                _warn()
                url = request
            else:
                url = request.url

            # detect if file_key() method has been overridden
            if not hasattr(self.file_key, '_base'):
                _warn()
                return self.file_key(url)
            ## end of deprecation warning block

            media_guid = hashlib.sha1(to_bytes(url)).hexdigest()  # change to request.url after deprecation
            media_ext = os.path.splitext(url)[1]  # change to request.url after deprecation
            return 'test/%s%s' % (media_guid, media_ext)

settings.py

    ITEM_PIPELINES = {
        'myproject.pipelines.SecFilesPipeline': 2,
        'scrapy.pipelines.files.FilesPipeline': 1,
    }
    FILES_STORE = '/home/joseph/pdf'

Expected output, e.g. FILES_STORE + month + filename.pdf = /home/joseph/pdf/september/filename.pdf
Any idea? Thank you.
Setting the FILES_STORE value in settings.py should suffice, per the documentation.
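
FILES_STORE only sets the base directory. For the month subfolder in the expected output, overriding file_path is probably still needed. Here is a minimal sketch, not a definitive implementation. It assumes the month is the download date and the filename is the last segment of the URL, neither of which the question specifies. month_file_path is a hypothetical helper name, and the keyword-only item argument is there because newer Scrapy releases pass it to file_path.

```python
import hashlib
import os
from datetime import date
from urllib.parse import urlparse

try:
    from scrapy.pipelines.files import FilesPipeline
except ImportError:
    # Fallback so the helper below can be tried without Scrapy installed.
    FilesPipeline = object


def month_file_path(url, when=None):
    """Hypothetical helper: build '<month>/<filename>' relative to FILES_STORE."""
    when = when or date.today()
    # %B is the full month name; it follows the process locale.
    month = when.strftime("%B").lower()
    # Take the last path segment of the URL, ignoring any query string.
    name = os.path.basename(urlparse(url).path)
    if not name:
        # No usable filename in the URL: fall back to a SHA-1 of the URL.
        name = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return "%s/%s" % (month, name)


class SecFilesPipeline(FilesPipeline):
    def file_path(self, request, response=None, info=None, *, item=None):
        # The returned path is joined onto FILES_STORE by the pipeline.
        return month_file_path(request.url)
```

With FILES_STORE = '/home/joseph/pdf', a file fetched from a URL ending in filename.pdf should land in a month folder under /home/joseph/pdf. ITEM_PIPELINES would presumably list only myproject.pipelines.SecFilesPipeline. Leaving the stock FilesPipeline enabled as well would likely download each file a second time under full/.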