Scrapy: XPath one-liner doesn't get me the link
What I want is for the spider engine to recognize the link to the next page.
The page is http://quotes.toscrape.com/
I have two variants. The first one, based on CSS syntax, works; the second one (which I want to be the XPath version) doesn't.
    next_page_url = response.css('li.next > a::attr(href)').extract_first()
    # this one below does not work
    next_page_url = response.xpath('/a[contains(@href,"next")]/@href').extract_first()

So while I can go along with the CSS version, I'm still curious to know what is incorrect in the given XPath syntax that makes it not give the same result as the CSS equivalent.
Thank you.
It goes here:

    # follow pagination link
    next_page_url = response.css('li.next > a::attr(href)').extract_first()
    if next_page_url:
        next_page_url = response.urljoin(next_page_url)
        yield scrapy.Request(url=next_page_url, callback=self.parse)

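Considering the provided HTML, the target link doesn't contain "next" in its @href. Note also that the expression has to start with // to search the whole document; a single leading / only matches the root element. Try the expression below, which matches on the link text instead (the match is case-sensitive):

    next_page_url = response.xpath('//a[contains(text(), "Next")]/@href').extract_first()

If you want a closer analogue of your CSS selector:

    next_page_url = response.xpath('//li[contains(@class, "next")]/a/@href').extract_first()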