Meet Scrapy

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.

Install the latest version:

Scrapy 0.24

pip install scrapy

Sample Scrapy Code

cat > <<EOF

from scrapy import Spider, Item, Field

class Post(Item):
    title = Field()

class BlogSpider(Spider):
    name, start_urls = 'blogspider', ['']

    def parse(self, response):
        return [Post(title=e.extract()) for e in response.css("h2 a::text")]
EOF

scrapy runspider
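
The selector "h2 a::text" used in parse picks out the text inside links that sit under h2 headings. As a rough illustration that needs only the Python standard library, the sketch below approximates that matching on a made-up HTML snippet. The names H2LinkText, extract_titles, and SAMPLE_HTML are hypothetical, and this is not how Scrapy evaluates selectors internally.

```python
from html.parser import HTMLParser

# Made-up page used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <h2><a href="/first">First post</a></h2>
  <p><a href="/about">About</a></p>
  <h2><a href="/second">Second post</a></h2>
</body></html>
"""


class H2LinkText(HTMLParser):
    """Collect text found directly inside an <a> that has an <h2> ancestor."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.titles = []  # matched text nodes, in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Keep the text only when the innermost open tag is <a>
        # and some enclosing tag is <h2>.
        if self.stack and self.stack[-1] == "a" and "h2" in self.stack[:-1]:
            self.titles.append(data)


def extract_titles(html):
    parser = H2LinkText()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    print(extract_titles(SAMPLE_HTML))  # prints ['First post', 'Second post']
```

The "About" link is skipped because it sits under a p element rather than an h2, which mirrors why the spider yields one Post per heading link and nothing for other links on the page.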

