Welcome to Scrapy

Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Features

Simple
Scrapy was designed with simplicity in mind: it provides the features you need without getting in your way.
Productive
Just write the rules to extract the data from web pages and let Scrapy crawl the entire website for you.
Fast
Scrapy is used in production crawlers that completely scrape more than 500 retailer sites daily, all on a single server.
Extensible
Scrapy was designed with extensibility in mind, so it provides several mechanisms for plugging in new code without having to touch the framework core.
Portable
Scrapy runs on Linux, Windows, Mac and BSD.
Open Source and 100% Python
Scrapy is written entirely in Python, which makes it very easy to hack.
Well-tested
Scrapy has an extensive test suite with very good code coverage.
Batteries included
Scrapy comes with a lot of functionality built in. See this section of the documentation for a list.

Still not sure if Scrapy is what you're looking for? Check out Scrapy at a glance.

Companies using Scrapy

Scrapy is used in large production environments to crawl thousands of sites daily. Here is a list of companies using Scrapy.

Where to start?

Start by reading Scrapy at a glance, then download Scrapy and follow the Tutorial.