Welcome to Scrapy
Scrapy is a fast, high-level screen scraping and web crawling framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
News
2010-01-31 Follow Scrapy on Twitter!
2009-12-12 Scrapy 0.8 released! Download it here.
2009-09-17 Scrapy 0.7 released!
Features
- Simple
- Scrapy was designed with simplicity in mind: it provides the features you need without getting in your way
- Productive
- Just write the rules to extract the data from web pages and let Scrapy crawl the entire web site for you
- Fast
- Scrapy is used in production crawlers to completely scrape more than 500 retailer sites daily, all on one server
- Extensible
- Scrapy was designed with extensibility in mind, so it provides several mechanisms to plug in new code without having to touch the framework core
- Portable
- Scrapy runs on Linux, Windows, Mac and BSD
- Open Source and 100% Python
- Scrapy is completely written in Python, which makes it very easy to hack
- Well-tested
- Scrapy has an extensive test suite with very good code coverage
Still not sure if Scrapy is what you're looking for? Check out Scrapy at a glance.
Project status
Scrapy is being used in large production environments to crawl more than one thousand sites daily. For more information, see the list of Companies using Scrapy.
Where to start?
Start by reading Scrapy at a glance, then download Scrapy and follow the Tutorial.
