Welcome to Scrapy
What is Scrapy?
Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
- Scrapy was designed with simplicity in mind, by providing the features you need without getting in your way
- Just write the rules to extract the data from web pages and let Scrapy crawl the entire web site for you
- Scrapy is used in production crawlers to completely scrape more than 500 retailer sites daily, all on one server
- Scrapy was designed with extensibility in mind, so it provides several mechanisms to plug in new code without having to touch the framework core
- Portable, open-source, 100% Python
- Scrapy is written entirely in Python and runs on Linux, Windows, Mac and BSD
- Batteries included
- Scrapy comes with many features built in. Check this section of the documentation for the full list.
- Well-documented & well-tested
- Scrapy is extensively documented and has a comprehensive test suite with very good code coverage
- Healthy community
- 4,000 stars, 1,000 forks, 400 watchers on GitHub (link)
- 1,300 followers on Twitter (link)
- 1,900 questions on Stack Overflow (link)
- 1,800 members, 150 messages per month on the mailing list (link)
- 40-50 users always connected to the IRC channel (link)
- Commercial support
- A few companies provide Scrapy consulting and support
Still not sure if Scrapy is what you're looking for? Check out Scrapy at a glance.
Companies using Scrapy
Scrapy is used in large production environments to crawl thousands of sites daily. Here is a list of companies using Scrapy.