Extract data from web pages with simple Python programming
Build a threaded crawler to process web pages in parallel
Follow links to crawl a website
Cache downloads to reduce bandwidth usage
Use multiple threads and processes to scrape faster
Learn how to parse JavaScript-dependent websites
Interact with forms and sessions
Solve CAPTCHAs on protected web pages
Discover how to track the state of a crawl
Who this book is for
This book is aimed at developers who want to use web scraping for legitimate purposes. Prior programming experience with Python would be useful but not essential. Anyone with general knowledge of programming languages should be able to pick up the book and understand the principles involved.