Why write a crawler?
Why scrape data?
To quote Wikipedia:
The key element that distinguishes data scraping from regular parsing is that the output being scraped was intended for display to an end-user, rather than as input to another program, and is therefore usually neither documented nor structured for convenient parsing.
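- Approach for crawling a whole site: treat pages as nodes and links as edges, and use a graph traversal algorithm
- Approach for crawling updates: find the list pages and keep refreshing them to pick up new entries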
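
The whole-site idea can be sketched as a breadth-first traversal of the link graph. This is a minimal sketch, not a full crawler: `crawl_site`, `get_links`, and `max_pages` are hypothetical names, and `get_links` stands in for whatever fetches a page and extracts its links. A toy in-memory site replaces real HTTP requests so the traversal itself is easy to see.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_site(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int = 1000,
) -> List[str]:
    """Visit pages breadth-first, starting from start_url.

    get_links is an assumed callback that returns the outgoing links
    of a page; in a real crawler it would download and parse the page.
    """
    visited = {start_url}        # URLs already queued, to avoid cycles
    queue = deque([start_url])   # frontier of pages still to visit
    order = []                   # pages in the order they were visited

    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return order


# Toy link graph standing in for a real site (hypothetical paths).
toy_site = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}
pages = crawl_site("/", lambda url: toy_site.get(url, []))
print(pages)
```

The `visited` set is what keeps the traversal from looping when pages link back to each other, and `max_pages` is just a safety cap for large sites.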
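How are the list pages found? By crawling the whole site and then using machine learning to identify which pages are list pages.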
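
Once a list page is known, the refresh-for-updates idea might look like the sketch below. It only covers detecting new entries between refreshes, not identifying list pages. `find_new_links` and `fetch_list_page` are hypothetical names; `fetch_list_page` is assumed to return the links currently shown on the list page, and canned snapshots are used here in place of real requests.

```python
from typing import Callable, Iterable, List, Set


def find_new_links(
    fetch_list_page: Callable[[], Iterable[str]],
    seen: Set[str],
) -> List[str]:
    """Refresh a list page once and return links not seen before.

    fetch_list_page is an assumed callback returning the links on the
    list page right now; seen is updated in place across refreshes.
    """
    new_links = []
    for link in fetch_list_page():
        if link not in seen:
            seen.add(link)
            new_links.append(link)
    return new_links


# Two canned snapshots of the same list page (hypothetical paths).
snapshots = iter([
    ["/post/1", "/post/2"],
    ["/post/3", "/post/1", "/post/2"],
])
seen: Set[str] = set()
first = find_new_links(lambda: next(snapshots), seen)
second = find_new_links(lambda: next(snapshots), seen)
print(first, second)
```

In practice this function would be called on a timer, with the interval chosen to match how often the site publishes.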