Ask HN: Scaling a targeted web crawler beyond 500M pages/day
14 points
20 hours ago
by honungsburk
Comments
faangguyindia
4 hours ago
4lx87
I'm curious, how do you deal with Cloudflare and similar anti-bot systems? Just keep shopping the job around to different proxies?
12 hours ago
If you want to access data from websites that try to block it, you gotta use a headless browser with a residential proxy network like Bright Data (formerly Luminati).
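For anyone wondering what "shopping the job around to different proxies" might look like in practice, here is a rough sketch of round-robin proxy rotation with retries, using only the Python standard library. It is not a headless-browser setup and not any provider's actual API: the proxy URLs and the function names `proxy_cycle`, `build_opener_for`, and `fetch` are made up for illustration, and a real residential proxy service would supply its own endpoints and credentials.

```python
import itertools
import urllib.request

# Hypothetical proxy endpoints (assumption); a real provider supplies its own.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


def proxy_cycle(proxies):
    """Return an iterator that yields proxies in round-robin order, forever."""
    return itertools.cycle(proxies)


def build_opener_for(proxy_url):
    """Return a urllib opener that routes HTTP and HTTPS through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, rotation, max_attempts=3, timeout=10):
    """Try the URL through successive proxies until one attempt succeeds."""
    last_error = None
    for _ in range(max_attempts):
        proxy = next(rotation)
        opener = build_opener_for(proxy)
        try:
            with opener.open(url, timeout=timeout) as resp:
                return resp.read()
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

Usage would be something like `rotation = proxy_cycle(PROXIES)` once at startup, then `fetch(url, rotation)` per page, so each retry goes out through the next proxy in the list. At crawler scale you would presumably also want per-proxy health tracking and backoff, which this sketch leaves out.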