I'm writing a small experimental application that requires data scraped from a number of websites.
Currently I've added a random delay (2-20 sec) between subsequent requests and I'm rotating through multiple user agent strings. What else can be done so the web scraper evades detection?
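Here is roughly what I have now, as a minimal sketch (the user-agent list is just a placeholder, the real pool is longer):

```python
import random
import time

import requests

# Placeholder pool of user-agent strings; the real list is longer.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]

def fetch(url):
    # Wait a random 2-20 seconds between subsequent requests.
    time.sleep(random.uniform(2, 20))
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers, timeout=30)
```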
For example, is there any advantage in setting the Referer or X-Forwarded-For headers?
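That is, would adding something like this help (a sketch using the same requests-based setup; the Referer URL and IP below are made-up values):

```python
extra_headers = {
    "User-Agent": random.choice(USER_AGENTS),
    # Pretend the request was clicked through from one of the site's own pages.
    "Referer": "https://example.com/some-listing-page",
    # Spoofed upstream client IP (a header normally added by proxies).
    "X-Forwarded-For": "203.0.113.45",
}
response = requests.get(url, headers=extra_headers, timeout=30)
```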
Or maybe use Tor and request a new circuit frequently, so the IP address changes more often?
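Something along these lines, assuming a local Tor daemon with its SOCKS proxy on port 9050 and the control port on 9051 (those are the defaults but may differ), plus the requests[socks] extra and the stem library:

```python
import requests
from stem import Signal
from stem.control import Controller

# Route traffic through the local Tor SOCKS proxy (default port 9050).
TOR_PROXIES = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

def new_tor_identity():
    # Ask Tor for a new circuit via the control port (default 9051),
    # which usually results in a different exit IP.
    with Controller.from_port(port=9051) as controller:
        controller.authenticate()  # assumes cookie auth or no password
        controller.signal(Signal.NEWNYM)

def fetch_via_tor(url):
    return requests.get(url, proxies=TOR_PROXIES, timeout=60)
```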