Disallow specific User-agents’ requests. Useful for dealing with malicious bots.

To block requests based on User-agent, add the following rule inside a server block in nginx.conf:

server {
	listen 80;

	if ($http_user_agent ~* craftbot|download|extract|stripper|sucker|ninja|clshttp|webspider|leacher|collector|grabber|webpictures) {
		return 403;
	}

	# ... other directives
}

This rule returns a 403 Forbidden response to any request whose User-agent contains one of the pipe-delimited values; the ~* operator makes the regular-expression match case-insensitive.
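If the pattern list grows, a map block in the http context is often easier to maintain than a long if condition. The following sketch (using the same pattern list as above) sets a flag variable that the server block then acts on:

```nginx
http {
	# Flag unwanted User-agents; $block_ua is 0 unless a pattern matches.
	map $http_user_agent $block_ua {
		default 0;
		~*craftbot|download|extract|stripper|sucker|ninja|clshttp|webspider|leacher|collector|grabber|webpictures 1;
	}

	server {
		listen 80;

		if ($block_ua) {
			return 403;
		}

		# ... other directives
	}
}
```

Because map values are evaluated lazily and only once per request, this keeps the matching logic in one place and lets several server blocks share the same blocklist.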

In More Detail

There are plenty of malicious bots out there, as well as some legitimate ones that you simply don’t want spidering your site, and dealing with them effectively is important.

Bots that truly masquerade as a normal web browser (by passing, say, a valid User-agent string for Firefox on macOS) are very difficult to deal with, but the vast majority identify themselves accurately and can be addressed.

Legitimate bots like GoogleBot and BingBot are usually the only crawlers a site wants to serve, so rejecting these other User-agents ‘at the door’ saves your website or web application needless processing and keeps your back-end from wasting resources on them.