友情提示:欢迎光临,本博客提供的代码,请粘贴到EditPlus 3中使用!!请使用火狐,Chrome浏览器进行浏览网站!出售wenca.cn域名,有要的请M我qq:316358892

.htaccess阻止坏爬虫

post by rocdk890 / 2012-1-10 10:25 Tuesday linux技术

可以根据 HTTP_USER_AGENT 来判断它们。把自己的agent设置为常用浏览器标识,比如 “Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)” ,就没办法了。

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:rocdk890@gmail.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures .......

阅读全文>>

标签: apache .htaccess 伪静态 rewrite robots 阻止

评论(2) 引用(0) 浏览(2054)