@magnetikonline
Last active August 29, 2015 13:57
Webalizer IgnoreAgent/SearchEngine rule sets conf - 2014-03
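# Robots, crawlers and HTTP client libraries to exclude from the statistics.
# Note on matching (a summary of Webalizer's documented behaviour; the examples
# below are illustrations in comments, not additional rules):
#   - a trailing "*" matches from the start of the string,
#     e.g. "curl/*" matches "curl/7.35.0"
#   - a leading "*" matches at the end of the string
#   - with no "*", the pattern may match anywhere in the string,
#     e.g. "; Googlebot" matches
#     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"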
IgnoreAgent +http://www.baidu.com/search/spider.html)
IgnoreAgent ; 360Spider
IgnoreAgent ; Claritybot)
IgnoreAgent ; Google Web Preview)
IgnoreAgent ; Googlebot
IgnoreAgent Aboundex/*
IgnoreAgent Apache-HttpAsyncClient/*
IgnoreAgent Apache-HttpClient/*
IgnoreAgent AppEngine-Google*
IgnoreAgent Baiduspider*
IgnoreAgent binlar*
IgnoreAgent CatchBot/*
IgnoreAgent CCBot/*
IgnoreAgent CFNetwork/
IgnoreAgent crawler4j*
IgnoreAgent curl/*
IgnoreAgent Daumoa/
IgnoreAgent DoCoMo/*
IgnoreAgent Evernote Clip Resolver
IgnoreAgent facebookexternalhit/*
IgnoreAgent findlinks/*
IgnoreAgent FyberSpider/*
IgnoreAgent Googlebot*
IgnoreAgent GoogleProducer*
IgnoreAgent ia_archiver*
IgnoreAgent ichiro/*
IgnoreAgent iTunes/*
IgnoreAgent Java*
IgnoreAgent libwww-perl/*
IgnoreAgent Linguee Bot*
IgnoreAgent LinksCrawler*
IgnoreAgent Microsoft Office Protocol Discovery
IgnoreAgent Morfeus*
IgnoreAgent Mozilla 5.0 (compatible; Google-Site-Verification/*
IgnoreAgent Mozilla/3.0 (compatible; Indy Library)
IgnoreAgent Mozilla/4.0 (compatible; Vagabondo/*
IgnoreAgent Mozilla/5.0 (compatible) Feedfetcher-Google*
IgnoreAgent Mozilla/5.0 (compatible; AhrefsBot/*
IgnoreAgent Mozilla/5.0 (compatible; aiHitBot/*
IgnoreAgent Mozilla/5.0 (compatible; archive.org_bot*
IgnoreAgent Mozilla/5.0 (compatible; bingbot/*
IgnoreAgent Mozilla/5.0 (compatible; Blekkobot*
IgnoreAgent Mozilla/5.0 (compatible; BLEXBot/*
IgnoreAgent Mozilla/5.0 (compatible; Butterfly/*
IgnoreAgent Mozilla/5.0 (compatible; coccoc/*
IgnoreAgent Mozilla/5.0 (compatible; CompSpyBot/*
IgnoreAgent Mozilla/5.0 (compatible; Dataprovider*
IgnoreAgent Mozilla/5.0 (compatible; DotBot/*
IgnoreAgent Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
IgnoreAgent Mozilla/5.0 (compatible; Exabot*
IgnoreAgent Mozilla/5.0 (compatible; Ezooms/*
IgnoreAgent Mozilla/5.0 (compatible; Google Desktop/*
IgnoreAgent Mozilla/5.0 (compatible; Google-Site-Verification/*
IgnoreAgent Mozilla/5.0 (compatible; IstellaBot/*
IgnoreAgent Mozilla/5.0 (compatible; linkdexbot/*
IgnoreAgent Mozilla/5.0 (compatible; LinkpadBot/*
IgnoreAgent Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/*
IgnoreAgent Mozilla/5.0 (compatible; meanpathbot/*
IgnoreAgent Mozilla/5.0 (compatible; MJ12bot/*
IgnoreAgent Mozilla/5.0 (compatible; oBot/*
IgnoreAgent Mozilla/5.0 (compatible; SearchmetricsBot; http://www.searchmetrics.com/en/searchmetrics-bot/)
IgnoreAgent Mozilla/5.0 (compatible; SemrushBot/*
IgnoreAgent Mozilla/5.0 (compatible; Seznam*
IgnoreAgent Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
IgnoreAgent Mozilla/5.0 (compatible; SiteExplorer/*
IgnoreAgent Mozilla/5.0 (compatible; Snipebot/*
IgnoreAgent Mozilla/5.0 (compatible; SolomonoBot/*
IgnoreAgent Mozilla/5.0 (compatible; spbot/*
IgnoreAgent Mozilla/5.0 (compatible; SWbot/*
IgnoreAgent Mozilla/5.0 (compatible; U; AnyEvent-HTTP/*
IgnoreAgent Mozilla/5.0 (compatible; URLAppendBot/*
IgnoreAgent Mozilla/5.0 (compatible; WBSearchBot/*
IgnoreAgent Mozilla/5.0 (compatible; Yahoo*
IgnoreAgent Mozilla/5.0 (compatible; Yandex*
IgnoreAgent Mozilla/5.0 (compatible; YYSpider*
IgnoreAgent Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/*
IgnoreAgent Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google*
IgnoreAgent Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/*
IgnoreAgent msnbot*
IgnoreAgent NCBot*
IgnoreAgent NerdyBot
IgnoreAgent netEstate*
IgnoreAgent NextGenSearchBot*
IgnoreAgent Nutch*
IgnoreAgent nutch*
IgnoreAgent PagesInventory*
IgnoreAgent panscient.com
IgnoreAgent PEAR*
IgnoreAgent PHP/*
IgnoreAgent psbot/*
IgnoreAgent PyCrawler
IgnoreAgent PycURL/*
IgnoreAgent Python-urllib/*
IgnoreAgent ScreenerBot*
IgnoreAgent SeznamBot/*
IgnoreAgent Sogou*
IgnoreAgent TurnitinBot/*
IgnoreAgent Twitterbot/*
IgnoreAgent W3C_Validator/*
IgnoreAgent Wget/*
IgnoreAgent Wotbox/*
IgnoreAgent www.webwombat.com.au
IgnoreAgent Xenu Link Sleuth/*
IgnoreAgent xpymep.exe
IgnoreAgent yacybot*
IgnoreAgent Yeti/*
IgnoreAgent YisouSpider
IgnoreAgent ZemlyaCrawl/*
IgnoreAgent Zend_Http_Client
IgnoreAgent ZmEu
# Search engines and the query string parameter that carries the search terms
SearchEngine ask.com q=
SearchEngine baidu.com wd=
SearchEngine bing. q=
SearchEngine duckduckgo.com q=
SearchEngine facebook.com q=
SearchEngine google. q=
SearchEngine image.youdao.com q=
SearchEngine m.yahoo. p=
SearchEngine search-results.com q=
SearchEngine search.alot. q=
SearchEngine search.aol. q=
SearchEngine search.avg.com q=
SearchEngine search.babylon.com q=
SearchEngine search.comcast.net q=
SearchEngine search.conduit. q=
SearchEngine search.daum.net q=
SearchEngine search.incredibar.com q=
SearchEngine search.lycos. q=
SearchEngine search.yahoo. p=
SearchEngine webcache.googleusercontent.com q=
SearchEngine yandex.kz text=
SearchEngine yandex.ru text=
# Referrers hidden from the top referrers table
HideReferrer file://
HideReferrer ://r.duckduckgo.
HideReferrer ://search.daum.net
HideReferrer ://semalt.com
HideReferrer ://www.bing.
HideReferrer ://www.google.
HideReferrer baidu.com
HideReferrer search.yahoo.com/search
# Referrer spammers
IgnoreReferrer .blog.blikk.hu/
IgnoreReferrer ://alldownload.pw
IgnoreReferrer ://blogs.rediff.com/
IgnoreReferrer ://ddlmega.net
IgnoreReferrer ://hand-made-soaps.com
IgnoreReferrer ://loadopia.com
IgnoreReferrer ://onload.pw
IgnoreReferrer ://pony-business.com/
IgnoreReferrer ://r-e-f-e-r-e-r.com/
IgnoreReferrer ://smarts-loans.com/
IgnoreReferrer ://www.dlsit.ro/
IgnoreReferrer ://www.liveinternet.ru/