@magnetikonline
Last active December 19, 2015 09:09
Webalizer IgnoreAgent/SearchEngine rule sets conf - 2013-07
IgnoreAgent +http://www.baidu.com/search/spider.html)
IgnoreAgent ; 360Spider
IgnoreAgent ; Claritybot)
IgnoreAgent ; Google Web Preview)
IgnoreAgent ; Googlebot-Mobile/
IgnoreAgent Aboundex/*
IgnoreAgent Apache-HttpClient/*
IgnoreAgent AppEngine-Google*
IgnoreAgent Baiduspider*
IgnoreAgent binlar*
IgnoreAgent CatchBot/*
IgnoreAgent CCBot/*
IgnoreAgent Content Crawler Spider
IgnoreAgent crawler4j*
IgnoreAgent curl/*
IgnoreAgent DoCoMo/*
IgnoreAgent DomainCrawler/*
IgnoreAgent Evernote Clip Resolver
IgnoreAgent facebookexternalhit/*
IgnoreAgent fastbot crawler*
IgnoreAgent findlinks/*
IgnoreAgent FlightDeckReportsBot/*
IgnoreAgent FyberSpider/*
IgnoreAgent Gigabot/*
IgnoreAgent Googlebot*
IgnoreAgent GoogleProducer*
IgnoreAgent ia_archiver*
IgnoreAgent ichiro/*
IgnoreAgent Influencebot/*
IgnoreAgent ip-web-crawler.com
IgnoreAgent IrssiUrlLog/*
IgnoreAgent Java*
IgnoreAgent larbin*
IgnoreAgent libwww-perl/*
IgnoreAgent Linguee Bot*
IgnoreAgent linkdex.com/*
IgnoreAgent LinksCrawler*
IgnoreAgent Microsoft Office Protocol Discovery
IgnoreAgent Morfeus*
IgnoreAgent Mozilla 5.0 (compatible; Google-Site-Verification/*
IgnoreAgent Mozilla/3.0 (compatible; Indy Library)
IgnoreAgent Mozilla/4.0 (compatible; Vagabondo/*
IgnoreAgent Mozilla/5.0 (compatible) Feedfetcher-Google*
IgnoreAgent Mozilla/5.0 (compatible; 008/*
IgnoreAgent Mozilla/5.0 (compatible; 4SeoHuntBot*
IgnoreAgent Mozilla/5.0 (compatible; AcoonBot/*
IgnoreAgent Mozilla/5.0 (compatible; AhrefsBot/*
IgnoreAgent Mozilla/5.0 (compatible; aiHitBot/*
IgnoreAgent Mozilla/5.0 (compatible; archive.org_bot*
IgnoreAgent Mozilla/5.0 (compatible; awcheckBot*
IgnoreAgent Mozilla/5.0 (compatible; Baiduspider/*
IgnoreAgent Mozilla/5.0 (compatible; bingbot/*
IgnoreAgent Mozilla/5.0 (compatible; Blekkobot*
IgnoreAgent Mozilla/5.0 (compatible; Butterfly/*
IgnoreAgent Mozilla/5.0 (compatible; coccoc/*
IgnoreAgent Mozilla/5.0 (compatible; CompSpyBot/*
IgnoreAgent Mozilla/5.0 (compatible; Dataprovider*
IgnoreAgent Mozilla/5.0 (compatible; DCPbot/*
IgnoreAgent Mozilla/5.0 (compatible; discoverybot/*
IgnoreAgent Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
IgnoreAgent Mozilla/5.0 (compatible; Exabot*
IgnoreAgent Mozilla/5.0 (compatible; Ezooms/*
IgnoreAgent Mozilla/5.0 (compatible; Google Desktop/*
IgnoreAgent Mozilla/5.0 (compatible; Google-Site-Verification/*
IgnoreAgent Mozilla/5.0 (compatible; Googlebot/*
IgnoreAgent Mozilla/5.0 (compatible; JikeSpider*
IgnoreAgent Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/*
IgnoreAgent Mozilla/5.0 (compatible; ltbot/*
IgnoreAgent Mozilla/5.0 (compatible; Mail.RU_Bot/*
IgnoreAgent Mozilla/5.0 (compatible; meanpathbot/*
IgnoreAgent Mozilla/5.0 (compatible; MJ12bot/*
IgnoreAgent Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1) KomodiaBot/*
IgnoreAgent Mozilla/5.0 (compatible; Nigma.ru/*
IgnoreAgent Mozilla/5.0 (compatible; oBot/*
IgnoreAgent Mozilla/5.0 (compatible; Plukkie/*
IgnoreAgent Mozilla/5.0 (compatible; ProCogSEOBot/*
IgnoreAgent Mozilla/5.0 (compatible; SearchmetricsBot; http://www.searchmetrics.com/en/searchmetrics-bot/)
IgnoreAgent Mozilla/5.0 (compatible; SemrushBot/*
IgnoreAgent Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
IgnoreAgent Mozilla/5.0 (compatible; SiteExplorer/*
IgnoreAgent Mozilla/5.0 (compatible; Snipebot/*
IgnoreAgent Mozilla/5.0 (compatible; SolomonoBot/*
IgnoreAgent Mozilla/5.0 (compatible; spbot/*
IgnoreAgent Mozilla/5.0 (compatible; Statsbot/*
IgnoreAgent Mozilla/5.0 (compatible; SWbot/*
IgnoreAgent Mozilla/5.0 (compatible; U; AnyEvent-HTTP/*
IgnoreAgent Mozilla/5.0 (compatible; WBSearchBot/*
IgnoreAgent Mozilla/5.0 (compatible; Yahoo*
IgnoreAgent Mozilla/5.0 (compatible; Yandex*
IgnoreAgent Mozilla/5.0 (compatible; YodaoBot/*
IgnoreAgent Mozilla/5.0 (compatible; YYSpider*
IgnoreAgent Mozilla/5.0 (FHScan*
IgnoreAgent Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/*
IgnoreAgent Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google*
IgnoreAgent Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/*
IgnoreAgent Mozilla/5.0 (YahooYSMcm/*
IgnoreAgent msnbot*
IgnoreAgent My Nutch Spider/*
IgnoreAgent MyNutchTest/*
IgnoreAgent Mysite/*
IgnoreAgent NCBot*
IgnoreAgent netEstate*
IgnoreAgent NextGenSearchBot*
IgnoreAgent Nitidum*
IgnoreAgent nutch*
IgnoreAgent OpenWebIndex/*
IgnoreAgent PagesInventory*
IgnoreAgent panscient.com
IgnoreAgent PEAR*
IgnoreAgent PHP/*
IgnoreAgent psbot/*
IgnoreAgent PyCrawler
IgnoreAgent PycURL/*
IgnoreAgent Python-urllib/*
IgnoreAgent QuerySeekerSpider*
IgnoreAgent RelSpider*
IgnoreAgent ScreenerBot*
IgnoreAgent SeznamBot/*
IgnoreAgent Sogou*
IgnoreAgent SolomonoBot/*
IgnoreAgent Sosospider*
IgnoreAgent Sosospider/
IgnoreAgent TosCrawler/*
IgnoreAgent TurnitinBot/*
IgnoreAgent Twitterbot/*
IgnoreAgent upBot/*
IgnoreAgent W3C_Validator/*
IgnoreAgent Wget/*
IgnoreAgent WocBot/*
IgnoreAgent WordPress.com mShots*
IgnoreAgent Wotbox/*
IgnoreAgent Xenu Link Sleuth/*
IgnoreAgent xpymep.exe
IgnoreAgent XRL/*
IgnoreAgent yacybot*
IgnoreAgent Yahoo*
IgnoreAgent Yeti/*
IgnoreAgent YisouSpider
IgnoreAgent Zend_Http_Client
IgnoreAgent zimmbot/*
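# How the IgnoreAgent patterns above are matched. This is a hedged summary of
# the Webalizer README, so verify it against your installed version:
#   - a trailing '*' matches the start of the user agent string
#   - a leading '*' matches the end of the user agent string
#   - no '*' at all lets the pattern match anywhere in the string
# The lines below are illustrative only. "ExampleBot" is a hypothetical agent
# name, not part of the original list, so they are left commented out.
#   IgnoreAgent ExampleBot/*
#   IgnoreAgent *ExampleBot
#   IgnoreAgent ExampleBot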
SearchEngine ask.com q=
SearchEngine bing. q=
SearchEngine google. q=
SearchEngine image.youdao.com q=
SearchEngine m.yahoo. p=
SearchEngine search-results.com q=
SearchEngine search.alot. q=
SearchEngine search.aol. q=
SearchEngine search.avg.com q=
SearchEngine search.babylon.com q=
SearchEngine search.comcast.net q=
SearchEngine search.conduit. q=
SearchEngine search.incredibar.com q=
SearchEngine search.yahoo. p=
SearchEngine webcache.googleusercontent.com q=
SearchEngine yandex.ru text=
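# Each SearchEngine line above pairs a fragment of the referring host name with
# the query-string variable that carries the search terms (for example "q=").
# A sketch for adding another engine, assuming a hypothetical host
# search.example.com that passes its terms in "query=". Left commented out
# because it is not part of the original list.
#   SearchEngine search.example.com query=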
HideReferrer ://www.bing.
HideReferrer ://www.google.
# Referrer spammers
IgnoreReferrer ://blogs.rediff.com/
IgnoreReferrer ://r-e-f-e-r-e-r.com/
IgnoreReferrer ://top10onlinepokiesaustralia.com
IgnoreReferrer ://www.dlsit.ro/
IgnoreReferrer ://www.liveinternet.ru/
IgnoreReferrer .blog.blikk.hu/