Created September 20, 2012 00:20
Webalizer IgnoreAgent/SearchEngine rule sets conf - 2012-05
IgnoreAgent Aboundex/*
IgnoreAgent AdsBot-Google*
IgnoreAgent Aghaven/*
IgnoreAgent Alltop/*
IgnoreAgent AppEngine-Google*
IgnoreAgent Apple-PubSub/*
IgnoreAgent AppleSyndication/*
IgnoreAgent Baiduspider*
IgnoreAgent bitlybot
IgnoreAgent blogged_crawl/*
IgnoreAgent CatchBot/*
IgnoreAgent CCBot/*
IgnoreAgent Content Crawler
IgnoreAgent Covario-IDS/1.0*
IgnoreAgent DoCoMo/*
IgnoreAgent DomainCrawler/*
IgnoreAgent Domnutch-Bot/*
IgnoreAgent EC2LinkFinder
IgnoreAgent eCairn-Grabber/*
IgnoreAgent EdisterBot*
IgnoreAgent Eurobot/*
IgnoreAgent facebookexternalhit/*
IgnoreAgent facebookplatform/*
IgnoreAgent Feedfetcher-Google*
IgnoreAgent Feedshow/*
IgnoreAgent findlinks/*
IgnoreAgent Gigabot/*
IgnoreAgent Gist Server
IgnoreAgent Googlebot*
IgnoreAgent Gootkit auto-rooter scanner
IgnoreAgent GSLFbot
IgnoreAgent HolmesBot (
IgnoreAgent HuaweiSymantecSpider/*
IgnoreAgent Huaweisymantecspider*
IgnoreAgent ia_archiver*
IgnoreAgent ichiro/*
IgnoreAgent intelium_bot
IgnoreAgent Jakarta Commons-HttpClient/*
IgnoreAgent Java*
IgnoreAgent JumbleBot/*
IgnoreAgent LexxeBot/*
IgnoreAgent librabot/*
IgnoreAgent libwww-perl/*
IgnoreAgent LinkedInBot/*
IgnoreAgent LinksManager.com_bot
IgnoreAgent Made by ZmEu*
IgnoreAgent magpie-crawler/*
IgnoreAgent MagpieRSS/*
IgnoreAgent MetaURI API/*
IgnoreAgent Microsoft Data Access Internet Publishing Provider DAV 1.1
IgnoreAgent Microsoft Office Protocol Discovery
IgnoreAgent MLBot*
IgnoreAgent Mozilla/4.0 (compatible;
IgnoreAgent Mozilla/4.0 (compatible; Vagabondo/*
IgnoreAgent Mozilla/5.0 (compatible; 008/*
IgnoreAgent Mozilla/5.0 (compatible; AhrefsBot/*
IgnoreAgent Mozilla/5.0 (compatible; aiHitBot/*
IgnoreAgent Mozilla/5.0 (compatible; ApptusBot/*
IgnoreAgent Mozilla/5.0 (compatible; archive.org_bot*
IgnoreAgent Mozilla/5.0 (compatible; Ask Jeeves/*
IgnoreAgent Mozilla/5.0 (compatible; Baiduspider/*
IgnoreAgent Mozilla/5.0 (compatible; bingbot/*
IgnoreAgent Mozilla/5.0 (compatible; Birubot/*
IgnoreAgent Mozilla/5.0 (compatible; Blekkobot*
IgnoreAgent Mozilla/5.0 (compatible; BlogScope/*
IgnoreAgent Mozilla/5.0 (compatible; Butterfly/*
IgnoreAgent Mozilla/5.0 (compatible; discobot/*
IgnoreAgent Mozilla/5.0 (compatible; DotBot/*
IgnoreAgent Mozilla/5.0 (compatible; Embedly/*
IgnoreAgent Mozilla/5.0 (compatible; Exabot*
IgnoreAgent Mozilla/5.0 (compatible; Ezooms/*
IgnoreAgent Mozilla/5.0 (compatible; Funnelback*
IgnoreAgent Mozilla/5.0 (compatible; Googlebot/*
IgnoreAgent Mozilla/5.0 (compatible; Google Desktop/*
IgnoreAgent Mozilla/5.0 (compatible; heritrix/*
IgnoreAgent Mozilla/5.0 (compatible; JikeSpider*
IgnoreAgent Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Exabot-Thumbnails)
IgnoreAgent Mozilla/5.0 (compatible; lemurwebcrawler*
IgnoreAgent Mozilla/5.0 (compatible; LinksManager.com_bot*
IgnoreAgent Mozilla/5.0 (compatible; Linux; Socialradarbot/*
IgnoreAgent Mozilla/5.0 (compatible; Lipperhey*
IgnoreAgent Mozilla/5.0 (compatible; ltbot/*
IgnoreAgent Mozilla/5.0 (compatible; MJ12bot/*
IgnoreAgent Mozilla/5.0 (compatible; MSIE 6.0b; Windows NT 5.0) Gecko/2009011913 Firefox/3.0.6 TweetmemeBot
IgnoreAgent Mozilla/5.0 (compatible; MSIE or Firefox mutant*
IgnoreAgent Mozilla/5.0 (compatible; NerdByNature.Bot*
IgnoreAgent Mozilla/5.0 (compatible; oBot/*
IgnoreAgent Mozilla/5.0 (compatible; OpenindexDeepSpider/*
IgnoreAgent Mozilla/5.0 (compatible; OpenindexShallowSpider/*
IgnoreAgent Mozilla/5.0 (compatible; PaperLiBot/*
IgnoreAgent Mozilla/5.0 (compatible; Plukkie/*
IgnoreAgent Mozilla/5.0 (compatible; PrintfulBot/*
IgnoreAgent Mozilla/5.0 (compatible; Purebot/*
IgnoreAgent Mozilla/5.0 (compatible; ScoutJet*
IgnoreAgent Mozilla/5.0 (compatible; Search17Bot/*
IgnoreAgent Mozilla/5.0 (compatible; Seznam*
IgnoreAgent Mozilla/5.0 (compatible; sindice-fetcher/*
IgnoreAgent Mozilla/5.0 (compatible; SISTRIX Crawler;
IgnoreAgent Mozilla/5.0 (compatible; SiteBot/*
IgnoreAgent Mozilla/5.0 (compatible; spbot/*
IgnoreAgent Mozilla/5.0 (compatible; SpiderLing*
IgnoreAgent Mozilla/5.0 (compatible; SWEBot/*
IgnoreAgent Mozilla/5.0 (compatible; TweetedTimes Bot/*
IgnoreAgent Mozilla/5.0 (compatible; TweetmemeBot/*
IgnoreAgent Mozilla/5.0 (compatible; Voluniabot/*
IgnoreAgent Mozilla/5.0 (compatible; WBSearchBot/*
IgnoreAgent Mozilla/5.0 (compatible; woriobot*
IgnoreAgent Mozilla/5.0 (compatible; XML Sitemaps Generator*
IgnoreAgent Mozilla/5.0 (compatible; Yahoo*
IgnoreAgent Mozilla/5.0 (compatible; Yandex*
IgnoreAgent Mozilla/5.0 (compatible; YodaoBot/*
IgnoreAgent Mozilla/5.0 (compatible; YoudaoBot/*
IgnoreAgent Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv: Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/*
IgnoreAgent Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Speedy Spider*
IgnoreAgent Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 (
IgnoreAgent Mozilla/5.0 (Yahoo-MMCrawler/*
IgnoreAgent msnbot*
IgnoreAgent NetNewsWire/*
IgnoreAgent NetworkedBlogs*
IgnoreAgent NextGenSearchBot*
IgnoreAgent Nutraspace/*
IgnoreAgent OctoBot/*
IgnoreAgent Outlook*
IgnoreAgent PHP/*
IgnoreAgent PostPost*
IgnoreAgent PostRank/*
IgnoreAgent psbot/*
IgnoreAgent PycURL/*
IgnoreAgent Python-urllib/*
IgnoreAgent quickobot/*
IgnoreAgent R6_CommentReader*
IgnoreAgent R6_FeedFetcher*
IgnoreAgent radian6_default_(
IgnoreAgent SBIder/*
IgnoreAgent SemrushBot/*
IgnoreAgent SE/SE-0.1 (Aussie Search Spider*
IgnoreAgent SeznamBot/*
IgnoreAgent SimplePie/*
IgnoreAgent SiteSnagger
IgnoreAgent Sogou*
IgnoreAgent SolomonoBot/*
IgnoreAgent Sosospider*
IgnoreAgent Summify (Summify/*
IgnoreAgent Superfeedr: Superparser bot/*
IgnoreAgent TestNutch/*
IgnoreAgent Trapit/*
IgnoreAgent TurnitinBot/*
IgnoreAgent TwengaBot*
IgnoreAgent Twitterbot/*
IgnoreAgent UniversalFeedParser/*
IgnoreAgent UnwindFetchor/*
IgnoreAgent Web Crawler*
IgnoreAgent Wget/*
IgnoreAgent Windows-RSS-Platform/*
IgnoreAgent woobot/*
IgnoreAgent WordPress/*
IgnoreAgent yacybot*
IgnoreAgent Yahoo*
IgnoreAgent Yandex/*
IgnoreAgent Yeti/*
IgnoreAgent ZmEu
IgnoreAgent ZumBot/*
SearchEngine q=
SearchEngine bing. q=
SearchEngine google. q=
SearchEngine q=
SearchEngine p=
SearchEngine search.alot. q=
SearchEngine q=
SearchEngine q=
SearchEngine q=
SearchEngine search.conduit. q=
SearchEngine q=
SearchEngine p=
SearchEngine q=
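
The rules above rely on Webalizer's wildcard convention. The following is a minimal Python sketch for checking a user agent string against the `IgnoreAgent` list before deploying it. It assumes that a trailing `*` means "starts with", a leading `*` means "ends with", and a value with no `*` matches anywhere in the string, and that comparison is case-sensitive. The function names (`webalizer_match`, `load_ignore_agents`, `is_ignored`) are hypothetical helpers, not part of Webalizer.

```python
def webalizer_match(pattern: str, value: str) -> bool:
    """Match one rule value against a string.

    Assumed semantics: trailing '*' is a prefix match, leading '*' is a
    suffix match, and no wildcard means a substring match anywhere.
    """
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])
    return pattern in value


def load_ignore_agents(conf_text: str) -> list[str]:
    """Collect the values of all IgnoreAgent lines from conf text."""
    patterns = []
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        # Split keyword from value on the first run of whitespace, so
        # values containing spaces ("Content Crawler") stay intact.
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "IgnoreAgent":
            patterns.append(parts[1].strip())
    return patterns


def is_ignored(user_agent: str, patterns: list[str]) -> bool:
    """Return True if any IgnoreAgent pattern matches the user agent."""
    return any(webalizer_match(p, user_agent) for p in patterns)
```

One way to use it: read the conf file into a string, call `load_ignore_agents` on it, then run `is_ignored` over a sample of user agents from a real access log to spot rules that are too broad or that never match.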