Skip to content

Instantly share code, notes, and snippets.

@nielsvr
Created November 5, 2018 12:08
Show Gist options
  • Save nielsvr/10c97a72c554e4d9ff50046f1a854131 to your computer and use it in GitHub Desktop.
Save nielsvr/10c97a72c554e4d9ff50046f1a854131 to your computer and use it in GitHub Desktop.
Bad bots list for htaccess or vhosts.conf
RewriteCond %{HTTP_USER_AGENT} \
12soso|\
192\.comagent|\
1noonbot|\
1on1searchbot|\
3de\_search2|\
3d\_search|\
3g\ bot|\
3gse|\
50\.nu|\
a1\ sitemap\ generator|\
a1\ website\ download|\
a6\-indexer|\
aasp|\
abachobot|\
abonti|\
abotemailsearch|\
aboundex|\
aboutusbot|\
accmonitor\ compliance\ server|\
accoon|\
achulkov\.net\ page\ walker|\
acme\.spider|\
acoonbot|\
acquia\-crawler|\
activetouristbot|\
ad\ muncher|\
adamm\ bot|\
adbeat\_bot|\
adminshop\.com|\
advanced\ email\ extractor|\
aesop\_com\_spiderman|\
aespider|\
af\ knowledge\ now\ verity\ spider|\
aggregator:vocus|\
ah\-ha\.com\ crawler|\
ahrefs|\
aibot|\
aidu|\
aihitbot|\
aipbot|\
aisiid|\
aitcsrobot/1\.1|\
ajsitemap|\
akamai\-sitesnapshot|\
alexawebsearchplatform|\
alexfdownload|\
alexibot|\
alkalinebot|\
all\ acronyms\ bot|\
alpha\ search\ agent|\
amerla\ search\ bot|\
amfibibot|\
ampmppc\.com|\
amznkassocbot|\
anemone|\
anonymous|\
anotherbot|\
answerbot|\
answerbus|\
answerchase\ prove|\
antbot|\
antibot|\
antisantyworm|\
antro\.net|\
aonde\-spider|\
aport|\
appengine\-google|\
appid\:\ s\~stremor\-crawler\-|\
aqua\_products|\
arabot|\
arachmo|\
arachnophilia|\
archive\.org\_bot|\
aria\ equalizer|\
arianna\.libero\.it|\
arikus\_spider|\
art\-online\.com|\
artavisbot|\
artera|\
asaha\ search\ engine\ turkey|\
ask|\
aspider|\
aspseek|\
asterias|\
astrofind|\
athenusbot|\
atlocalbot|\
atomic\_email\_hunter|\
attach|\
attrakt|\
attributor|\
augurfind|\
auresys|\
autobaron\ crawler|\
autoemailspider|\
autowebdir|\
avsearch\-|\
axfeedsbot|\
axonize\-bot|\
ayna|\
b2w|\
backdoorbot|\
backrub|\
backstreet\ browser|\
backweb|\
baidu|\
bandit|\
batchftp|\
baypup|\
bdfetch|\
becomebot|\
becomejpbot|\
beetlebot|\
bender|\
besserscheitern\-crawl|\
betabot|\
big\ brother|\
big\ data|\
bigado\.com|\
bigcliquebot|\
bigfoot|\
biglotron|\
bilbo|\
bilgibetabot|\
bilgibot|\
bintellibot|\
bitlybot|\
bitvouseragent|\
bizbot003|\
bizbot04|\
bizworks\ retriever|\
black\ hole|\
black\.hole|\
blackbird|\
blackmask\.net\ search\ engine|\
blackwidow|\
bladder\ fusion|\
blaiz\-bee|\
blexbot|\
blinkx|\
blitzbot|\
blog\ conversation\ project|\
blogmyway|\
blogpulselive|\
blogrefsbot|\
blogscope|\
blogslive|\
bloobybot|\
blowfish|\
blt|\
bnf\.fr\_bot|\
boaconstrictor|\
boardreader|\
boia\-scan\-agent|\
boia\.org|\
boitho|\
boi\_crawl\_00|\
bookmark\ buddy\ bookmark\ checker|\
bookmark\ search\ tool|\
bosug|\
bot\ apoena|\
botalot|\
botrighthere|\
botswana|\
bottybot|\
bpbot|\
braintime\_search|\
brokenlinkcheck\.com|\
browseremulator|\
browsermob|\
bruinbot|\
bsearchr&d|\
bspider|\
btbot|\
btsearch|\
bubing|\
buddy|\
buibui|\
buildcms\ crawler|\
builtbottough|\
bullseye|\
bumblebee|\
bunnyslippers|\
buscadorclarin|\
buscaplus\ robi|\
butterfly|\
buyhawaiibot|\
buzzbot|\
byindia|\
byspider|\
byteserver|\
bzbot|\
c\ r\ a\ w\ l\ 3\ r|\
cacheblaster|\
caddbot|\
cafi|\
camcrawler|\
camelstampede|\
canon\-webrecord|\
careerbot|\
cataguru|\
catchbot|\
cazoodle|\
ccbot|\
ccgcrawl|\
ccubee|\
cd\-preload|\
ce\-preload|\
cegbfeieh|\
cerberian\ drtrs|\
cert\ figleafbot|\
cfetch|\
cfnetwork|\
chameleon|\
charlotte|\
check&get|\
checkbot|\
checklinks|\
cheesebot|\
chemiede\-nodebot|\
cherrypicker|\
chilkat|\
chinaclaw|\
cipinetbot|\
cis455crawler|\
citeseerxbot|\
cizilla|\
clariabot|\
climate\ ark|\
climateark\ spider|\
clshttp|\
clushbot|\
coast\ scan\ engine|\
coast\ webmaster\ pro|\
coccoc|\
collapsarweb|\
collector|\
colocrossing|\
combine|\
connectsearch|\
conpilot|\
contentsmartz|\
contextad\ bot|\
contype|\
cookienet|\
coolbot|\
coolcheck|\
copernic|\
copier|\
copyrightcheck|\
core\-project|\
cosmos|\
covario\-ids|\
cowbot\-|\
cowdog\ bot|\
crabbybot|\
craftbot\@yahoo\.com|\
crawler\.kpricorn\.org|\
crawler43\.ejupiter\.com|\
crawler4j|\
crawler@|\
crawler\_for\_infomine|\
crawly|\
crawl\_application|\
creativecommons|\
crescent|\
cs\-crawler|\
cse\ html\ validator|\
cshttpclient|\
cuasarbot|\
culsearch|\
curl|\
custo|\
cvaulev|\
cyberdog|\
cybernavi\_webget|\
cyberpatrol\ sitecat\ webbot|\
cyberspyder|\
cydralspider|\
d1garabicengine|\
datacha0s|\
datafountains|\
dataparksearch|\
dataprovider\.com|\
datascape\ robot|\
dataspearspiderbot|\
dataspider|\
dattatec\.com|\
daumoa|\
dblbot|\
dcpbot|\
declumbot|\
deepindex|\
deepnet\ crawler|\
deeptrawl|\
dejan|\
del\.icio\.us\-thumbnails|\
deltascan|\
delvubot|\
der\ gro§e\ bildersauger|\
der\ große\ bildersauger|\
deusu|\
dfs\-fetch|\
diagem|\
diamond|\
dibot|\
didaxusbot|\
digext|\
digger|\
digi\-rssbot|\
digitalarchivesbot|\
digout4u|\
diibot|\
dillo|\
dir\_snatch\.exe|\
disco|\
distilled\-reputation\-monitor|\
djangotraineebot|\
dkimrepbot|\
dmoz\ downloader|\
docomo|\
dof\-verify|\
domaincrawler|\
domainscan|\
domainwatcher\ bot|\
dotbot|\
dotspotsbot|\
dow\ jones\ searchbot|\
download|\
doy|\
dragonfly|\
drip|\
drone|\
dtaagent|\
dtsearchspider|\
dumbot|\
dwaar|\
dxseeker|\
e\-societyrobot|\
eah|\
earth\ platform\ indexer|\
earth\ science\ educator\ \ robot|\
easydl|\
ebingbong|\
ec2linkfinder|\
ecairn\-grabber|\
ecatch|\
echoosebot|\
edisterbot|\
edugovsearch|\
egothor|\
eidetica\.com|\
eirgrabber|\
elblindo\ the\ blind\ bot|\
elisabot|\
ellerdalebot|\
email\ exractor|\
emailcollector|\
emailleach|\
emailsiphon|\
emailwolf|\
emeraldshield|\
empas\_robot|\
enabot|\
endeca|\
enigmabot|\
enswer\ neuro\ bot|\
enter\ user\-agent|\
entitycubebot|\
erocrawler|\
estylesearch|\
esyndicat\ bot|\
eurosoft\-bot|\
evaal|\
eventware|\
everest\-vulcan\ inc\.|\
exabot|\
exactsearch|\
exactseek|\
exooba|\
exploder|\
express\ webpictures|\
extractor|\
eyenetie|\
ez\-robot|\
ezooms|\
f\-bot\ test\ pilot|\
factbot|\
fairad\ client|\
falcon|\
fast\ data\ search\ document\ retriever|\
fast\ esp|\
fast\-search\-engine|\
fastbot\ crawler|\
fastbot\.de\ crawler|\
fatbot|\
favcollector|\
faviconizer|\
favorites\ sweeper|\
fdm|\
fdse\ robot|\
fedcontractorbot|\
fembot|\
fetch\ api\ request|\
fetch\_ici|\
fgcrawler|\
filangy|\
filehound|\
findanisp\.com\_isp\_finder|\
findlinks|\
findweb|\
firebat|\
firstgov\.gov\ search|\
flaming\ attackbot|\
flamingo\_searchengine|\
flashcapture|\
flashget|\
flickysearchbot|\
fluffy\ the\ spider|\
flunky|\
focused\_crawler|\
followsite|\
foobot|\
fooooo\_web\_video\_crawl|\
fopper|\
formulafinderbot|\
forschungsportal|\
francis|\
freewebmonitoring\ sitechecker|\
freshcrawler|\
freshdownload|\
freshlinks\.exe|\
friendfeedbot|\
frodo\.at|\
froggle|\
frontpage|\
froola\ bot|\
fr\_crawler|\
fu\-nbi|\
full\_breadth\_crawler|\
funnelback|\
furlbot|\
g10\-bot|\
gaisbot|\
galaxybot|\
gazz|\
gbplugin|\
generate\_infomine\_category\_classifiers|\
genevabot|\
geniebot|\
genieo|\
geomaxenginebot|\
geometabot|\
geonabot|\
geovisu|\
germcrawler\ |\
gethtmlcontents|\
getleft|\
getright|\
getsmart|\
geturl\.rexx|\
getweb!|\
giant|\
gigablastopensource|\
gigabot|\
girafabot|\
gleamebot|\
gnome\-vfs|\
go!zilla|\
go\-ahead\-got\-it|\
go\-http\-client|\
goforit\.com|\
goforitbot|\
gold\ crawler|\
goldfire\ server|\
golem|\
goodjelly|\
gordon\-college\-google\-mini|\
goroam|\
goseebot|\
gotit|\
govbot|\
gpu\ p2p\ crawler|\
grabber|\
grabnet|\
grafula|\
grapefx|\
grapeshot|\
grbot|\
greenyogi|\
gromit|\
grub|\
gsa|\
gslfbot|\
gulliver|\
gulperbot|\
gurujibot|\
gvc\ business\ crawler|\
gvc\ crawler|\
gvc\ search\ bot|\
gvc\ web\ crawler|\
gvc\ weblink\ crawler|\
gvc\ world\ links|\
gvcbot\.com|\
happyfunbot|\
harvest|\
hatena\ antenna|\
hawler|\
hcat|\
hclsreport\-crawler|\
hd\ nutch\ agent|\
header\_test\_client|\
healia\
[NC,OR]
#500 new rule
RewriteCond %{HTTP_USER_AGENT} \
helix|\
here\ will\ be\ link\ to\ crawler\ site|\
heritrix|\
hiscan|\
hisoftware\ accmonitor\ server|\
hisoftware\ accverify|\
hitcrawler|\
hivabot|\
hloader|\
hmsebot|\
hmview|\
hoge|\
holmes|\
homepagesearch|\
hooblybot\-image|\
hoowwwer|\
hostcrawler|\
hsft\ \\-\ link\ scanner|\
hsft\ \\-\ lvu\ scanner|\
hslide|\
ht://check|\
htdig|\
html\ link\ validator|\
htmlparser|\
httplib|\
httrack|\
huaweisymantecspider|\
hul\-wax|\
humanlinks|\
hyperestraier|\
hyperix|\
iaarchiver\-|\
ia\_archiver|\
ibuena|\
icab|\
icds\-ingestion|\
ichiro|\
icopyright\ conductor|\
ieautodiscovery|\
iecheck|\
ihwebchecker|\
iiitbot|\
iim\_405|\
ilsebot|\
iltrovatore|\
image\ stripper|\
image\ sucker|\
image\-fetcher|\
imagebot|\
imagefortress|\
imageshereimagesthereimageseverywhere|\
imagevisu|\
imds\_monitor|\
imo\-google\-robot\-intelink|\
inagist\.com\ url\ crawler|\
indexer|\
industry\ cortex\ webcrawler|\
indy\ library|\
indylabs\_marius|\
inelabot|\
inet32\ ctrl|\
inetbot|\
info\ seeker|\
infolink|\
infomine|\
infonavirobot|\
informant|\
infoseek\ sidewinder|\
infotekies|\
infousabot|\
ingrid|\
inktomi|\
insightscollector|\
insightsworksbot|\
inspirebot|\
insumascout|\
intelix|\
intelliseek|\
interget|\
internet\ ninja|\
internet\ radio\ crawler|\
internetlinkagent|\
interseek|\
ioi|\
ip\-web\-crawler\.com|\
ipadd\ bot|\
ipselonbot|\
ips\-agent|\
iria|\
irlbot|\
iron33|\
isara|\
isearch|\
isilox|\
istellabot|\
its\-learning\ crawler|\
iu\_csci\_b659\_class\_crawler|\
ivia|\
jadynave|\
java|\
jbot|\
jemmathetourist|\
jennybot|\
jetbot|\
jetbrains\ omea\ pro|\
jetcar|\
jim|\
jobo|\
jobspider\_ba|\
joc|\
joedog|\
joyscapebot|\
jspyda|\
junut\ bot|\
justview|\
jyxobot|\
k\.s\.bot|\
kakclebot|\
kalooga|\
katatudo\-spider|\
kbeta1|\
keepni\ web\ site\ monitor|\
kenjin\.spider|\
keybot\ translation\-search\-machine|\
keywenbot|\
keyword\ density|\
keyword\.density|\
kinjabot|\
kitenga\-crawler\-bot|\
kiwistatus|\
kmbot\-|\
kmccrew\ bot\ search|\
knight|\
knowitall|\
knowledge\ engine|\
knowledge\.com|\
koepabot|\
koninklijke|\
korniki|\
krowler|\
ksbot|\
kuloko\-bot|\
kulturarw3|\
kummhttp|\
kurzor|\
kyluka\ crawl|\
l\.webis|\
labhoo|\
labourunions411|\
lachesis|\
lament|\
lamerexterminator|\
lapozzbot|\
larbin|\
lbot|\
leaptag|\
leechftp|\
leechget|\
letscrawl\.com|\
lexibot|\
lexxebot|\
lftp|\
libcrawl|\
libiviacore|\
libw|\
likse|\
linguee\ bot|\
link\ checker|\
link\ validator|\
linkalarm|\
linkbot|\
linkcheck\ by\ siteimprove\.com|\
linkcheck\ scanner|\
linkchecker|\
linkdex\.com|\
linkextractorpro|\
linklint|\
linklooker|\
linkman|\
links\ sql|\
linkscan|\
linksmanager\.com\_bot|\
linksweeper|\
linkwalker|\
link\_checker|\
litefinder|\
litlrbot|\
little\ grabber\ at\ skanktale\.com|\
livelapbot|\
lm\ harvester|\
lmqueuebot|\
lnspiderguy|\
loadtimebot|\
localcombot|\
locust|\
lolongbot|\
lookbot|\
lsearch|\
lssbot|\
lt\ scotland\ checklink|\
ltx71.com|\
lwp|\
lycos\_spider|\
lydia\ entity\ spider|\
lynnbot|\
lytranslate|\
mag\-net|\
magnet|\
magpie\-crawler|\
magus\ bot|\
mail\.ru|\
mainseek\_bot|\
mammoth|\
map\ robot|\
markwatch|\
masagool|\
masidani\_bot\_|\
mass\ downloader|\
mata\ hari|\
mata\.hari|\
matentzn\ at\ cs\ dot\ man\ dot\ ac\ dot\ uk|\
maxamine\.com\-\-robot|\
maxamine\.com\-robot|\
maxomobot|\
mcbot|\
medrabbit|\
megite|\
memacbot|\
memo|\
mendeleybot|\
mercator\-|\
mercuryboard\_user\_agent\_sql\_injection\.nasl|\
metacarta|\
metaeuro\ web\ search|\
metager2|\
metagloss|\
metal\ crawler|\
metaquerier|\
metaspider|\
metaspinner|\
metauri|\
mfcrawler|\
mfhttpscan|\
midown\ tool|\
miixpc|\
mini\-robot|\
minibot|\
minirank|\
mirror|\
missigua\ locator|\
mister\ pix|\
mister\.pix|\
miva|\
mj12bot|\
mnogosearch|\
moduna\.com|\
mod\_accessibility|\
moget|\
mojeekbot|\
monkeycrawl|\
moses|\
mowserbot|\
mqbot|\
mse360|\
msindianwebcrawl|\
msmobot|\
msnptc|\
msrbot|\
mt\-soft|\
multitext|\
my\-heritrix\-crawler|\
myapp|\
mycompanybot|\
mycrawler|\
myengines\-us\-bot|\
myfamilybot|\
myra|\
my\_little\_searchengine\_project|\
nabot|\
najdi\.si|\
nambu|\
nameprotect|\
nasa\ search|\
natchcvs|\
natweb\-bad\-link\-mailer|\
naver|\
navroad|\
nearsite|\
nec\-meshexplorer|\
neosciocrawler|\
nerdbynature\.bot|\
nerdybot|\
nerima\-crawl-|\
nessus|\
nestreader|\
net\ vampire|\
net::trackback|\
netants|\
netcarta\ cyberpilot\ pro|\
netcraft|\
netexperts|\
netid\.com\ bot|\
netmechanic|\
netprospector|\
netresearchserver|\
netseer|\
netshift=|\
netsongbot|\
netsparker|\
netspider|\
netsrcherp|\
netzip|\
newmedhunt|\
news\ bot|\
newsgatherer|\
newsgroupreporter|\
newstrovebot|\
news\_search\_app|\
nextgensearchbot|\
nextthing\.org|\
nicebot|\
nicerspro|\
niki\-bot|\
nimblecrawler|\
nimbus\-1|\
ninetowns|\
ninja|\
njuicebot|\
nlese|\
nogate|\
norbert\ the\ spider|\
noteworthybot|\
npbot|\
nrcan\ intranet\ crawler|\
nsdl\_search\_bot|\
nuggetize\.com\ bot|\
nusearch\ spider|\
nutch|\
nu\_tch|\
nwspider|\
nymesis|\
nys\-crawler|\
objectssearch|\
obot|\
obvius\ external\ linkcheck|\
ocelli|\
octopus|\
odp\ entries\ t\_st|\
oegp|\
offline\ navigator|\
offline\.explorer|\
ogspider|\
omiexplorer\_bot|\
omniexplorer|\
omnifind|\
omniweb|\
onetszukaj|\
online\ link\ validator|\
oozbot|\
openbot|\
openfind|\
openintelligencedata|\
openisearch|\
openlink\ virtuoso\ rdf\ crawler|\
opensearchserver\_bot|\
opidig|\
optidiscover|\
oracle\ secure\ enterprise\ search|\
oracle\ ultra\ search|\
orangebot|\
orisbot|\
ornl\_crawler|\
ornl\_mercury|\
osis\-project\.jp|\
oso|\
outfoxbot|\
outfoxmelonbot|\
owler\-bot|\
owsbot|\
ozelot|\
p3p\ client|\
pagebiteshyperbot|\
pagebull|\
pagedown|\
pagefetcher|\
pagegrabber|\
pagepeeker|\
pagerank\ monitor|\
page\_verifier|\
pamsnbot\.htm|\
panopy\ bot|\
panscient\.com|\
pansophica|\
papa\ foto|\
paperlibot|\
parasite|\
parsijoo|\
pathtraq|\
pattern|\
patwebbot|\
pavuk|\
paxleframework|\
pbbot|\
pcbrowser|\
pcore\-http|\
pd\-crawler|\
penthesila|\
perform\_crawl|\
perman|\
personal\ ultimate\ crawler|\
php\ version\ tracker|\
phpcrawl|\
phpdig|\
picosearch|\
pieno\ robot|\
pipbot|\
pipeliner|\
pita|\
pixfinder|\
piyushbot|\
planetwork\ bot\ search|\
plucker|\
plukkie|\
plumtree|\
pockey|\
pocohttp|\
pogodak\.ba|\
pogodak\.co\.yu|\
poirot|\
polybot|\
pompos|\
poodle\ predictor|\
popscreenbot|\
postpost|\
privacyfinder|\
projectwf\-java\-test\-crawler|\
propowerbot|\
prowebwalker|\
proxem\ websearch|\
proximic|\
proxy\ crawler|\
psbot|\
pss\-bot|\
psycheclone|\
pub\-crawler|\
pucl|\
pulsebot|\
pump|\
pwebot|\
python|\
qeavis\ agent|\
qfkbot|\
qualidade|\
qualidator\.com\ bot|\
quepasacreep|\
queryn\ metasearch|\
queryn\.metasearch|\
quest\.durato|\
quintura\-crw|\
qunarbot|\
qwantify|\
qweerybot|\
qweery\_robot\.txt\_checkbot|\
r2ibot|\
r6\_commentreader|\
r6\_feedfetcher|\
r6\_votereader|\
rabot|\
radian6|\
radiation\ retriever|\
rampybot|\
rankivabot|\
rankur|\
rational\ sitecheck|\
rcstartbot|\
realdownload|\
reaper|\
rebi\-shoveler|\
recorder|\
redbot|\
redcarpet|\
reget|\
repomonkey|\
research\ robot|\
riddler|\
riight|\
risenetbot|\
riverglassscanner\
[NC,OR]
#1000 new rule
RewriteCond %{HTTP_USER_AGENT} \
robopal|\
robosourcer|\
robotek|\
robozilla|\
roger|\
rome\ client|\
rondello|\
rotondo|\
roverbot|\
rpt\-httpclient|\
rtgibot|\
rufusbot|\
runnk\ online\ rss\ reader|\
runnk\ rss\ aggregator|\
s2bot|\
safaribookmarkchecker|\
safednsbot|\
safetynet\ robot|\
saladspoon|\
sapienti|\
sapphireweb|\
sbider|\
sbl\-bot|\
scfcrawler|\
scich|\
scientificcommons\.org|\
scollspider|\
scooperbot|\
scooter|\
scoutjet|\
scrapebox|\
scrapy|\
scrawltest|\
screaming\ frog|\
scrubby|\
scspider|\
scumbot|\
search\ publisher|\
search\ x\-bot|\
search\-channel|\
search\-engine\-studio|\
search\.kumkie\.com|\
search\.updated\.com|\
search\.usgs\.gov|\
searcharoo\.net|\
searchblox|\
searchbot|\
searchengine|\
searchhippo\.com|\
searchit\-bot|\
searchmarking|\
searchmarks|\
searchmee!|\
searchmee\_v|\
searchmining|\
searchnowbot|\
searchpreview|\
searchspider\.com|\
searqubot|\
seb\ spider|\
seekbot|\
seeker\.lookseek\.com|\
seeqbot|\
seeqpod\-vertical\-crawler|\
selflinkchecker|\
semager|\
semanticdiscovery|\
semantifire|\
semisearch|\
semrushbot|\
seoengworldbot|\
seokicks|\
seznambot|\
shablastbot|\
shadowwebanalyzer|\
shareaza|\
shelob|\
sherlock|\
shim\-crawler|\
shopsalad|\
shopwiki|\
showlinks|\
showyoubot|\
siclab|\
silk|\
simplepie|\
siphon|\
sitebot|\
sitecheck|\
sitefinder|\
siteguardbot|\
siteorbiter|\
sitesnagger|\
sitesucker|\
sitesweeper|\
sitexpert|\
skimbot|\
skimwordsbot|\
skreemrbot|\
skywalker|\
sleipnir|\
slow\-crawler|\
slysearch|\
smart\-crawler|\
smartdownload|\
smarte\ bot|\
smartwit\.com|\
snake|\
snap\.com\ beta\ crawler|\
snapbot|\
snappreviewbot|\
snappy|\
snookit|\
snooper|\
snoopy|\
societyrobot|\
socscibot|\
soft411\ directory|\
sogou|\
sohu\ agent|\
sohu\-search|\
sokitomi\ crawl|\
solbot|\
sondeur|\
sootle|\
sosospider|\
space\ bison|\
space\ fung|\
spacebison|\
spankbot|\
spanner|\
spatineo\ monitor\ controller|\
spatineo\ serval\ controller|\
spatineo\ serval\ getmapbot|\
special\_archiver|\
speedy|\
sphere\ scout|\
sphider|\
spider\.terranautic\.net|\
spiderengine|\
spiderku|\
spiderman|\
spinn3r|\
spinne|\
sportcrew\-bot|\
sproose|\
spyder3\.microsys\.com|\
sq\ webscanner|\
sqlmap|\
squid\-prefetch|\
squidclamav\_redirector|\
sqworm|\
srevbot|\
sslbot|\
ssm\ agent|\
stackrambler|\
stardownloader|\
statbot|\
statcrawler|\
statedept\-crawler|\
steeler|\
stegmann\-bot|\
stero|\
stripper|\
stumbler|\
suchclip|\
sucker|\
sumeetbot|\
sumitbot|\
summizebot|\
summizefeedreader|\
sunrise\ xp|\
superbot|\
superhttp|\
superlumin\ downloader|\
superpagesbot|\
supremesearch\.net|\
supybot|\
surdotlybot|\
surf|\
surveybot|\
suzuran|\
swebot|\
swish\-e|\
sygolbot|\
synapticwalker|\
syntryx\ ant\ scout\ chassis\ pheromone|\
systemsearch\-robot|\
szukacz|\
s\~stremor\-crawler|\
t\-h\-u\-n\-d\-e\-r\-s\-t\-o\-n\-e|\
tailrank|\
takeout|\
talkro\ web\-shot|\
tamu\_crawler|\
tapuzbot|\
tarantula|\
targetblaster\.com|\
targetyournews\.com\ bot|\
tausdatabot|\
taxinomiabot|\
teamsoft\ wininet\ component|\
tecomi\ bot|\
teezirbot|\
teleport|\
telesoft|\
teradex\ mapper|\
teragram\_crawler|\
terrawizbot|\
testbot|\
testing\ of\ bot|\
textbot|\
thatrobotsite\.com|\
the\ dyslexalizer|\
the\ intraformant|\
the\.intraformant|\
thenomad|\
theophrastus|\
theusefulbot|\
thumbbot|\
thumbnail\.cz\ robot|\
thumbshots\-de\-bot|\
tigerbot|\
tighttwatbot|\
tineye|\
titan|\
to\-dress\_ru\_bot\_|\
to\-night\-bot|\
tocrawl|\
topicalizer|\
topicblogs|\
toplistbot|\
topserver\ php|\
topyx\-crawler|\
touche|\
tourlentascanner|\
tpsystem|\
traazi|\
transgenikbot|\
travel\-search|\
travelbot|\
travellazerbot|\
treezy|\
trendiction|\
trex|\
tridentspider|\
trovator|\
true\_robot|\
tscholarsbot|\
tsm\ translation\-search\-machine|\
tswebbot|\
tulipchain|\
turingos|\
turnitinbot|\
tutorgigbot|\
tweetedtimes\ bot|\
tweetmemebot|\
twengabot|\
twice|\
twikle|\
twinuffbot|\
twisted\ pagegetter|\
twitturls|\
twitturly|\
tygobot|\
tygoprowler|\
typhoeus|\
u\.s\.\ government\ printing\ office|\
uberbot|\
ucb\-nutch|\
udmsearch|\
ufam\-crawler\-|\
ultraseek|\
unchaos|\
unisterbot|\
unidentified|\
unitek\ uniengine|\
universalsearch|\
unwindfetchor|\
uoftdb\_experiment|\
updated|\
url\ control|\
url\-checker|\
urlappendbot|\
urlblaze|\
urlchecker|\
urlck|\
urldispatcher|\
urlspiderpro|\
urly\ warning|\
urly\.warning|\
url\_gather|\
usaf\ afkn\ k2spider|\
usasearch|\
uss\-cosmix|\
usyd\-nlp\-spider|\
vacobot|\
vacuum|\
vadixbot|\
vagabondo|\
validator|\
valkyrie|\
vbseo|\
vci\ webviewer\ vci\ webviewer\ win32|\
verbstarbot|\
vericitecrawler|\
verifactrola|\
verity\-url\-gateway|\
vermut|\
versus\ crawler|\
versus\.integis\.ch|\
viasarchivinginformation\.html|\
vipr|\
virus\-detector|\
virus\_detector|\
visbot|\
vishal\ for\ clia|\
visweb|\
vital\ search'n\ urchin|\
vlad|\
vlsearch|\
voilabot|\
vmbot|\
vocusbot|\
voideye|\
voil|\
vortex|\
voyager|\
vspider|\
w3c\-webcon|\
w3c\_unicorn|\
w3search|\
wacbot|\
wanadoo|\
wastrix|\
water\ conserve\ portal|\
water\ conserve\ spider|\
watzbot|\
wauuu|\
wavefire|\
waypath|\
wazzup|\
wbdbot|\
web\ ceo\ online\ robot|\
web\ crawler|\
web\ downloader|\
web\ image\ collector|\
web\ link\ validator|\
web\ magnet|\
web\ site\ downloader|\
web\ sucker|\
web\-agent|\
web\-sniffer|\
web\.image\.collector|\
webaltbot|\
webauto|\
webbot|\
webbul\-bot|\
webcapture|\
webcheck|\
webclipping\.com|\
webcollage|\
webcopier|\
webcopy|\
webcorp|\
webcrawl\.net|\
webcrawler|\
webdatacentrebot|\
webdownloader\ for\ x|\
webdup|\
webemailextrac|\
webenhancer|\
webfetch|\
webgather|\
webgo\ is|\
webgobbler|\
webimages|\
webinator\-search2|\
webinator\-wbi|\
webindex|\
weblayers|\
webleacher|\
weblexbot|\
weblinker|\
weblyzard|\
webmastercoffee|\
webmasterworld\ extractor|\
webmasterworldforumbot|\
webminer|\
webmoose|\
webot|\
webpix|\
webreaper|\
webripper|\
websauger|\
webscan|\
websearchbench|\
website|\
webspear|\
websphinx|\
webspider|\
webster|\
webstripper|\
webtrafficexpress|\
webtrends\ link\ analyzer|\
webvac|\
webwalk|\
webwasher|\
webwatch|\
webwhacker|\
webxm|\
webzip|\
weddings\.info|\
wenbin|\
wep\ search|\
wepa|\
werelatebot|\
wget|\
whacker|\
whirlpool\ web\ engine|\
whowhere\ robot|\
widow|\
wikiabot|\
wikio|\
wikiwix\-bot\-|\
winhttp|\
wire|\
wisebot|\
wisenutbot|\
wish\-la|\
wish\-project|\
wisponbot|\
wmcai\-robot|\
wminer|\
wmsbot|\
woriobot|\
worldshop|\
worqmada|\
wotbox|\
wume\_crawler|\
www\ collector|\
www\-collector\-e|\
www\-mechanize|\
wwwoffle|\
wwwrobot|\
wwwster|\
wwwwanderer|\
wwwxref|\
wysigot|\
x\-clawler|\
x\-crawler|\
xaldon|\
xenu|\
xerka\ metabot|\
xerka\ webbot|\
xget|\
xirq|\
xmarksfetch|\
xqrobot|\
y!j|\
yacy\.net|\
yacybot|\
yanga\ worldsearch\ bot|\
yarienavoir\.net|\
yasaklibot|\
yats\ crawler|\
ybot|\
yebolbot|\
yellowjacket|\
yeti|\
yolinkbot|\
yooglifetchagent|\
yoono|\
yottacars\_bot|\
yourls|\
z\-add\ link\ checker|\
zagrebin|\
zao|\
zedzo\.validate|\
zermelo|\
zeus|\
zibber\-v|\
zimeno|\
zing-bottabot|\
zipppbot|\
zongbot|\
zoomspider|\
zotag\ search|\
zsebot|\
zuibot|\
zyborg|\
zyte\
[NC]
RewriteRule .* - [F]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment