@hamletbatista
Created July 24, 2019 21:38
# Upload the gzipped access log from your computer into the Colab session
from google.colab import files
files.upload()
#practicalecommerce.com-ssl_log-Jul-2019.log.gz(application/x-gzip) - 6663880 bytes, last modified: 7/20/2019 - 100% done
# Saving practicalecommerce.com-ssl_log-Jul-2019.log.gz to practicalecommerce.com-ssl_log-Jul-2019.log.gz
#{'practicalecommerce.com-ssl_log-Jul-2019.log.gz': b'practicalecommerce.com-ssl_log-Jul-2019.log'}
# Decompress the log; gunzip replaces the .gz file with the uncompressed .log file
!gunzip practicalecommerce.com-ssl_log-Jul-2019.log.gz
# Let's review the first 10 lines of the log (head's default)
!head practicalecommerce.com-ssl_log-Jul-2019.log
#66.249.69.196 - - [30/Jun/2019:08:05:31 -0400] "GET /Crawl-Your-Ecommerce-Site-with-Python-Scrapy- HTTP/1.1" 301 287 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.196 - - [30/Jun/2019:08:05:32 -0400] "GET /category/design-development HTTP/1.1" 200 13646 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.196 - - [30/Jun/2019:08:05:32 -0400] "GET /Crawl-Your-Ecommerce-Site-with-Python-Scrapy HTTP/1.1" 200 15693 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.196 - - [30/Jun/2019:08:05:38 -0400] "GET /wp-content/uploads/2015/05/practical-ecommerce-icon.png-144 HTTP/1.1" 404 9536 "-" "Googlebot-Image/1.0"
#35.187.180.136 - - [30/Jun/2019:08:05:44 -0400] "GET /author/melissa-mackey/feed HTTP/1.1" 200 1578 "http://webmarketingtoday.com/author/melissa-mackey/feed" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.196 - - [30/Jun/2019:08:05:46 -0400] "GET /post_google_news.xml HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#35.187.86.35 - - [30/Jun/2019:08:06:09 -0400] "GET /author/dr-ralph-f-wilson/feed HTTP/1.1" 200 6529 "http://webmarketingtoday.com/author/dr-ralph-f-wilson/feed" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.200 - - [30/Jun/2019:08:06:23 -0400] "GET /real-benefit-amazon-reviews?amp%2525252525252525253BlastReferrer=www.avalara.com&amp%2525252525252525253BsessionId=1541527244400&amp%2525252525252525253Bwcmmode=disabled HTTP/1.1" 200 13558 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#66.249.69.198 - - [30/Jun/2019:08:06:23 -0400] "GET /amazon-posts-stellar-2018-financial-results-2019-not-as-bright HTTP/1.1" 200 15324 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#18.202.174.25 - - [30/Jun/2019:08:06:32 -0400] "GET /13-Top-Brands-on-Instagram-for-Inspiration HTTP/1.1" 200 16174 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
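
The lines printed above appear to follow the Apache combined log format (IP, timestamp, request, status, size, referrer, user agent). As a minimal sketch of how each line could be split into fields, assuming that format holds for the whole file, the snippet below uses a regular expression. The names `LOG_PATTERN` and `parse_line` are hypothetical helpers introduced here for illustration; they are not part of the original gist.

```python
import re

# Assumption: every line follows the Apache combined log format seen in the
# sample output. LOG_PATTERN and parse_line are illustrative names only.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # The size field is "-" when no body was sent (for example, a 304 response).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# First line of the sample output shown above.
sample = (
    '66.249.69.196 - - [30/Jun/2019:08:05:31 -0400] '
    '"GET /Crawl-Your-Ecommerce-Site-with-Python-Scrapy- HTTP/1.1" 301 287 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["path"])
```

To process the whole file, one could open `practicalecommerce.com-ssl_log-Jul-2019.log`, call `parse_line` on each line, and skip any line that returns `None`.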