@DanBradbury
Last active August 29, 2015 13:58
Scraping Google Images

Cause we swag like that

agent = Mechanize.new  # needs the mechanize gem: require 'mechanize'
page = agent.get('http://images.google.com/images?q=puppy')

Then find the result link, whose href is an oversized redirect URL (hmm):

/url?q=http://www.123callingalldogs.com/FAQ&sa=U&ei=FIREU9LuHaaW2QWLyoD4BA&ved=0CDAQ9QEwAQ&usg=AFQjCNGa0jrNnHSiJ3QywWF1aQaDXbXmiA

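The image we want is http://www.123callingalldogs.com/savedimages/macexplorer_com-puppy-dog-26.jpg, but the q parameter only gives us the host page (http://www.123callingalldogs.com/FAQ), so we are still pretty far off. The rest of the href is extra query parameters we don't need: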

&sa=U&ei=FIREU9LuHaaW2QWLyoD4BA&ved=0CDAQ9QEwAQ&usg=AFQjCNGa0jrNnHSiJ3QywWF1aQaDXbXmiA
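Peeling the host page URL out of that redirect href could look something like the sketch below. It only uses Ruby's standard `uri` library; `redirect_target` is a made-up helper name, and it assumes the href keeps the `/url?q=...` shape shown above.

```ruby
require 'uri'

# Hypothetical helper: return the value of the "q" parameter from a
# "/url?q=..." redirect href, or nil when there is no query string.
def redirect_target(href)
  query = URI.parse(href).query
  return nil if query.nil?

  # decode_www_form yields [key, value] pairs; keep only "q".
  URI.decode_www_form(query).to_h['q']
end

href = '/url?q=http://www.123callingalldogs.com/FAQ&sa=U&ei=FIREU9LuHaaW2QWLyoD4BA&ved=0CDAQ9QEwAQ&usg=AFQjCNGa0jrNnHSiJ3QywWF1aQaDXbXmiA'
puts redirect_target(href)
```

With Mechanize the hrefs would come from something like `page.links.map(&:href)`, so the same helper can be mapped over the whole result page.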

***
#<Mechanize::Page
 {url
  #<URI::HTTP:0x00000102e29638 URL:http://www.bubblews.com/news/2703786-national-puppy-day>}
 {meta_refresh}
 {title "National Puppy Day! - News - Bubblews"}
 {iframes}
 {frames}
 {links
  #<Mechanize::Page::Link "All News" "/news/top">
  #<Mechanize::Page::Link "Anime" "/news/category/32">
  #<Mechanize::Page::Link "Art" "/news/category/29">
  #<Mechanize::Page::Link "Articles " "/news/category/30">
  #<Mechanize::Page::Link "Beauty " "/news/category/31">
  #<Mechanize::Page::Link "Business" "/news/category/8">
  #<Mechanize::Page::Link "Cool" "/news/category/2">
  #<Mechanize::Page::Link "Crime" "/news/category/26">
  #<Mechanize::Page::Link "Entertainment" "/news/category/5">
  #<Mechanize::Page::Link "Fashion" "/news/category/10">
  #<Mechanize::Page::Link "Food" "/news/category/4">
  #<Mechanize::Page::Link "Funny" "/news/category/11">
  #<Mechanize::Page::Link "Gaming" "/news/category/6">
  #<Mechanize::Page::Link "Health" "/news/category/13">
  #<Mechanize::Page::Link "Ideology" "/news/category/25">
  #<Mechanize::Page::Link "Journal" "/news/category/36">
  #<Mechanize::Page::Link "Movies" "/news/category/34">
  #<Mechanize::Page::Link "Music" "/news/category/24">
  #<Mechanize::Page::Link "News" "/news/category/1">
  #<Mechanize::Page::Link "Personal" "/news/category/15">
  #<Mechanize::Page::Link "Pics" "/news/category/33">
  #<Mechanize::Page::Link "Politics" "/news/category/3">
  #<Mechanize::Page::Link "Random" "/news/category/18">
  #<Mechanize::Page::Link "Reviews" "/news/category/28">
  #<Mechanize::Page::Link "Science" "/news/category/9">
  #<Mechanize::Page::Link "Sports" "/news/category/14">
  #<Mechanize::Page::Link "More..." "#">
  #<Mechanize::Page::Link "Bubblews" "/">
  #<Mechanize::Page::Link "Home" "/">
  #<Mechanize::Page::Link "Archives" "/news/archives">
  #<Mechanize::Page::Link "How it works" "/about">
  #<Mechanize::Page::Link "Redemption" "/redemption">
  #<Mechanize::Page::Link "Contact Us" "/contact">
  #<Mechanize::Page::Link "Submit" "/news/submit">
  #<Mechanize::Page::Link " New Bubblews" "/news/top">
  #<Mechanize::Page::Link " Archives" "/news/archives">
  #<Mechanize::Page::Link "Facebook" "https://www.facebook.com/Bubblews">
  #<Mechanize::Page::Link "Twitter" "https://twitter.com/BubblewsBlog">
  #<Mechanize::Page::Link "LinkedIn" "/">
  #<Mechanize::Page::Link "Login" "/">
  #<Mechanize::Page::Link "New Account" "/account/create">
  #<Mechanize::Page::Link "Forgot Password?" "/account/forgot">
  #<Mechanize::Page::Link "24" "/news/like/2703786">
  #<Mechanize::Page::Link "0" "/news/dislike/2703786">
  #<Mechanize::Page::Link "Tweet" "https://twitter.com/share">
  #<Mechanize::Page::Link "Share on Tumblr" "http://www.tumblr.com/share">
  #<Mechanize::Page::Link
   ""
   "//pinterest.com/pin/create/button/?url=http://www.bubblews.com//news/2703786-national-puppy-day&media=/assets/images/news/135129463_1395451022.jpg&description=National Puppy Day! ">
  #<Mechanize::Page::Link "" "/assets/images/news/135129463_1395451022.jpg">
  #<Mechanize::Page::Link "hnatalieann" "/account/50480-hnatalieann">
  #<Mechanize::Page::Link "Random" "/news/category/18">
  #<Mechanize::Page::Link
   "+National-Puppy-Day"
   "/pulses/776318-national-puppy-day">
  #<Mechanize::Page::Link "+Animal-shelters" "/pulses/194335-animal-shelters">
  #<Mechanize::Page::Link "+Puppy-mills" "/pulses/126580-puppy-mills">
  #<Mechanize::Page::Link "+Dog-breeders" "/pulses/575560-dog-breeders">
  #<Mechanize::Page::Link "" "/assets/images/news/108109919_1395451022.jpg">
  #<Mechanize::Page::Link
   "Last Interview With Peaches Geldof Was Chilling "
   "/news/2901071-last-interview-with-peaches-geldof-was-chilling">
  #<Mechanize::Page::Link
   "Peaches Geldof Dies At The Age Of 25"
   "/news/2899968-peaches-geldof-dies-at-the-age-of-25">
  #<Mechanize::Page::Link
   "Barbra Walter's Announces Retirement Date "
   "/news/2899696-barbra-walter039s-announces-retirement-date">
  #<Mechanize::Page::Link
   "Jeffery Dahmer's Childhood Home Up For Sale "
   "/news/2897298-jeffery-dahmer039s-childhood-home-up-for-sale">
  #<Mechanize::Page::Link
   "Mickey Rooney Passed Away "
   "/news/2895762-mickey-rooney-passed-away">
  #<Mechanize::Page::Link "Add comment" "#">
  #<Mechanize::Page::Link
   "dartoz on March 22nd, 2014 @ 10:29 am"
   "/account/78469-dartoz">
  #<Mechanize::Page::Link "&hnatalieann" "/account/50480-hnatalieann">
  #<Mechanize::Page::Link "1" "/comments/like/12758317">
  #<Mechanize::Page::Link "0" "/comments/dislike/12758317">
  #<Mechanize::Page::Link
   "Flag this comment as inappropriate"
   "/comments/flag/12758317">
  #<Mechanize::Page::Link
   "LavenderRose on March 22nd, 2014 @ 12:00 am"
   "/account/30175-lavenderrose">
  #<Mechanize::Page::Link "1" "/comments/like/12735163">
  #<Mechanize::Page::Link "0" "/comments/dislike/12735163">
  #<Mechanize::Page::Link
   "Flag this comment as inappropriate"
   "/comments/flag/12735163">
  #<Mechanize::Page::Link
   "dartoz on March 21st, 2014 @ 09:38 pm"
   "/account/78469-dartoz">
  #<Mechanize::Page::Link "1" "/comments/like/12727977">
  #<Mechanize::Page::Link "0" "/comments/dislike/12727977">
  #<Mechanize::Page::Link
   "Flag this comment as inappropriate"
   "/comments/flag/12727977">
  #<Mechanize::Page::Link
   "dartoz on March 21st, 2014 @ 09:38 pm"
   "/account/78469-dartoz">
  #<Mechanize::Page::Link "2" "/comments/like/12727968">
  #<Mechanize::Page::Link "0" "/comments/dislike/12727968">
  #<Mechanize::Page::Link
   "Flag this comment as inappropriate"
   "/comments/flag/12727968">
  #<Mechanize::Page::Link
   "Pat_Anthony on March 21st, 2014 @ 09:27 pm"
   "/account/34745-patanthony">
  #<Mechanize::Page::Link "3" "/comments/like/12727359">
  #<Mechanize::Page::Link "0" "/comments/dislike/12727359">
  #<Mechanize::Page::Link
   "Flag this comment as inappropriate"
   "/comments/flag/12727359">
  #<Mechanize::Page::Link "Home" "/">
  #<Mechanize::Page::Link "Welcome video" "/splash">
  #<Mechanize::Page::Link "Archives" "/news/archives">
  #<Mechanize::Page::Link "How it works" "/about">
  #<Mechanize::Page::Link "Redemption" "/redemption">
  #<Mechanize::Page::Link "Contact Us" "/contact">
  #<Mechanize::Page::Link "Terms of use" "/terms">
  #<Mechanize::Page::Link "Privacy Policy" "/privacy">
  #<Mechanize::Page::Link "Bubble news" "/">}
 {forms
  #<Mechanize::Form
   {name nil}
   {method "GET"}
   {action "/search"}
   {fields [text:0x816404fc type: text name: s value: Search]}
   {radiobuttons}
   {checkboxes}
   {file_uploads}
   {buttons [submit:0x816402e0 type: submit name:  value: Search]}>
  #<Mechanize::Form
   {name nil}
   {method "POST"}
   {action "http://www.bubblews.com/news/2703786-national-puppy-day"}
   {fields
    [text:0x8163cc30 type: text name: login[username] value: Username]
    [field:0x8163ca28 type: password name: login[password] value: Password]}
   {radiobuttons}
   {checkboxes}
   {file_uploads}
   {buttons}>
  #<Mechanize::Form
   {name nil}
   {method "POST"}
   {action "http://www.bubblews.com/news/2703786-national-puppy-day"}
   {fields [textarea:0x81638bd0 type:  name: content value: ]}
   {radiobuttons}
   {checkboxes}
   {file_uploads}
   {buttons}>}>
***
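In a dump like the one above, the only links that point straight at image files are the `/assets/images/news/*.jpg` ones. A rough way to pick those out and make them absolute is sketched below; `image_urls` and `IMAGE_EXT` are names invented for this example, and filtering by file extension is an assumption that will miss images served from extensionless URLs.

```ruby
require 'uri'

# Assumed set of extensions worth treating as images.
IMAGE_EXT = /\.(jpe?g|png|gif)\z/i

# Hypothetical helper: keep hrefs that end in an image extension and
# resolve them against the URL of the page they were found on.
def image_urls(base_url, hrefs)
  hrefs.compact
       .select { |href| href.match?(IMAGE_EXT) }
       .map { |href| URI.join(base_url, href).to_s }
       .uniq
end

base = 'http://www.bubblews.com/news/2703786-national-puppy-day'
hrefs = [
  '/news/top',
  '/assets/images/news/135129463_1395451022.jpg',
  '#',
  '/assets/images/news/108109919_1395451022.jpg'
]
puts image_urls(base, hrefs)
```

With a live Mechanize page, `base` would be `page.uri.to_s` and `hrefs` would be `page.links.map(&:href)`.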