Gist marcomontes/0e6733d3ee8538bd5fdd, last active August 29, 2015 14:22
Function that extracts an image from an article imported from a blog, checks that the image exists and has a valid URL, and skips 1x1 pixel images.
# Requires nokogiri and rmagick; blank? and presence come from ActiveSupport (Rails).
# Returns the filename (source URL) of an image in the entry's HTML, or nil if none qualifies.
def extract_image_from_entry(entry)
  final_image = nil
  content = Nokogiri::HTML(entry.content || entry.summary)
  content.xpath("//img").each do |img|
    image_src = img['src']
    next if image_src.blank?
    next unless validate_image_url(image_src) && validate_url_route(image_src)
    # Read each candidate once; the original read it twice, which duplicated entries in the list.
    images = Magick::ImageList.new.read(image_src)
    images.each do |image|
      # Skip 1x1 tracking pixels; `break` here would abandon every image that follows.
      next if image.columns == 1 && image.rows == 1
      # As in the original, the last qualifying image wins.
      final_image = image.base_filename
    end
  end
  final_image.presence
end
def validate_image_url(image_url)
  # http or https URL whose path ends in an image extension, in any letter case.
  # `https?` replaces `http?`, which also matched "htt://"; \A and \z anchor the whole string.
  regex = /\Ahttps?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)+(\/[\w-]+)*\/[\w-]+\.(gif|jpe?g|png|bmp)\z/i
  !!(regex =~ image_url)
end
def validate_url_route(image_url)
  # Requires net/http. True only when the server answers 200 OK;
  # network and URL parsing errors count as an invalid route.
  response = Net::HTTP.get_response(URI.parse(image_url))
  response.is_a?(Net::HTTPOK)
rescue StandardError
  false
end
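
The URL pattern in `validate_image_url` only accepts paths that end exactly in an image extension, so a URL carrying a query string (for example `logo.png?v=2`) is rejected. As a possible alternative, and only as a sketch, the check could parse the URL with the standard library and compare the extension of its path. The helper name `image_url?` and the `IMAGE_EXTENSIONS` constant are hypothetical and not part of the gist.

```ruby
require 'uri'

# Extensions taken from the gist's regular expression.
IMAGE_EXTENSIONS = %w[.gif .jpg .jpeg .png .bmp].freeze

# Hypothetical helper, not part of the gist: true when the string parses as an
# http(s) URL with a host and its path ends in one of the extensions above.
def image_url?(candidate)
  uri = URI.parse(candidate.to_s)
  # URI::HTTPS is a subclass of URI::HTTP, so both schemes pass this check.
  return false unless uri.is_a?(URI::HTTP) && uri.host
  IMAGE_EXTENSIONS.include?(File.extname(uri.path).downcase)
rescue URI::InvalidURIError
  false
end

puts image_url?("http://example.com/photos/cat.JPG")     # true
puts image_url?("https://example.com/img/logo.png?v=2")  # true, query string is ignored
puts image_url?("ftp://example.com/cat.jpg")             # false, not http(s)
puts image_url?("http://example.com/page.html")          # false, not an image extension
```

This check is looser than the regular expression on path characters, so it would still need to be paired with `validate_url_route` and the 1x1 size check before an image is trusted.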