@tony-brewerio
Created November 28, 2011 12:39
Content-aware image cropping with ChunkyPNG

Robocrop: attempt at content-aware image cropping

The basic idea is to find the area of the image that stands out the most from the rest of the image. There are quite a few ways of doing this, and I decided to use only the difference in color and contrast levels between each block and the entire image.

As a pixel's 'contrast', I use the stddev of the colors in the 3x3 pixel block centered on that pixel. The stddev is calculated separately for every color channel and the results are summed.
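To make the contrast measure concrete, here is a minimal plain-Ruby sketch of it, without NArray. The `pixels` grid layout and the `stddev` and `pixel_contrast` helper names are assumptions made for illustration and are not part of the gist. The sketch uses the sample standard deviation (dividing by n - 1), which is also an assumption about the normalization.

```ruby
# Plain-Ruby sketch of the per-pixel 'contrast' measure (no NArray).
# pixels is a hypothetical height x width grid of [r, g, b] triples.

# Sample standard deviation (divides by n - 1); 0.0 for fewer than two values.
def stddev(values)
  return 0.0 if values.size < 2
  mean = values.sum.to_f / values.size
  Math.sqrt(values.sum { |v| (v - mean)**2 } / (values.size - 1))
end

# Contrast of the pixel at (x, y): per-channel stddev over the 3x3
# neighbourhood (clipped at the image borders), summed over the channels.
def pixel_contrast(pixels, x, y)
  height = pixels.size
  width = pixels.first.size
  xs = ([0, x - 1].max)..([width - 1, x + 1].min)
  ys = ([0, y - 1].max)..([height - 1, y + 1].min)
  neighbours = ys.flat_map { |ny| xs.map { |nx| pixels[ny][nx] } }
  (0..2).sum { |channel| stddev(neighbours.map { |rgb| rgb[channel] }) }
end

# A uniform 3x3 patch and a 3x3 patch with a vertical black/white edge.
flat = Array.new(3) { Array.new(3) { [10, 10, 10] } }
edge = Array.new(3) { [[0, 0, 0], [0, 0, 0], [255, 255, 255]] }

puts pixel_contrast(flat, 1, 1) # uniform area: no contrast
puts pixel_contrast(edge, 1, 1) # edge: high contrast
```

A uniform patch scores zero, while a patch that contains an edge scores high, which is why this measure reacts to fur, folds and object outlines.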

robocrop.rb - as simple as possible

The first version of robocrop is simple: it only supports 100x100 blocks, and you can't control how the block weight is calculated.

There are two weights used here to determine which block to use in the end.

The first is simply the average contrast level of the block's pixels. This promotes blocks with high contrast, such as a cat's fur or the folds in a dog's skin.

Using only this weight gives passable results for the cat and the dog, but fails on the ducks, since the ground has higher contrast there.

The second is the diversity of the block's pixel contrast, measured as the stddev of the contrast values within the block. This weight promotes blocks that have areas of both low and high contrast, making the algorithm more likely to capture the 'edges' of objects.

In the case of 'cat' and 'dog', adding diversity to the weight makes the cropped image contain some of the blurry background behind the cat or dog, producing better results. It also helps on the ducks: since the duck itself has low contrast while the ground has high contrast, the algorithm will try to fit both the duck and the ground into a block, resulting in a block centered on the duck's body.
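The interplay of the two weights can be sketched on a toy contrast map. This is a plain-Ruby illustration, not the gist's NArray code: the `contrast` grid and the `mean`, `stddev`, `block_weight` and `best_block` names are hypothetical, and the numbers are made up to mimic the duck case (busy ground next to a flat duck).

```ruby
# Sketch of the block weight: mean contrast times contrast diversity.
# contrast is a hypothetical grid (rows of numbers) of per-pixel contrast values.

def mean(values)
  values.sum.to_f / values.size
end

# Sample standard deviation (divides by n - 1); 0.0 for fewer than two values.
def stddev(values)
  return 0.0 if values.size < 2
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / (values.size - 1))
end

# Weight of the block whose top-left corner is (x, y).
def block_weight(contrast, x, y, block_width, block_height)
  values = contrast[y, block_height].flat_map { |row| row[x, block_width] }
  mean(values) * stddev(values)
end

# Brute-force search over every possible top-left corner.
def best_block(contrast, block_width, block_height)
  xs = (0..(contrast.first.size - block_width)).to_a
  ys = (0..(contrast.size - block_height)).to_a
  xs.product(ys).max_by { |x, y| block_weight(contrast, x, y, block_width, block_height) }
end

# Toy 'duck' strip: uniformly busy ground on the left, flat duck on the right.
contrast = [
  [8, 8, 8, 8, 0, 0],
  [8, 8, 8, 8, 0, 0]
]

p best_block(contrast, 2, 2) # => [3, 0]
```

Ranking by mean contrast alone would pick a block that lies entirely on the ground. Multiplying by the diversity makes the block that straddles the ground/duck boundary win instead, because a uniformly busy block has a diversity of zero.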

robocrop2.rb - as a library

The same algorithm, packaged as new methods on the ChunkyPNG::Image class, :robocrop (and :robocrop!), which allow some parameters to be overridden. Both accept a hash of options.

:width, :height - dimensions of the resulting canvas (both default to 100).

:contrast, :diversity - exponents that control how much the average contrast and the diversity contribute to the block weight. Each value is raised to its exponent before the two are multiplied; both default to 1.

:precision - specifies how many blocks are actually checked (default 100). Higher precision means more accurate results, but worse performance. For :precision => X, at most X+1 offsets are tried along each axis, so at most (X+1)*(X+1) blocks are tested.
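As a rough illustration of how :precision thins out the candidate blocks, the sketch below mirrors the step computation from robocrop2.rb for a single axis. `candidate_offsets` is a hypothetical helper name, and flooring the offsets to integers is an addition of this sketch (the library code passes the stepped values through as they are).

```ruby
# Sketch of how :precision limits the candidate top-left offsets along one axis.
# candidate_offsets is a hypothetical helper, not part of robocrop2.rb.
def candidate_offsets(image_size, crop_size, precision)
  span = image_size - crop_size
  # never step by less than one pixel
  step = [span.to_f / precision, 1].max
  # flooring to whole pixels is an assumption of this sketch
  (0..span).step(step).map(&:floor)
end

p candidate_offsets(500, 100, 4)        # => [0, 100, 200, 300, 400]
p candidate_offsets(150, 100, 100).size # => 51
```

When the image is much larger than the crop, each axis gets precision + 1 evenly spaced offsets, end points included. When the free span is smaller than the precision, the step is clamped to one pixel and every offset is tried.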

Random thoughts

Despite being very slow, this algorithm is simple and usually produces OK results. Interestingly, there is little to no point in comparing a block's color against the image's color; using only contrast is good enough, if not better.

Also, I like how the algorithm captures edges. A similar result could be achieved by running some edge detection library (ImageMagick, for example) first, but that would add complexity to the solution.

I use NArray for calculations, since it's fast and has a nice API.

require 'rubygems'
require 'chunky_png'
require 'narray'

def robocrop(original_filename, cropped_filename)
  image = ChunkyPNG::Image.from_file(original_filename)
  # create a 3 x width x height matrix that stores the red, green and blue channels of each pixel
  image_rgb = NArray.byte(3, image.width, image.height)
  # fill the colors matrix
  image.width.times do |x|
    image_rgb[true, x, true] = image.column(x).collect do |color|
      ChunkyPNG::Color.to_truecolor_bytes(color)
    end
  end
  # calculate 'contrast' for every pixel of the image
  # a pixel's 'contrast' is the stddev of the colors of the 3x3 pixel block centered on that pixel
  # stddev is calculated separately for every color channel, then summed
  image_contrast = NArray.float(image.width, image.height)
  image.width.times do |x|
    image.height.times do |y|
      image_contrast[x, y] = image_rgb[
        true,
        ([0, x-1].max)..([image.width-1, x+1].min),
        ([0, y-1].max)..([image.height-1, y+1].min)
      ].stddev(1..2).sum
    end
  end
  # for every possible 100x100 block, find its top-left pixel
  x_step = 1
  y_step = 1
  northwest_pixels = (0..(image.width-100)).step(x_step).to_a
  northwest_pixels = northwest_pixels.product((0..(image.height-100)).step(y_step).to_a)
  # iterate over all possible blocks
  crop_x, crop_y, block_weight = northwest_pixels.collect { |x, y|
    # contrast levels for the pixels of this block
    block_contrast = image_contrast[x...(x+100), y...(y+100)]
    contrast = block_contrast.mean
    contrast_diversity = block_contrast.stddev
    # capture the block coords together with its weight
    [x, y, contrast * contrast_diversity]
  }.max_by { |x, y, weight| weight }
  # crop the image and save it with a new filename
  image.crop!(crop_x, crop_y, 100, 100)
  image.save(cropped_filename)
end

robocrop('cat.png', 'cat_cropped.png')
robocrop('dog.png', 'dog_cropped.png')
robocrop('duck.png', 'duck_cropped.png')

require 'rubygems'
require 'chunky_png'
require 'narray'

class ChunkyPNG::Image
  def robocrop(options = {})
    dup.robocrop!(options)
  end

  def robocrop!(options = {})
    options = {
      # dimensions
      :width => 100,
      :height => 100,
      # weights
      :contrast => 1,
      :diversity => 1,
      # complexity
      :precision => 100,
    }.merge!(options)
    # create a 3 x width x height matrix that stores the red, green and blue channels of each pixel
    image_rgb = NArray.byte(3, width, height)
    # fill the colors matrix
    width.times do |x|
      image_rgb[true, x, true] = column(x).collect do |color|
        ChunkyPNG::Color.to_truecolor_bytes(color)
      end
    end
    # calculate 'contrast' for every pixel of the image
    # a pixel's 'contrast' is the stddev of the colors of the 3x3 pixel block centered on that pixel
    # stddev is calculated separately for every color channel, then summed
    image_contrast = NArray.float(width, height)
    width.times do |x|
      height.times do |y|
        image_contrast[x, y] = image_rgb[
          true,
          ([0, x-1].max)..([width-1, x+1].min),
          ([0, y-1].max)..([height-1, y+1].min)
        ].stddev(1..2).sum
      end
    end
    # for every candidate crop_width*crop_height block, find its top-left pixel
    # :precision limits how many offsets are tried along each axis
    x_step = [(width-options[:width]).to_f/options[:precision], 1].max
    y_step = [(height-options[:height]).to_f/options[:precision], 1].max
    northwest_pixels = (0..(width-options[:width])).step(x_step).to_a
    northwest_pixels = northwest_pixels.product((0..(height-options[:height])).step(y_step).to_a)
    # iterate over all candidate blocks
    crop_x, crop_y, block_weight = northwest_pixels.collect { |x, y|
      # contrast levels for the pixels of this block
      block_contrast = image_contrast[x...(x+options[:width]), y...(y+options[:height])]
      contrast = block_contrast.mean ** options[:contrast]
      contrast_diversity = block_contrast.stddev ** options[:diversity]
      # capture the block coords together with its weight
      [x, y, contrast * contrast_diversity]
    }.max_by { |x, y, weight| weight }
    crop!(crop_x, crop_y, options[:width], options[:height])
  end
end

@rogerbraun

Inefficient, but great results. The code is a bit hard to read. Could you have used a Canvas instead of an NArray? Also, edge detection is not as complicated as you might think it is ;-)

@tony-brewerio
Author

Thanks.
I also think that some parts are kinda hard to follow.
Especially 'northwest_pixels', but the only way I see to fix it would involve making a two-level-deep .times loop, kinda like the 'image_contrast' one. It would be more straightforward, but I like the current, more functional approach better.

About the usage of NArray.
I just can't think of a way to avoid it here; the code is complex enough even with the implementation of matrices etc. hidden in a separate library.
Without NArray, I would have to implement matrix slicing, stddev and mean myself. And stddev is calculated over 2 dimensions of a 3D matrix, so even more complexity.
And it is already slow enough; without NArray things would be many times worse.

I wanted to add edge detection as another kind of weight, but decided not to, since the results are already good and there are a lot of edge detection cropping examples out there.
