@gsomoza
Last active May 14, 2020 10:09

Magento CLI Media Cleaner

CLI utilities to clean the Magento media folders.

Features:

  • Clean unused images from the product catalog.
  • Clean the product catalog image cache.
  • Ready to use: automatically reads settings from app/etc/local.xml
  • FAST: I used it to safely clean about 45,000 images in just a couple of minutes.
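As a rough illustration of the "reads settings from app/etc/local.xml" feature, the sketch below pulls the database settings out of a made-up, trimmed-down config. It swaps in REXML from Ruby's standard distribution instead of the nokogiri gem the script uses, and it assumes the usual Magento 1 layout, where the connection details sit under default_setup/connection and the table prefix under db/table_prefix. The helper name `read_db_settings` and all sample values are hypothetical.

```ruby
require 'rexml/document'

# Made-up, trimmed-down stand-in for app/etc/local.xml (all values are placeholders).
SAMPLE_LOCAL_XML = <<-XML
<config>
  <global>
    <resources>
      <db>
        <table_prefix>mage_</table_prefix>
      </db>
      <default_setup>
        <connection>
          <host>localhost</host>
          <username>magento</username>
          <password>secret</password>
          <dbname>shop</dbname>
        </connection>
      </default_setup>
    </resources>
  </global>
</config>
XML

# Hypothetical helper (not part of clean.rb): returns the database settings as a Hash.
def read_db_settings(xml_string)
  doc = REXML::Document.new(xml_string)
  base = '/config/global/resources'
  settings = {}
  %w(host username password dbname).each do |key|
    node = REXML::XPath.first(doc, "#{base}/default_setup/connection/#{key}")
    settings[key] = node.text if node
  end
  # Assumption: the prefix lives under <db>, not under <connection>.
  prefix = REXML::XPath.first(doc, "#{base}/db/table_prefix")
  settings['table_prefix'] = prefix ? prefix.text.to_s : ''
  settings
end

settings = read_db_settings(SAMPLE_LOCAL_XML)
puts "#{settings['table_prefix']}catalog_product_entity_media_gallery"
```

The prefix is prepended to the media gallery table name, so a store without a prefix simply queries catalog_product_entity_media_gallery.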

COMMANDS

cache                Cleans the Magento product images cache
products             Cleans images that are not referenced anymore by the Magento catalog
help                 Display global or [command] help documentation

The script assumes it is located in the shell folder of the Magento webroot (magento_webroot/shell). If it is located elsewhere, pass the path to the Magento webroot with the --webroot option. Run clean.rb help [command] for more information.
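The default webroot can be pictured as "the parent of the folder that contains the script". The sketch below is only an approximation: the helper name `resolve_webroot` is hypothetical, and it uses File.expand_path so that the example also runs for paths that do not exist, whereas the script itself calls File.realpath.

```ruby
# Hypothetical helper (not part of clean.rb): picks the webroot the way the
# script's default does, unless an explicit --webroot value is given.
def resolve_webroot(script_path, webroot_option = nil)
  return File.expand_path(webroot_option) if webroot_option
  # Parent of the directory that holds the script, e.g. <webroot>/shell/clean.rb
  File.expand_path('..', File.dirname(script_path))
end

puts resolve_webroot('/var/www/example.com/public_html/shell/clean.rb')
puts resolve_webroot('/home/user/clean.rb', '/var/www/example.com/public_html')
```

On a Unix-like system both calls resolve to /var/www/example.com/public_html.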

EXAMPLES

To clean images that are no longer being used in the product catalog, run the following command:

shell/clean.rb products

Sample output:

Searching for files...
> Found 12952 files.
Querying for images...
Connecting to database...
Reading settings from app/etc/local.xml ...
> Found 12959 images.
Finding files to clean up...
Progress |==========.........| 56% complete
## REPORT ##
- Orphaned: 1234 files
- Missing:  7 files
Would you like to see a list of missing images (Y/n)? y
/m/1/m1.jpg
/m/2/m2.jpg
/m/3/m3.jpg
/m/4/m4.jpg
/m/5/m5.jpg
All done!
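
The two figures in the report can be read as set differences: orphaned files exist on disk but are not referenced by the catalog, and missing images are referenced by the catalog but do not exist on disk. The sketch below shows that idea with Ruby's Set. It is not how the script computes them (the script walks the image list and removes matches from the file list); the helper name `classify` and the sample paths are hypothetical.

```ruby
require 'set'

# Hypothetical helper (not part of clean.rb).
# files_on_disk: paths relative to media/catalog/product
# images_in_db:  values read from the media gallery table
def classify(files_on_disk, images_in_db)
  on_disk = files_on_disk.to_set
  referenced = images_in_db.to_set
  {
    :orphaned => (on_disk - referenced).to_a.sort, # on disk, not in the catalog
    :missing => (referenced - on_disk).to_a.sort   # in the catalog, not on disk
  }
end

report = classify(
  %w(/a/b/ab.jpg /c/d/cd.jpg /e/f/ef.jpg),
  %w(/a/b/ab.jpg /m/1/m1.jpg /a/b/ab.jpg)
)
puts "- Orphaned: #{report[:orphaned].size} files"
puts "- Missing:  #{report[:missing].size} files"
```

One difference worth noting: a set collapses duplicate gallery rows, so an image referenced twice is counted once here, while the list-based loop in the script reports the second reference as missing.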

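You can also schedule cleaning as a cron task. Append the -f option to skip all prompts (each prompt is answered with its default, so orphaned files are deleted without confirmation and the list of missing images is not shown) and, optionally, -s to silence the program (display no output):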

shell/clean.rb products -f -s

As mentioned above, if the script is not located in the shell folder of your Magento installation, specify the path to the webroot as follows:

~/clean.rb products --webroot=/var/www/example.com/public_html

GLOBAL OPTIONS

-f, --force
    Perform irreversible actions without asking for consent. WARNING: use with caution!

-s, --silent
    Do not output information. Must be combined with --force to truly disable all output.

-h, --help
    Display help documentation

-v, --version
    Display version information

-t, --trace
    Display backtrace when an error occurs
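
Why --silent needs --force to be truly quiet can be seen in a small model of the two options. The `Prompter` class below is a hypothetical stand-in for the script's internal helpers, and its yes/no parsing is a simplification of what the commander gem provides: informational messages are dropped in silent mode, but a prompt is still printed unless force mode answers it with its default.

```ruby
require 'stringio'

# Hypothetical stand-in (not part of clean.rb) for the script's output helpers.
class Prompter
  def initialize(force: false, silent: false, input: $stdin, output: $stdout)
    @force, @silent, @input, @output = force, silent, input, output
  end

  # Informational output: suppressed in silent mode.
  def say(message)
    @output.puts(message) unless @silent
  end

  # Prompt: in force mode the default is returned without printing anything.
  def agree(message, default)
    return default if @force
    @output.print("#{message} ")
    @input.gets.to_s.strip.downcase.start_with?('y')
  end
end

out = StringIO.new
cron = Prompter.new(force: true, silent: true, output: out)
cron.say('Searching for files...')
delete = cron.agree('Delete orphaned files (Y/n)?', true)
show = cron.agree('Would you like to see a list of missing images (Y/n)?', false)
puts [delete, show, out.string.empty?].inspect
```

With both options set, nothing is written and the prompts resolve to their defaults (delete: yes, show missing list: no). With silent mode alone, the prompt text would still be written to the output.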

REQUIREMENTS

This script requires Ruby 2.0 or greater plus the following Ruby gems:

  • nokogiri ~ 1.6
  • commander ~ 4.2
  • mysql2 ~ 0.3

AUTHOR

Gabriel Somoza <gabriel@strategery.io>

LICENSE

GNU General Public License, version 3 (GPLv3) - http://www.gnu.org/licenses/gpl-3.0.html
#!/usr/bin/env ruby
# Copyright (c) 2014 Gabriel Somoza <gabriel@strategery.io>
#
# LICENSE:
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
require 'rubygems'
require 'commander/import'
program :name, 'Magento CLI Media Cleaner'
program :version, '0.2.1'
program :description, 'CLI utilities to clean the Magento media folders.'
program :help, 'Author', 'Gabriel Somoza <gabriel@strategery.io>'
program :help, 'License', 'GNU General Public License, version 3 (GPLv3) - http://www.gnu.org/licenses/gpl-3.0.html'
global_option('-f', '--force', 'Perform irreversible actions without asking for consent. WARNING: use with caution!') { $force = true }
global_option('-s', '--silent', 'Do not output information. Must be combined with --force to truly disable all output.') { $silent = true }
command :products do |c|
  c.syntax = "#{$0} products [options]"
  c.summary = 'Cleans images that are not referenced anymore by the Magento catalog'
  c.option '-w', '--webroot PATH', String, 'Path to Magento\'s webroot'
  c.option '--mysql-socket PATH', String, 'Path to a MySQL socket'
  c.action do |args, options|
    require 'nokogiri'
    require 'mysql2'
    require 'find'
    options.default :webroot => File.realpath(File.join(File.dirname(__FILE__), '..'))
    Sgy::Magento::OrphanCleaner.new(options.webroot, {
      :socket => options.mysql_socket,
      :silent => $silent,
      :force => $force,
    }).run!
  end
end
command :cache do |c|
  c.syntax = "#{$0} cache [options]"
  c.summary = 'Cleans the Magento product images cache'
  c.option '-w', '--webroot PATH', String, 'Path to Magento\'s webroot'
  c.action do |args, options|
    options.default :webroot => File.realpath(File.join(File.dirname(__FILE__), '..'))
    require 'fileutils'
    Sgy::Magento::CacheCleaner.new(options.webroot, {
      :silent => $silent,
      :force => $force
    }).run!
  end
end
default_command :products
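module Sgy
  module Magento
    # Reads the database settings from Magento's app/etc/local.xml.
    class Settings
      attr_accessor :xml
      attr_accessor :settings

      SETTING_KEYS = %w(username password host dbname table_prefix)

      def initialize(webroot)
        config_path = "#{webroot}/app/etc/local.xml"
        @xml = ::Nokogiri::XML(File.read(config_path))
        resources = '//config/global/resources'
        @settings = {}
        SETTING_KEYS.each do |key|
          # Connection details sit under <default_setup><connection>; table_prefix
          # normally sits under <db>, so fall back to that location.
          node = xml.at_xpath("#{resources}/default_setup/connection/#{key}") ||
                 xml.at_xpath("#{resources}/db/#{key}")
          # Skip absent or empty nodes so fallbacks such as `host || 'localhost'` apply.
          @settings[key] = node.text if node && !node.text.empty?
        end
      end

      # Hash-style access, e.g. settings[:table_prefix].
      def [](key)
        @settings[key.to_s]
      end

      def method_missing(meth, *args, &block)
        SETTING_KEYS.include?(meth.to_s) ? @settings[meth.to_s] : super
      end

      def respond_to_missing?(meth, include_private = false)
        SETTING_KEYS.include?(meth.to_s) || super
      end
    end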
    # Shared behaviour for the cleaners: option handling, lazy access to the
    # settings and the database, and output helpers that honour --silent/--force.
    class BaseCleaner
      def initialize(webroot, options = {})
        @options = {
          :socket => nil,
          :silent => false,
          :force => false
        }.merge(options).merge({:webroot => webroot})
      end

      # Settings parsed from app/etc/local.xml (loaded on first use).
      def app_settings
        unless @settings
          _say 'Reading settings from app/etc/local.xml ...'
          @settings = Settings.new(@options[:webroot])
        end
        @settings
      end

      def connect!
        _say 'Connecting to database...'
        mysql_options = {
          :username => app_settings.username,
          :password => app_settings.password,
          :database => app_settings.dbname,
          :host => app_settings.host || 'localhost'
        }
        mysql_options[:socket] = @options[:socket] if @options[:socket]
        ::Mysql2::Client.new(mysql_options)
      end

      def silent?; @options[:silent]; end
      def silent=(v); @options[:silent] = !!v; end
      def force?; @options[:force]; end
      def force=(v); @options[:force] = !!v; end

      # Database connection (opened on first use).
      def db
        @db ||= connect!
      end

      def run!
        raise 'Method must be implemented'
      end

      protected

      # Prints a message unless running in silent mode.
      def _say(message)
        say message unless silent?
      end

      # Asks for confirmation; in force mode the default answer is used instead.
      def _agree(message, default)
        force? ? default : agree(message)
      end

      # Iterates over items, showing a progress bar unless running in silent mode.
      def _progress(items, &block)
        if silent?
          items.each {|item| yield item }
        else
          progress(items, &block)
        end
      end
    end
    ##
    # Finds orphaned product image files and automatically cleans them.
    #
    class OrphanCleaner < BaseCleaner
      attr_reader :missing, :files

      def run!
        product_images_path = "#{@options[:webroot]}/media/catalog/product"

        _say 'Searching for files...'
        files = find_existing_files(product_images_path)
        _say "<%= color('> Found #{files.size} files.', :green) %>"

        _say 'Querying for images...'
        # The prefix is read through its accessor so that a configured table prefix is applied.
        images = db.query("SELECT value FROM #{app_settings.table_prefix}catalog_product_entity_media_gallery").collect{|i| i['value']}
        _say "<%= color('> Found #{images.count} images.', :green) %>"

        _say 'Finding files to clean up...'
        missing = []
        _progress images do |image|
          index = files.index(image)
          if index.nil?
            missing << image
          else
            # Referenced by the catalog: whatever remains in files afterwards is orphaned.
            files.delete_at(index)
          end
        end

        _say "<%= color('## REPORT ##', :green) %>"
        _say "<%= color('- Orphaned: #{files.count} files', :green) %>"
        _say "<%= color('- Missing: #{missing.count} files', :green) %>"

        if files.count > 0 && _agree("<%= color('Delete orphaned files (Y/n)?', :red) %>", true)
          _say 'Deleting...'
          _progress files do |f|
            path = product_images_path + f
            # File.realpath raises for paths that no longer exist, so check first.
            File.delete File.realpath(path) if File.exist?(path)
          end
        end

        if _agree('Would you like to see a list of missing images (Y/n)?', false)
          missing.each do |f|
            # here we want to force the output even in silent mode because the user chose to see the output and
            # therefore we must break the silence.
            say f
          end
        end

        _say 'All done!'
      end

      # Returns the paths (relative to basedir) of all files below basedir,
      # skipping the image cache directory.
      def find_existing_files(basedir)
        files = []
        Find.find(basedir) do |path|
          if FileTest.directory?(path)
            if File.basename(path) == 'cache'
              Find.prune # don't look further into this directory
            else
              next
            end
          else # a file
            files.push path.sub(basedir, '')
          end
        end
        files
      end
    end
    ##
    # Cleans the catalog image cache
    #
    class CacheCleaner < BaseCleaner
      def run!
        path = File.realpath("#{@options[:webroot]}/media/catalog/product/cache")
        _say("Magento's image cache directory: #{path}")
        if _agree("Are you sure you want to clear Magento's image cache (Y/n)?", true)
          FileUtils.rmtree path, :secure => true
          _say('Done!')
        end
      end
    end
  end
end
@kennylawrence

FYI, the mage_prefix variable doesn't seem to contain the actual prefix. I had to add "mage_" to the beginning of the table name to get it to work. Thanks for the script though!

iluwQaa commented Sep 8, 2015

How can I exclude the folder /media/catalog/product/placeholder?
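
One possible approach, sketched below under some assumptions, is to prune the placeholder folder during the file search in the same way the script already prunes the cache folder. This is a hypothetical variant of find_existing_files, not the author's code: the `excluded` parameter is made up for illustration, and only folders directly below the product images directory are skipped.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical variant of find_existing_files: `excluded` lists top-level
# folder names (relative to basedir) that should never be scanned.
def find_existing_files(basedir, excluded = %w(cache placeholder))
  files = []
  Find.find(basedir) do |path|
    if File.directory?(path)
      # Skip excluded folders that sit directly below basedir.
      Find.prune if File.dirname(path) == basedir && excluded.include?(File.basename(path))
    else
      files << path.sub(basedir, '')
    end
  end
  files
end

# Small demonstration on a throwaway directory tree.
Dir.mktmpdir do |dir|
  %w(a/b cache/1 placeholder/default).each { |d| FileUtils.mkdir_p(File.join(dir, d)) }
  %w(a/b/ab.jpg cache/1/ab.jpg placeholder/default/image.jpg).each do |f|
    FileUtils.touch(File.join(dir, f))
  end
  puts find_existing_files(dir).inspect
end
```

Files in an excluded folder never make it into the file list, so they cannot be reported as orphaned or deleted.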