Created
March 15, 2012 17:44
-
-
Save mperham/2045565 to your computer and use it in GitHub Desktop.
Ruby script to update MySQL from Latin1 to UTF8 without data conversion
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
desc "convert a latin1 database with utf8 data into proper utf8" | |
task :convert_to_utf8 => :environment do | |
puts Time.now | |
dryrun = ENV['DOIT'] != '1' | |
conn = ActiveRecord::Base.connection | |
if dryrun | |
def conn.run_sql(sql) | |
puts(sql) | |
end | |
else | |
def conn.run_sql(sql) | |
puts(sql) | |
execute(sql) | |
end | |
end | |
conn.run_sql "ALTER DATABASE my_database CHARACTER SET utf8 collate utf8_unicode_ci" | |
# Don't covert views | |
VIEWS = /(view|_v$)/ | |
big = [] | |
# These are table_name => model_class mappings that aren't rails standard or | |
# tables that we don't wish to convert (table_name => true). | |
mapping = { :pos => Pos, | |
:categories_products => true, | |
:delayed_jobs => Delayed::Job, | |
:schema_migrations => true | |
}.with_indifferent_access | |
tables = (conn.tables - big).select { |tbl| tbl !~ VIEWS } | |
puts "Converting #{tables.inspect}" | |
#(tables - big).each do |tbl| | |
tables.each do |tbl| | |
a = ['CHARACTER SET utf8 COLLATE utf8_unicode_ci'] | |
b = [] | |
model = mapping[tbl] || tbl.to_s.classify.constantize | |
model.columns.each do |col| | |
type = col.sql_type | |
nullable = col.null ? '' : ' NOT NULL' | |
default = col.default ? " DEFAULT '#{col.default}'" : '' | |
case type | |
when /varchar/ | |
a << "CHANGE #{col.name} #{col.name} VARBINARY(#{col.limit})" | |
b << "CHANGE #{col.name} #{col.name} VARCHAR(#{col.limit}) CHARACTER SET utf8 COLLATE utf8_unicode_ci#{nullable}#{default}" | |
when /text/ | |
a << "CHANGE #{col.name} #{col.name} BLOB" | |
b << "CHANGE #{col.name} #{col.name} TEXT CHARACTER SET utf8 COLLATE utf8_unicode_ci#{nullable}#{default}" | |
end | |
end unless model == true | |
conn.run_sql "ALTER TABLE #{tbl} #{a.join(', ')}" | |
conn.run_sql "ALTER TABLE #{tbl} #{b.join(', ')}" if b.present? | |
end | |
puts Time.now | |
puts 'Done!' | |
end |
Works well! I expanded on the fork by @buffym and added automatic database name detection. Makes it a little easier when converting test databases and other staged environments.
Worked well!!
I added some modifications based on the fork by @tboyko
- added 'db' namespace around the rake task
- simplified logic around tables which don't have an active record model
- added list of tables which will be completely ignored
Added transaction support as well as recovery in case something fails mid conversion.
Wow thank you for doing this. I was not looking forward to building this code my self.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This was exactly what I needed. Thank you. I added quoting column names in this fork: https://gist.github.com/buffym/5677768 as I ran into a problem with one of my table's column names.