@Snake-Sanders
Last active June 29, 2017 12:13
Convert Docx for Office 365.
require "fileutils"
# Version 3.1
#
# Description: This script works around the "Xml parsing error" that the new Office 365 reports when opening some docx files.
# The error is caused by a deprecated XML attribute. This script parses the docx and removes every occurrence of it.
#
# Requires 7-Zip installed and added to the PATH.
#
# Usage:
# ruby convertToDocx365 file.docx
#
# This will generate a converted docx file.
# Remember to open the docx and refresh the index table.
# @ToDo replace 7-Zip and use a gem to compress the file.
# Begin:
# Requires an XML filename as parameter to replace its content.
module ConvertToDocx365
  # Removes the xml:space="preserve" attributes from the given XML file, in place
  def self.Convert file_name
    puts "file to open : #{file_name}"
    # required to match both "preserve" and "preserver"
    match_pattern = /xml:space="preserv(er"|e")/
    text = File.read(file_name)
    text.gsub!(match_pattern, "")
    # Write the changes back to the file; write (not puts) never appends a newline
    File.open(file_name, "w") { |file| file.write text }
    puts "matches were replaced OK"
    return true
  end

  # Echoes a shell command, then runs it
  def self.run( cmd )
    puts cmd
    system cmd
  end
end
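puts "Conversion begins:"
docx_file = ARGV[0]
# abort with a usage hint when no file name was given
abort "usage: ruby convertToDocx365 file.docx" if docx_file.nil?
# the shell commands below do not quote the file name, so spaces are rejected
if docx_file.include?(' ')
  puts "the file name contains spaces"
  abort
end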
target_doc = 'document.xml'
target_dir = "#{docx_file}_temp"
puts File.basename(docx_file)
puts "extracting #{target_doc}"
# x = extract using folder structure
# -r = recursive
# -o = output folder (the folder name follows the switch without a space)
CMD_UNZIP = "7z x #{docx_file} -o#{target_dir} -r"
ConvertToDocx365::run CMD_UNZIP
puts "Replacing patterns"
res = ConvertToDocx365::Convert File.join( '.', target_dir, 'word', target_doc)
if res then
  # rename the original file to "old" (FileUtils avoids depending on a shell mv)
  FileUtils.mv docx_file, "#{docx_file}.old"
  # a = add to zip
  # -tzip = force the zip container that docx requires
  CMD_ZIP = "7z a -tzip #{docx_file} .\\#{target_dir}\\* -r"
  ConvertToDocx365::run CMD_ZIP
  puts "Done"
else
  puts "Failed"
end
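
The replacement step can be tried on its own, without 7-Zip or a real docx. The sketch below applies the same pattern as the script to a string instead of a file. The names `PRESERVE_ATTR` and `strip_preserve` and the sample fragment are hypothetical, made up for illustration, and are not part of the original script.

```ruby
# Hypothetical helper (not in the original script): applies the script's
# pattern to a string so the effect can be inspected directly.
PRESERVE_ATTR = /xml:space="preserv(er"|e")/

def strip_preserve(xml)
  # gsub returns a new string, so the input is left untouched
  xml.gsub(PRESERVE_ATTR, "")
end

# Made-up fragment containing both spellings the pattern targets
sample = '<w:t xml:space="preserve"> hello </w:t><w:t xml:space="preserver">x</w:t>'
cleaned = strip_preserve(sample)
puts cleaned
```

Only the attribute itself is removed. The space that preceded it stays inside the tag, which is still well-formed XML.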