# We do this with Pages because it's the only thing that can correctly convert rtfd.
# textutil works on rtf files with no text in them but not on rtfd.
set my_paths to {"/path/to/first/file.rtfd", "/path/to/second/file.rtfd"}
repeat with my_path in my_paths
	tell application "Pages"
		set my_file to (my_path as POSIX file)
		set my_name to name of (info for my_file)
		set doc to open my_file
		export doc as Microsoft Word to alias (my_path & ".docx" as POSIX file)
		close doc
		tell application "Finder" to delete my_file
	end tell
end repeat
Thanks for the feedback!
Looks like your first path "Users/.../Documents/Test_3.doc" is missing a slash at the beginning? I.e. it should start "/Users/..."?
Fixed with this:
`
set my_paths to {"/Users/.../Documents/Test_3.doc", "/Users/.../Documents/Test_4.doc"}
repeat with my_path in my_paths
tell application "Pages"
set my_file to (my_path as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
close doc
end tell
end repeat
`
Thanks again for the initial script.
Hello David, I'm trying to run the script for all files in an input folder using the following:
set inputFolder to (choose folder with prompt "Select Folder of Word Document files to convert:")
set outputFolder to (choose folder with prompt "Select Folder to save RTFD files")
set inputCount to 0
set outputCount to 0
The problem I'm running into is getting Pages to open every file in AllFiles from the inputFolder and convert each one to formatted text (rtfd)...
This is my rough attempt, but I get: "Pages got an error: File file :Test_1.doc wasn’t found." number -43 from file ":Test_1.doc"
tell application "Finder"
set AllFiles to every file of folder inputFolder
end tell
repeat with f in AllFiles
tell application "Pages"
set my_file to (f as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
close doc
end tell
end repeat
Do you know what I am doing wrong here?
Hmm. This is a tough one. My best guess is that Pages doesn't like f defined as type "POSIX file".
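If that guess is right, one thing to try is having Finder hand back plain aliases instead of Finder file references, and deriving the POSIX path from each alias. This is a minimal, untested sketch: it assumes the `inputFolder` prompt from the question above, `allAliases` and `sourcePath` are illustrative names, and the export line is copied from the earlier comments rather than checked against the Pages dictionary.

```applescript
-- Sketch only, not verified against a specific Pages version.
set inputFolder to (choose folder with prompt "Select Folder of Word Document files to convert:")

tell application "Finder"
	-- "as alias list" asks Finder for aliases rather than Finder file references
	set allAliases to (every file of folder inputFolder) as alias list
end tell

repeat with f in allAliases
	-- f is a reference to a list item; "contents of" yields the alias itself
	set sourcePath to POSIX path of (contents of f)
	tell application "Pages"
		set doc to open (contents of f)
		-- Same export form as the scripts earlier in this thread
		export doc as formatted text to alias (sourcePath & ".rtfd" as POSIX file)
		close doc
	end tell
end repeat
```

Note that the attempt above also refers to `my_path` inside the loop even though the loop variable is `f`, so the export line would need a path derived from `f` either way.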
I came across a post online that seems to deal with batch conversion https://macvector.com/blog/2014/05/using-applescript-to-batch-convert-files/ and this is the script they put out:
-- Batch convert all MacVector files in a folder into Genbank format in a second folder.
-- Clindley@MacVector.com
-- v0.4
-- 2 April 2014
-- 2 April 2014 added routine to ignore any file other than MacVector NA files.
-- 15 May 2014 concatenate all GB files into a single one. display summary dialog and open MV before starting the conversion
--Add original filename to locus
set inputFolder to (choose folder with prompt "Select Folder of MV files to convert:")
set outputFolder to (choose folder with prompt "Select Folder to save Genbank files") as text -- we need this as text to manipulate later
set inputCount to 0
set outputCount to 0
-- decide whether Fasta or Genbank
display dialog "Do you want Fasta or Genbank output?" buttons {"fasta", "genbank"}
-- now define some variables about the file format.
if the button returned of the result is "fasta" then
set SeqFormat to "fasta"
set defaultAnswer to "AllFastaFiles.fa"
set fileExtension to "fa"
else
set SeqFormat to "genbank"
set defaultAnswer to "AllFastaFiles.gb"
set fileExtension to "gb"
end if
display dialog "Do you want a single multiple sequence " & SeqFormat & " file containing all " & SeqFormat & " sequences in " & outputFolder & "?" buttons {"yes", "no"}
if the button returned of the result is "yes" then
set concatenate to 1
display dialog "Please enter the Output filename (do not use spaces):" default answer defaultAnswer
set AllGenbankFilename to text returned of result
else
set concatenate to 0
end if
tell application "Finder"
set AllFiles to every file of folder inputFolder
end tell
tell application "MacVector.app"
--open MV to avoid delays opening files if MV is not already open
activate
end tell
repeat with f in AllFiles
--create the output filepath and add a suitable file extension
tell application "Finder"
--get file extension of the file
set mvExtension to the name extension of f
-- get file path as posix path
set inputFilePath to name of f
set AppleScript's text item delimiters to "."
set outputFilePathBits to text items of inputFilePath
set last text item of outputFilePathBits to fileExtension
set outputFileName to outputFilePathBits as text
--now grab the filename without the extension as the locus variable
set last text item of outputFilePathBits to ""
set Locus to outputFilePathBits as text
--now strip the last character which is the dot remaining from the file extension
set Locuslength to (count of characters of Locus)
set Locus to text 1 thru (Locuslength - 1) of Locus
--Now reduce it to 16 characters which is the limit of the Locus field in a Genbank record
if length of Locus is greater than 16 then
set Locus to text 1 thru 16 of Locus
end if
--now remove any spaces from the locus name
set AppleScript's text item delimiters to " "
set temp to text items of Locus
-- display dialog temp
set AppleScript's text item delimiters to "_"
set Locus to text items of temp as text
--display dialog Locus
set outputFilePath to outputFolder & outputFileName as text
set inputCount to inputCount + 1 -- increment the number of files tested
end tell
if mvExtension = "nucl" then
tell application "MacVector.app"
--Now open the file
open f
delay 0.3 -- wait a little bit until MV has opened the file
--now save it as a genbank file
set docRef to (a reference to the first document)
if SeqFormat = "fasta" then
save docRef in outputFilePath as «constant savfFASA»
else
save docRef in outputFilePath as «constant savfGENB»
end if
close docRef
end tell
if concatenate = 1 then
-- now concatenate the files
--first we need a UNIX style path for using "cat"
set UNIXinputFilePath to quoted form of POSIX path of outputFilePath
set UNIXoutputFolder to outputFolder & AllGenbankFilename
set UNIXoutputFolder to quoted form of POSIX path of UNIXoutputFolder
--First let's modify the LOCUS to reflect the old filename. The following Perl one-liner modifies just the first line of the output file
do shell script "perl -pi -e 'substr($_, 12, 16)= \"" & Locus & "\" if $. <= 1' " & UNIXinputFilePath
--display dialog "cat " & UNIXoutputFilePath & " >>" & UNIXoutputFolder
do shell script "cat " & UNIXinputFilePath & " >>" & UNIXoutputFolder
end if
set outputCount to outputCount + 1 -- increment the number of files converted
end if
end repeat
-- convert the count variables to strings so we can display them in the dialogue
set inputCount to inputCount as string
set outputCount to outputCount as string
if concatenate = 1 then
display dialog inputCount & " files checked and " & outputCount & " files converted to " & SeqFormat & " format. All the files were saved to " & UNIXoutputFolder
else
display dialog inputCount & " files checked and " & outputCount & " files converted to " & SeqFormat & " format."
end if
I was thinking I could possibly adjust the code here to suit my purposes, but I'm not sure whether this is the answer...
I have a possible workaround where Finder gets the items of (choose folder) and I then set my_paths to {items}; however, AppleScript is saying that 'it can't get all the items'.
tell application "Finder" to get items of (choose folder)
set my_paths to {items}
repeat with my_path in my_paths
tell application "Pages"
set my_file to (my_path as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
close doc
end tell
end repeat
Any ideas?
I am making headway: I just managed to get it to work when choosing the file:
tell application "Finder" to get POSIX path of (choose file)
get POSIX path of result
set my_paths to {result}
repeat with my_path in my_paths
tell application "Pages"
set my_file to (my_path as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
close doc
end tell
end repeat
Now I need to find a way to get all POSIX path of files in folder...
Nothing worse than debugging AppleScript...
Maybe something like this would work?
set folderChoice to POSIX path of (choose folder)
tell application "System Events"
set myPaths to POSIX path of disk items of folder folderChoice
end tell
repeat with myPath in myPaths
<existing working code>
end repeat
Your code worked... up to a point: Pages stopped running the script when it hit the menacing “.DS_Store” file.
I hard-coded the folder and got the same result as your code (with the same error):
tell application "System Events"
set TitleList to POSIX path of items of folder "/Users/.../Documents/.../Client 1/"
end tell
set my_paths to result
repeat with my_path in my_paths
tell application "Pages"
set my_file to (my_path as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
close doc
end tell
end repeat
I saw a post where someone suggested listing only visible files, but I don't know how to integrate this into the existing code.
set TitleList to POSIX path of "/Users/.../Client 1/"
tell application "System Events"
set allVisibleFiles to files of folder TitleList whose visible is true
end tell
The result excludes the .DS_Store file, but the list does not contain POSIX paths. Instead it is: {file "Macintosh HD:Users:..:Client 1:Test_3.doc" of application "System Events", file...etc.}
I GOT IT!!!! WOOOHOOO! So Stoked!
set folderChoice to POSIX path of (choose folder)
tell application "System Events"
	set allVisibleFiles to POSIX path of disk items of folder folderChoice whose visible is true
end tell
repeat with my_path in allVisibleFiles
	tell application "Pages"
		set my_file to (my_path as POSIX file)
		set my_name to name of (info for my_file)
		set doc to open my_file
		export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
		close doc
	end tell
end repeat
Thank you David for your valuable input! Doing cartwheels in my living room LOL!!!
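The visible-files filter above still passes every visible item to Pages, including subfolders and non-Word files. A possible refinement, sketched here without testing, is to also filter on the file extension in the same System Events query. `docPaths` is an illustrative name, the "doc" extension is an assumption taken from the test files mentioned in this thread, and the loop body is the one from the working script.

```applescript
-- Sketch only: restrict the batch to visible files whose extension is "doc".
set folderChoice to POSIX path of (choose folder)

tell application "System Events"
	-- "files" skips subfolders; "name extension" is a disk item property
	set docPaths to POSIX path of (files of folder folderChoice whose visible is true and name extension is "doc")
end tell

repeat with my_path in docPaths
	tell application "Pages"
		set my_file to (my_path as POSIX file)
		set doc to open my_file
		export doc as formatted text to alias (my_path & ".rtfd" as POSIX file)
		close doc
	end tell
end repeat
```

If the folder holds both .doc and .docx files, the `whose` clause would need a second extension test joined with `or`.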
Congrats! Nice hacking!
Hello David,
thank you for this insight. I have searched high and low for a way to batch convert .doc files to .rtf from the terminal while keeping the images and tables in the document. I need this in order to access and automatically parse particular data from an .rtf file in a macOS SwiftUI app.
I came across your script, which I hope will be reversible, i.e. from .doc to .rtf via Pages. I also used formatted text, as I came across someone with the following statement: "constant defined in the dictionary for RTF documents is formatted text". Here is my code:
`
set my_paths to {"Users/.../Documents/Test_3.doc", "/Users/.../Documents/Test_4.doc"}
repeat with my_path in my_paths
tell application "Pages"
set my_file to (my_path as POSIX file)
set my_name to name of (info for my_file)
set doc to open my_file
export doc as formatted text to alias (my_path & ".rtf" as POSIX file)
close doc
tell application "Finder" to delete my_file
end tell
end repeat
`
I get an error: "The document "Test_3.doc" could not be opened. The file doesn't exist."
And the Apple Script Editor error reads: "Pages got an error: Can’t make missing value into type document."
Not sure what I am doing wrong...