@robjwells
Last active April 28, 2021 03:00
BBEdit & TextWrangler text clean-up script for the Morning Star newspaper
-- By Rob Wells for the Morning Star
on open theStories
    repeat with aStory in theStories
        tell application "TextWrangler"
            open aStory
            tell the front text document
                set encoding to "Unicode (UTF-8)"
                educate quotes with replacing target
                -- Remove hard-wraps (not perfect)
                my grepRep("([-,—:;[:alnum:]]) *\\r *([-,—:;[:alnum:]])", "\\1 \\2")
                -- Note that this only joins lines ending in a letter, digit or non-terminal punctuation.
                -- It also joins properly formatted bylines to the following line, but the byline replace below re-breaks them.
                my grepRep(" {2,}", " ") -- Multiple spaces to single space
                my grepRep("^ ", "") -- Remove spaces at the start of lines
                my grepRep("\\t+", "") -- Remove tabs
                -- Superscript numbers to quote marks
                my litRep("¹", "’") -- Apostrophe
                my litRep("²", "”") -- Right double
                my litRep("³", "“") -- Left double
                my grepRep(" [–-] ", " — ") -- Spaced en-dashes & hyphens to em-dashes
                my grepRep("^• *|^n ", "n") -- Blob-pars: leading bullet or "n " becomes "n" (nHeady heady)
                my grepRep("([.0-9]*) *(%|percent)(?!age)", "\\1 per cent") -- "%" and "percent" to "per cent" (leaves "percentage" alone)
                my litRep("...", "…") -- Three full stops to an ellipsis character
                -- Break byline after name and ensure lower-case 'b'
                -- Check for (rare) 3-word bylines, else handle 2-word bylines
                my grepRep("^(by Our (?:Foreign|News|Sports) Desk|by Morning Star Reporter)(?:[ ,]*)(.*)\\r+", "\\l\\1\\r\\2\\r")
                if the result is 0 then -- No replacements made, so 3-word byline not found
                    -- Break two-word byline
                    my grepRep("^(by [-[:alpha:]]+ [-[:alpha:]]+)(?:[ ,]*)(.*)\\r+", "\\l\\1\\r\\2\\r")
                end if
                my grepRep("\\r+\\z", "") -- Delete empty lines at end
                my grepRep("”([[:punct:]])", "\\1”") -- Move punctuation that follows a right double quote inside the quote
            end tell
        end tell
    end repeat
end open
-- Convenience wrapper functions
on litRep(searchString, replaceString) -- Literal search and replace
    tell application "TextWrangler"
        tell text 1 of text window 1
            replace searchString using replaceString options {search mode:literal, starting at top:true}
        end tell
    end tell
end litRep
on grepRep(searchString, replaceString) -- Grep search and replace
    tell application "TextWrangler"
        tell text 1 of text window 1
            replace searchString using replaceString options {search mode:grep, starting at top:true}
        end tell
    end tell
end grepRep
@robjwells
Author

This script is designed to be saved as a droplet that copy editors drag text files onto before working on them.

Since it consists almost entirely of literal and regex replaces, it is easy to update when you spot new problems cropping up in copy.
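As a rough sketch of what such an update might look like, the fragment below adds two extra rules using the script's own `litRep` and `grepRep` wrappers. Both rules are hypothetical examples and are not part of the original script. The fragment is not a standalone script: it assumes it is pasted inside the `tell the front text document` block, alongside the existing replaces.

```applescript
-- Hypothetical extra rules (assumptions for illustration, not in the original script).
-- Paste inside the "tell the front text document" block, next to the existing replaces.
my litRep(" & ", " and ") -- Literal: spaced ampersand to "and"
my grepRep(" +$", "") -- Grep: strip trailing spaces at the end of lines
```

Order can matter, since each rule runs on the output of the rules before it. A new rule probably belongs near the existing rules that touch similar text.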

If you don’t want to get your hands dirty with AppleScript, the same thing can be made with a BBEdit text factory.
