@fitnr
Last active December 18, 2015 19:39
A tool for formalizing street names, provided as a Sublime Text 2 plugin and a JavaScript function.

Converts US- and Canada-style street names into their formal representation. For instance:

  • Main St. => Main Street
  • 9th Av => Ninth Avenue
  • W47 St => West 47th Street

Sublime Text 2 Package

Works for one or more selections. For multi-line selections, it treats every line as a separate street name. See installation instructions in the comments.
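Line-by-line handling matters because several rules are anchored with `^` or `$`, which only match at the ends of the string unless `re.MULTILINE` is set. The sketch below shows the split, convert, and join steps. `formalize_line` is a hypothetical stand-in that only knows one rule.

```python
import re


def formalize_line(line):
    # Hypothetical stand-in for the full rule list:
    # expands only a trailing "St" or "St."
    return re.sub(r'(?i)\bst\b\.?$', 'Street', line)


def formalize_block(text):
    # Treat every line of a multi-line selection as its own street name.
    return "\n".join(formalize_line(line) for line in text.split("\n"))


selection = "Main St\nElm St."
print(formalize_block(selection))  # both lines are converted
print(formalize_line(selection))   # only the last line is converted
```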

JavaScript

The JavaScript function converts one address at a time. It might be useful when combined with the Address Splitter. There are some tests in a separate file.

Caveats

Numbered avenues, drives, boulevards and roads are given spelled-out ordinals (Seventh Avenue) up to Twelfth. Streets, places, lanes and other types are given numeric names with an ordinal suffix (7th Street).
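The same rule can be restated without regular expressions. This is only a sketch of the rule as described, not how the plugin implements it. The names `SPELLED`, `SPELLED_TYPES`, `ordinal_suffix` and `numbered_name` are assumptions made for illustration.

```python
SPELLED = ['First', 'Second', 'Third', 'Fourth', 'Fifth', 'Sixth',
           'Seventh', 'Eighth', 'Ninth', 'Tenth', 'Eleventh', 'Twelfth']

# Street types whose low numbers are spelled out.
SPELLED_TYPES = {'Avenue', 'Drive', 'Boulevard', 'Road'}


def ordinal_suffix(n):
    # Numbers ending in 11, 12 or 13 always take "th".
    if n % 100 in (11, 12, 13):
        return 'th'
    return {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')


def numbered_name(n, street_type):
    # Spell out 1 through 12 for avenues, drives, boulevards and roads.
    if street_type in SPELLED_TYPES and 1 <= n <= 12:
        return '{0} {1}'.format(SPELLED[n - 1], street_type)
    # Everything else keeps its digits and gains an ordinal suffix.
    return '{0}{1} {2}'.format(n, ordinal_suffix(n), street_type)


print(numbered_name(7, 'Avenue'))  # Seventh Avenue
print(numbered_name(7, 'Street'))  # 7th Street
print(numbered_name(13, 'Road'))   # 13th Road
```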

Not every nonstandard street type abbreviation is understood, e.g. Riverside Boulvd => Riverside Boulvd (left unchanged).
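An unrecognized abbreviation passes through untouched because no pattern matches it. The sketch below compares the boulevard pattern from the rule list with a hypothetical widened pattern. `BOULEVARD_EXTENDED` is not part of the tool and is shown only as one way a local rule might be added.

```python
import re

# Pattern as it appears in the rule list: matches Bl, Blvd, Blvd.
BOULEVARD = r'(?i)\bbl(vd)?\b\.?'

# Hypothetical local extension that also accepts Boul and Boulvd.
BOULEVARD_EXTENDED = r'(?i)\bb(ou)?l(vd)?\b\.?'

print(re.sub(BOULEVARD, 'Boulevard', 'Riverside Boulvd'))
# Riverside Boulvd (unchanged)
print(re.sub(BOULEVARD_EXTENDED, 'Boulevard', 'Riverside Boulvd'))
# Riverside Boulevard
```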

A trailing directional letter is treated as part of the street name, as with Park Avenue South in Manhattan. This may be incorrect for addresses where the trailing directional is a quadrant indicator and the custom is to leave it abbreviated (Park Avenue S).
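The trailing-directional behavior can be seen in isolation with three rules taken from the Python plugin's list. The negative lookbehind keeps a bare "Avenue N" (a lettered avenue) from being expanded. The wrapper name `expand_trailing_direction` is an assumption for this sketch.

```python
import re

SUFFIX_RULES = [
    (r'(?i)\bave?\b\.?', 'Avenue'),
    # Expand a trailing N or S unless the whole name so far is "Avenue".
    (r'(?i)(?<!^Avenue) N\b[.,]?', ' North'),
    (r'(?i)(?<!^Avenue) S\b[.,]?', ' South'),
]


def expand_trailing_direction(street):
    for pattern, replacement in SUFFIX_RULES:
        street = re.sub(pattern, replacement, street)
    return street


print(expand_trailing_direction('Park Ave S'))      # Park Avenue South
print(expand_trailing_direction('Union Square N'))  # Union Square North
print(expand_trailing_direction('Ave N'))           # Avenue N
```

For quadrant-style addresses, these directional rules would need to be skipped or removed.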

While suffixes and prefixes (e.g. "North", "Avenue") will be correctly capitalized, the capitalization of the street name itself won't be changed. It's difficult to know what to do with da Vinci St. or DeKalb Ave.
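The suffix comes out capitalized because the replacement string is a literal, while the rest of the name is never touched. The sketch below uses two patterns from the rule list and contrasts them with Python's `str.title()`, which would damage both example names. `expand_type` is a hypothetical helper.

```python
import re


def expand_type(street):
    # Case-insensitive match, literal replacement: only the suffix changes.
    street = re.sub(r'(?i)\bave?\b\.?', 'Avenue', street)
    return re.sub(r'(?i)\bst\b\.?$', 'Street', street)


print(expand_type('da Vinci st.'))  # da Vinci Street
print(expand_type('DeKalb AVE'))    # DeKalb Avenue

# Blanket title-casing loses the intended capitalization.
print('da Vinci St.'.title())       # Da Vinci St.
print('DeKalb Ave'.title())         # Dekalb Ave
```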

function formalizeStreet(street) {
var streetPatterns = [
// Full Street Extension
// See below for Street, Drive and Lane
[/\baly\b\.?/gi, 'Alley'],
[/\bave?\b\.?/gi, 'Avenue'],
[/\bfr?wy\b\.?/gi, 'Freeway'],
[/\brd\b\.?/gi, 'Road'],
[/\bbl(vd)?\b\.?/gi, 'Boulevard'],
[/\bte?rr?\b\.?/gi, 'Terrace'],
[/\bpk\b\.?/gi, 'Park'],
[/\bplz\b\.?/gi, 'Plaza'],
[/\bpl\b\.?/gi, 'Place'],
[/\bln\b\.?/gi, 'Lane'],
[/\bexp?w?y\b\.?/gi, 'Expressway'],
[/\bpkw?y\b\.?/gi, 'Parkway'],
[/\btr?n?p(ke)?\b\.?/gi, 'Turnpike'],
[/\bhwy?\b\.?/gi, 'Highway'],
[/\bcir\b\.?/gi, 'Circle'],
[/\bsq\b\.?/gi, 'Square'],
[/\bbrg?\b\.?/gi, 'Bridge'],
[/\bs\.?\/?r\b\.?/gi, 'Service Road'],
[/\bwy\b\.?/gi, 'Way'],
// Directional Suffix
// Not to match if the only thing before is "Avenue"
// Browsers treat this pattern differently: /(^foo)? bar/
// (Safari and Firefox get it wrong, Chrome gets it right),
// hence the rewrite with a simple match of "Avenue"
[/(Avenue)? N\b[.,]?/gi, function(m, p) { return p ? m : ' North'; }],
[/(Avenue)? S\b[.,]?/gi, function(m, p) { return p ? m : ' South'; }],
[/(Avenue)? E\b[.,]?/gi, function(m, p) { return p ? m : ' East'; }],
[/(Avenue)? W\b[.,]?/gi, function(m, p) { return p ? m : ' West'; }],
// Workaround for the above bug: add rules for when we actually have "Park Avenue S"
// The leading "." requires a character before " Avenue", which excludes a bare "Avenue N"
[/(. Avenue) N$/gi, function(m, p) { return p ? p + ' North' : m; }],
[/(. Avenue) S$/gi, function(m, p) { return p ? p + ' South' : m; }],
[/(. Avenue) E$/gi, function(m, p) { return p ? p + ' East' : m; }],
[/(. Avenue) W$/gi, function(m, p) { return p ? p + ' West' : m; }],
// Special rules for street and drive, which are also abbrvs for honorifics
// And La, which may occur in names (e.g. La Guardia Place)
// Only replaced if at the end of the input, or followed by a directional suffix
// e.g. "Example Dr N." => "Example Drive North"
[/\bst\b\.?($| North| South| West| East| NW| NE| SE| SW)/gi, 'Street$1'],
[/\bdr\b\.?($| North| South| West| East| NW| NE| SE| SW)/gi, 'Drive$1'],
[/\bla\b\.?($| North| South| West| East| NW| NE| SE| SW)/gi, 'Lane$1'],
// Directional prefix without a space for numbered streets
// Sadly common in NYC, e.g. W42 St
[/^N\.?(?=\d)/gi, 'North '],
[/^S\.?(?=\d)/gi, 'South '],
[/^E\.?(?=\d)/gi, 'East '],
[/^W\.?(?=\d)/gi, 'West '],
// Directional prefix with leading space
// Washington DC exception e.g "N Street"
[/^N\b[\.,]?(?! Str)/gi, 'North'],
[/^S\b[\.,]?(?! Str)/gi, 'South'],
[/^E\b[\.,]?(?! Str)/gi, 'East'],
[/^W\b[\.,]?(?! Str)/gi, 'West'],
// Spelling out avenues, drives, roads and boulevards
[/\b1(st)?(?= (Aven|Driv|Road|Boul))/gi, 'First'],
[/\b2(nd)?(?= (Aven|Driv|Road|Boul))/gi, 'Second'],
[/\b3(rd)?(?= (Aven|Driv|Road|Boul))/gi, 'Third'],
[/\b4(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Fourth'],
[/\b5(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Fifth'],
[/\b6(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Sixth'],
[/\b7(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Seventh'],
[/\b8(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Eighth'],
[/\b9(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Ninth'],
[/\b10(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Tenth'],
[/\b11(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Eleventh'],
[/\b12(th)?(?= (Aven|Driv|Road|Boul))/gi, 'Twelfth'],
// Adding numeral suffixes to streets and places
[/(11|12|13)?\b/gi, function(m, p) { return p ? m + 'th' : m; }],
[/(1)?\b/gi, function(m, p) { return p ? m + 'st' : m; }],
[/(2)?\b/gi, function(m, p) { return p ? m + 'nd' : m; }],
[/(3)?\b/gi, function(m, p) { return p ? m + 'rd' : m; }],
[/([0456789])?\b/gi, function(m, p) { return p ? m + 'th' : m; }],
[/Ext\b[.,]?$/gi, 'Extension']
];
for (var i = 0; i < streetPatterns.length; i++) {
    street = street.replace(streetPatterns[i][0], streetPatterns[i][1]);
}
return street;
}
import sublime
import sublime_plugin
import re
"""
Add this file to your Sublime Text 2 User folder, [...]/Sublime Text 2/Packages/User.
To use in the command palette, add the following line to the user.sublime-commands file in the same folder:
{ "caption": "Formalize Street Names", "command": "formalize_street_names" }
"""
street_patterns = [
# Full Street Extension
# See below for Street and Drive and Lane
(r'(?i)\baly\b\.?', 'Alley'),
(r'(?i)\bave?\b\.?', 'Avenue'),
(r'(?i)\brd\b\.?', 'Road'),
(r'(?i)\bbl(vd)?\b\.?', 'Boulevard'),
(r'(?i)\bte?rr?\b\.?', 'Terrace'),
(r'(?i)\bplz\b\.?', 'Plaza'),
(r'(?i)\bpl\b\.?', 'Place'),
(r'(?i)\bpk\b\.?', 'Park'),
(r'(?i)\bln\b\.?', 'Lane'),
(r'(?i)\bexp?w?y\b\.?', 'Expressway'),
(r'(?i)\bfr?wy\b\.?', 'Freeway'),
(r'(?i)\bpkw?y\b\.?', 'Parkway'),
(r'(?i)\btr?n?p(ke)?\b\.?', 'Turnpike'),
(r'(?i)\bhwy?\b\.?', 'Highway'),
(r'(?i)\bcir\b\.?', 'Circle'),
(r'(?i)\bsq\b\.?', 'Square'),
(r'(?i)\bbrg?\b\.?', 'Bridge'),
(r'(?i)\bs/?r\b\.?', 'Service Road'),
(r'(?i)\bwy\b\.?', 'Way'),
# Directional Suffix
# Not to match if the only thing before is "Avenue"
(r'(?i)(?<!^Avenue) N\b[\.,]?', ' North'),
(r'(?i)(?<!^Avenue) S\b[\.,]?', ' South'),
(r'(?i)(?<!^Avenue) E\b[\.,]?', ' East'),
(r'(?i)(?<!^Avenue) W\b[\.,]?', ' West'),
# Special rules for street and drive, which are also abbrvs for honorifics
# And La, which may occur in names (e.g. La Guardia Place)
# Only replaced if at the end of the input, or followed by a directional suffix
# e.g. "Example Dr N." => "Example Drive North"
(r'(?i)\bst\b\.?($| North| South| West| East| NW| NE| SE| SW)', r'Street\1'),
(r'(?i)\bdr\b\.?($| North| South| West| East| NW| NE| SE| SW)', r'Drive\1'),
(r'(?i)\bla\b\.?($| North| South| West| East| NW| NE| SE| SW)', r'Lane\1'),
# Directional prefix without a space for numbered streets
# Sadly common in NYC, e.g. W42 St
(r'(?i)^N\.?(?=\d)', 'North '),
(r'(?i)^S\.?(?=\d)', 'South '),
(r'(?i)^E\.?(?=\d)', 'East '),
(r'(?i)^W\.?(?=\d)', 'West '),
# Directional prefix
# Washington DC exception e.g "N Street"
(r'(?i)^N\b[\.,]?(?! Str)', 'North'),
(r'(?i)^S\b[\.,]?(?! Str)', 'South'),
(r'(?i)^E\b[\.,]?(?! Str)', 'East'),
(r'(?i)^W\b[\.,]?(?! Str)', 'West'),
# Spelling out avenues, drives, roads and boulevards
(r'(?i)\b1(st)?(?= (Aven|Driv|Road|Boul))', 'First'),
(r'(?i)\b2(nd)?(?= (Aven|Driv|Road|Boul))', 'Second'),
(r'(?i)\b3(rd)?(?= (Aven|Driv|Road|Boul))', 'Third'),
(r'(?i)\b4(th)?(?= (Aven|Driv|Road|Boul))', 'Fourth'),
(r'(?i)\b5(th)?(?= (Aven|Driv|Road|Boul))', 'Fifth'),
(r'(?i)\b6(th)?(?= (Aven|Driv|Road|Boul))', 'Sixth'),
(r'(?i)\b7(th)?(?= (Aven|Driv|Road|Boul))', 'Seventh'),
(r'(?i)\b8(th)?(?= (Aven|Driv|Road|Boul))', 'Eighth'),
(r'(?i)\b9(th)?(?= (Aven|Driv|Road|Boul))', 'Ninth'),
(r'(?i)\b10(th)?(?= (Aven|Driv|Road|Boul))', 'Tenth'),
(r'(?i)\b11(th)?(?= (Aven|Driv|Road|Boul))', 'Eleventh'),
(r'(?i)\b12(th)?(?= (Aven|Driv|Road|Boul))', 'Twelfth'),
# Adding numeral suffixes to streets and places
(r'(?<=(11|12|13))(?= (Str|Ter|Lan|Pla|Ave|Dri|Roa|Bou))', 'th'),
(r'(?<=1)(?= (Str|Ter|Lan|Pla|Ave|Dri|Roa|Bou))', 'st'),
(r'(?<=2)(?= (Str|Ter|Lan|Pla|Ave|Dri|Roa|Bou))', 'nd'),
(r'(?<=3)(?= (Str|Ter|Lan|Pla|Ave|Dri|Roa|Bou))', 'rd'),
(r'(?<=[0456789])(?= (Str|Ter|Ave|Roa|Dri|Bou|Pla|Ter|Lan))', 'th'),
(r'Ext\b[.,]?$', 'Extension')
]
def formalize(street):
    # Apply each rule in order; later rules rely on the output of earlier ones
    for pattern, replace in street_patterns:
        street = re.sub(pattern, replace, street)
    return street


class FormalizeStreetNamesCommand(sublime_plugin.TextCommand):
    'formalize street names'

    def run(self, edit):
        # If nothing is selected, select the whole buffer
        if self.view.sel()[0].empty():
            self.view.sel().add(sublime.Region(0, self.view.size()))
        for sel in self.view.sel():
            newlines = []
            # Split the text of the selection into lines
            lines = self.view.substr(sel).split("\n")
            for line in lines:
                newlines.append(formalize(line))
            text = "\n".join(newlines)
            self.view.replace(edit, sel, text)
        print "Street Name Formalizer looked at", len(self.view.sel()), "selections"
var cases = [
["Main St", "Main Street"],
["Main St.", "Main Street"],
['Ave N', 'Avenue N'],
['Dr. Martin Luther King, Jr. Dr', 'Dr. Martin Luther King, Jr. Drive'],
['45 St', '45th Street'],
['W 89 Street', 'West 89th Street'],
['W. 89 Street', 'West 89th Street'],
['9 Av', 'Ninth Avenue'],
['9 Blvd', 'Ninth Boulevard'],
['9 Rd', 'Ninth Road'],
['8th Rd', 'Eighth Road'],
['13th Rd', '13th Road'],
['9 Dr', 'Ninth Drive'],
['N 12 St', 'North 12th Street'],
['N. 12 St', 'North 12th Street'],
['St. Johns St.', 'St. Johns Street'],
['Example St. N', 'Example Street North'],
['Main Dr. S', 'Main Drive South'],
['W42 St', 'West 42nd Street'],
['N Ten Eyck Blvd', 'North Ten Eyck Boulevard'],
['Union Square N', 'Union Square North'],
['Washington Sq W', 'Washington Square West'],
['6 Av S', 'Sixth Avenue South'],
['Ave N', 'Avenue N'],
['Park Ave S', 'Park Avenue South'],
['Flatbush Ave Ext', 'Flatbush Avenue Extension'],
['N St. NW', 'N Street NW'],
['Main Blvd', 'Main Boulevard'],
['Butternut Ridge Rd', 'Butternut Ridge Road'],
['Main Expy S/R', 'Main Expressway Service Road'],
['Main Frwy sr', 'Main Freeway Service Road'],
['Lincoln Fwy', 'Lincoln Freeway'],
['Main Hwy', 'Main Highway'],
['Main Hwy N.', 'Main Highway North'],
['1 St', '1st Street'],
['2 St', '2nd Street'],
['3 St', '3rd Street'],
['13 St', '13th Street'],
['229 St', '229th Street'],
['229 Dr', '229th Drive'],
['3 Dr', 'Third Drive'],
['11 Dr', 'Eleventh Drive'],
['Elm Cir', 'Elm Circle'],
['1 La N', '1st Lane North'],
['Elm Terr', 'Elm Terrace'],
['Brooklyn Br', 'Brooklyn Bridge'],
['Brooklyn Brg', 'Brooklyn Bridge'],
['Elm La', 'Elm Lane'],
['9 Pl', '9th Place'],
['Plum 2 St', 'Plum 2nd Street'],
['La Guardia Pl', 'La Guardia Place'],
['12 Av', 'Twelfth Avenue'],
['13 Av', '13th Avenue'],
['Washington Pk', 'Washington Park'],
['Elfreths Aly', 'Elfreths Alley'],
['Ave of the Americas', 'Avenue of the Americas']
];
function tester(fn, tests) {
    var output, passed = 0, failed = 0;
    for (var i = 0; i < tests.length; i++) {
        output = fn(tests[i][0]);
        if (output === tests[i][1]) {
            passed++;
            //console.log('passed', tests[i][0]);
        } else {
            failed++;
            console.log('failed', tests[i][0], '=>', output);
        }
    }
    console.log('passed', passed);
    console.log('failed', failed);
}