@slevithan
Created March 16, 2012 01:28
Cross-Browser Split
/*!
 * Cross-Browser Split 1.1.1
 * Copyright 2007-2012 Steven Levithan <stevenlevithan.com>
 * Available under the MIT License
 * ECMAScript compliant, uniform cross-browser split method
 */

/**
 * Splits a string into an array of strings using a regex or string separator. Matches of the
 * separator are not included in the result array. However, if `separator` is a regex that contains
 * capturing groups, backreferences are spliced into the result each time `separator` is matched.
 * Fixes browser bugs compared to the native `String.prototype.split` and can be used reliably
 * cross-browser.
 * @param {String} str String to split.
 * @param {RegExp|String} separator Regex or string to use for separating the string.
 * @param {Number} [limit] Maximum number of items to include in the result array.
 * @returns {Array} Array of substrings.
 * @example
 *
 * // Basic use
 * split('a b c d', ' ');
 * // -> ['a', 'b', 'c', 'd']
 *
 * // With limit
 * split('a b c d', ' ', 2);
 * // -> ['a', 'b']
 *
 * // Backreferences in result array
 * split('..word1 word2..', /([a-z]+)(\d+)/i);
 * // -> ['..', 'word', '1', ' ', 'word', '2', '..']
 */
var split;

// Avoid running twice; that would break the `nativeSplit` reference
split = split || function (undef) {

    var nativeSplit = String.prototype.split,
        compliantExecNpcg = /()??/.exec("")[1] === undef, // NPCG: nonparticipating capturing group
        self;

    self = function (str, separator, limit) {
        // If `separator` is not a regex, use `nativeSplit`
        if (Object.prototype.toString.call(separator) !== "[object RegExp]") {
            return nativeSplit.call(str, separator, limit);
        }
        var output = [],
            flags = (separator.ignoreCase ? "i" : "") +
                    (separator.multiline  ? "m" : "") +
                    (separator.extended   ? "x" : "") + // Proposed for ES6
                    (separator.sticky     ? "y" : ""), // Firefox 3+
            lastLastIndex = 0,
            // Make `global` and avoid `lastIndex` issues by working with a copy
            separator = new RegExp(separator.source, flags + "g"),
            separator2, match, lastIndex, lastLength;
        str += ""; // Type-convert
        if (!compliantExecNpcg) {
            // Doesn't need flags gy, but they don't hurt
            separator2 = new RegExp("^" + separator.source + "$(?!\\s)", flags);
        }
        /* Values for `limit`, per the spec:
         * If undefined: 4294967295 // Math.pow(2, 32) - 1
         * If 0, Infinity, or NaN: 0
         * If positive number: limit = Math.floor(limit); if (limit > 4294967295) limit -= 4294967296;
         * If negative number: 4294967296 - Math.floor(Math.abs(limit))
         * If other: Type-convert, then use the above rules
         */
        limit = limit === undef ?
            -1 >>> 0 : // Math.pow(2, 32) - 1
            limit >>> 0; // ToUint32(limit)
        while (match = separator.exec(str)) {
            // `separator.lastIndex` is not reliable cross-browser
            lastIndex = match.index + match[0].length;
            if (lastIndex > lastLastIndex) {
                output.push(str.slice(lastLastIndex, match.index));
                // Fix browsers whose `exec` methods don't consistently return `undefined` for
                // nonparticipating capturing groups
                if (!compliantExecNpcg && match.length > 1) {
                    match[0].replace(separator2, function () {
                        for (var i = 1; i < arguments.length - 2; i++) {
                            if (arguments[i] === undef) {
                                match[i] = undef;
                            }
                        }
                    });
                }
                if (match.length > 1 && match.index < str.length) {
                    Array.prototype.push.apply(output, match.slice(1));
                }
                lastLength = match[0].length;
                lastLastIndex = lastIndex;
                if (output.length >= limit) {
                    break;
                }
            }
            if (separator.lastIndex === match.index) {
                separator.lastIndex++; // Avoid an infinite loop
            }
        }
        if (lastLastIndex === str.length) {
            if (lastLength || !separator.test("")) {
                output.push("");
            }
        } else {
            output.push(str.slice(lastLastIndex));
        }
        return output.length > limit ? output.slice(0, limit) : output;
    };

    // For convenience
    String.prototype.split = function (separator, limit) {
        return self(this, separator, limit);
    };

    return self;

}();
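
The `limit` handling in the code above relies on the unsigned right shift operator to perform the spec's ToUint32 conversion. As a rough illustration only, the sketch below isolates that one expression in a helper named `toSplitLimit`; the helper name is an assumption made for this example and is not part of the gist.

```javascript
// Hypothetical helper (not part of the gist): isolates the shim's `limit` normalization.
function toSplitLimit(limit) {
    // `x >>> 0` applies ToUint32: convert to a number, truncate toward zero,
    // then wrap modulo 2^32. An undefined limit means "no limit" (2^32 - 1).
    return limit === undefined ? -1 >>> 0 : limit >>> 0;
}

var limitExamples = {
    missing: toSplitLimit(undefined),   // 4294967295
    zero: toSplitLimit(0),              // 0
    notANumber: toSplitLimit(NaN),      // 0
    infinite: toSplitLimit(Infinity),   // 0
    fractional: toSplitLimit(2.9),      // 2
    negative: toSplitLimit(-1),         // 4294967295
    overflow: toSplitLimit(4294967297), // 1
    numericString: toSplitLimit("3")    // 3
};

console.log(limitExamples);
```

Because a negative limit wraps around to a very large unsigned value, passing `-1` behaves like passing no limit at all, which matches the table in the code comment.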

jdalton commented Mar 16, 2012

Here are a few feature tests I've used related to split issues (my shim was based on your earlier work):

  var STRING_SPLIT_RETURNS_UNDEFINED_VALUES_AS_STRINGS = (function() {
    // true for Firefox (the original comment just said Firefox so I am assuming at least < 4.0)
    var result = 'oxo'.split(/x(y)?/);
    return result.length == 3 && typeof result[1] == 'string';
  }());

  // for Chrome <= 8
  var STRING_SPLIT_ZERO_LENGTH_MATCH_RETURNS_NON_EMPTY_ARRAY = !!''.split(/^/).length;

  // true for IE (the original comment just said IE, so I am assuming at least < 9)
  var STRING_SPLIT_BUGGY_WITH_REGEXP = 'x'.split(/x/).length != 2 || 'oxo'.split(/x(y)?/).length != 3;
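
As a rough sketch, the three tests above can be gathered into a single report object. The function name `detectSplitBugs` and its property names are assumptions made for this illustration; only the probe expressions come from the comment above. On an engine whose native `split` follows the spec, all three flags should come out `false`.

```javascript
// Hypothetical wrapper (not from the comment above): runs the three probes in one place.
function detectSplitBugs() {
    var captured = 'oxo'.split(/x(y)?/);
    return {
        // A nonparticipating capturing group should yield undefined, not a string
        undefinedAsString: captured.length === 3 && typeof captured[1] === 'string',
        // Splitting an empty string on a pattern that matches it should yield []
        emptyInputNonEmptyResult: ''.split(/^/).length !== 0,
        // Empty leading/trailing strings and captured groups must not be dropped
        buggyWithRegExp: 'x'.split(/x/).length !== 2 || captured.length !== 3
    };
}

var splitBugReport = detectSplitBugs();
console.log(splitBugReport);
```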


slevithan commented Apr 3, 2012

Thanks for sharing, @jdalton.

Side note: It's interesting to compare this code to the shorter but equivalent implementation in XRegExp, to see how XRegExp's scaffolding simplifies things and smooths over cross-browser bugs, etc.

@slevithan

@Yaffle posted an alternate implementation here in response to this. Note that most of the difference in apparent size is due to the lack of comments and shorter variable names. I haven't tested it (in particular, it should be tested in older browsers), but it's a good source for comparison since the implementation is significantly different. It doesn't defer to the native String.prototype.split when the separator is not a RegExp object, so it will be slower in such cases.

@slevithan

This code was originally posted on my blog at http://blog.stevenlevithan.com/archives/cross-browser-split.


junaruga commented Aug 4, 2016

Hi,
I would like to propose a patch that overrides String#split only when the native String#split predates ES5 behavior.
The project below currently uses your split.js with its own modification:
https://github.com/lautis/uglifier/blob/master/lib/split.js

With this patch, the default behavior is the same as the current one.
Users can select the loading mode with the SPLIT_LOAD_FORCE flag.

Could you apply this patch?
If there is a problem with it, please let me know.

Thanks.

diff --git split.js split.js
index 43d13a2..6a8cddd 100644
--- split.js
+++ split.js
@@ -29,6 +29,7 @@
  * split('..word1 word2..', /([a-z]+)(\d+)/i);
  * // -> ['..', 'word', '1', ' ', 'word', '2', '..']
  */
+SPLIT_LOAD_FORCE = 1
 var split;

 // Avoid running twice; that would break the `nativeSplit` reference
@@ -107,9 +108,11 @@ split = split || function (undef) {
     };

     // For convenience
-    String.prototype.split = function (separator, limit) {
-        return self(this, separator, limit);
-    };
+    if (SPLIT_LOAD_FORCE || "\n".split(/\n/).length == 0) {
+        String.prototype.split = function (separator, limit) {
+            return self(this, separator, limit);
+        };
+    }

     return self;
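
The conditional install in the patch above can be sketched as a standalone function. This is only an illustration under stated assumptions: `nativeSplitDropsEmptyStrings`, `installSplit`, and the stand-in objects are hypothetical names, and the force flag is passed as a parameter instead of being read from a global. Only the `"\n".split(/\n/)` probe is taken from the patch. As posted, the patch sets `SPLIT_LOAD_FORCE = 1`, so the override is always installed unless the flag is changed.

```javascript
// Hypothetical helper: the probe is from the patch, the function name is not.
function nativeSplitDropsEmptyStrings() {
    // A compliant engine returns ['', '']; the patch treats a length of 0
    // as the sign of a native split that needs replacing.
    return "\n".split(/\n/).length === 0;
}

// Hypothetical installer: `target` stands in for String.prototype so the
// sketch can run without modifying built-ins.
function installSplit(target, shim, force) {
    if (force || nativeSplitDropsEmptyStrings()) {
        target.split = function (separator, limit) {
            return shim(this, separator, limit);
        };
        return true;
    }
    return false;
}

// Stand-in shim that simply defers to the native method
var standInShim = function (str, separator, limit) {
    return String(str).split(separator, limit);
};

var fakeProto = {};
var installedByDefault = installSplit(fakeProto, standInShim, false);
var installedWhenForced = installSplit(fakeProto, standInShim, true);

console.log(installedByDefault, installedWhenForced);
```

On an engine with a compliant native `split`, the unforced call is expected to leave the target untouched, while the forced call installs the override.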
