Mozilla Rhino doesn’t use the facilities of
Bug with non-capturing groups
Here is a variation of bug 369860 occurring in String.replace.
The regular expression is genuine “real life” code: it is the one the Prototype library uses to strip scripts from the text of XMLHttpRequest responses.
public void testBuggyReplace() {
    final Context ctx = Context.enter();
    try {
        final ScriptableObject topScope = ctx.initStandardObjects();
        final String text = "<b>bla</b><script>alert(123);</script>bla";
        final String regex = "(?:<script.*?>)((\\n|\\r|.)*?)(?:<\\/script>)";
        final String expected = "<b>bla</b>bla";
        // Sanity check: java.util.regex performs the replacement correctly
        assertEquals(expected, text.replaceAll(regex, ""));
        topScope.put("str", topScope, regex);
        topScope.put("text", topScope, text);
        topScope.put("expected", topScope, expected);
        // The same replacement, this time through Rhino's own RegExp implementation
        final String script = "var re = new RegExp(str, 'img');\n"
            + "var s = text.replace(re, '');\n"
            + "if (s != expected)"
            + " throw 'Expected >' + expected + '< but got >' + s + '<';";
        ctx.evaluateString(topScope, script, "test", 0, null);
    } finally {
        Context.exit();
    }
}
Applying the same regular expression to larger texts shows how slow Rhino’s RegExp support is.
Performing the replacement on the text from the previous example repeated 100 times, I get the following on my desktop:
Pure Rhino: 25 ms
String.replace using java.util.regex: 7 ms
If I repeat the text 1000 times, the gap becomes even wider:
Pure Rhino: 440 ms
String.replace using java.util.regex: 15 ms
Quite an impressive difference!
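The java.util.regex side of this measurement can be sketched as follows. This is only a rough, hypothetical harness and not the code used to produce the numbers above: the class name ScriptStripTiming and the helpers repeat and stripScripts are invented for illustration, the flags are chosen to mirror the JavaScript 'i' and 'm' flags, and a single un-warmed run like this gives only an order of magnitude.

```java
import java.util.regex.Pattern;

public class ScriptStripTiming {
    // Same expression as in the test above; CASE_INSENSITIVE and MULTILINE
    // stand in for the JavaScript 'i' and 'm' flags (an assumption of this sketch).
    static final Pattern SCRIPT = Pattern.compile(
        "(?:<script.*?>)((\\n|\\r|.)*?)(?:<\\/script>)",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Hypothetical helper: concatenates s with itself the given number of times.
    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder(s.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Removes every script element matched by the expression.
    static String stripScripts(String text) {
        return SCRIPT.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "<b>bla</b><script>alert(123);</script>bla";
        for (int n : new int[] {100, 1000}) {
            String input = repeat(text, n);
            long start = System.nanoTime();
            String result = stripScripts(input);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(n + " repetitions: " + elapsedMs + " ms, "
                + result.length() + " characters left");
        }
    }
}
```

Timings will vary with the machine and JVM, so the absolute values matter less than how they scale between 100 and 1000 repetitions.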
HtmlUnit’s first step to use java.util.regex based JS RegExp
Ideally the Rhino RegExp support should be rewritten to use java.util.regex. This will surely come in the future but it is not yet the case.
String functions that need regular expressions don’t use the RegExp functionality directly but go through a proxy that can be configured through ScriptRuntime.setRegExpProxy. This is what