Fix regexp not removing multi-line comments #88
The previous regular expression had two issues:
1. It could only remove single-line comments.
2. It was not ungreedy (lazy), so it could remove content between two comments. For example, when something like `<!-- comment --> content <!-- comment -->` contains no newline character, the content is removed as well.

We ran into this issue at https://gerrit.wikimedia.org/r/489323. As a temporary workaround, we made all our comments single-line comments and made sure each comment is on a separate line.
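Both issues can be reproduced in isolation. The snippet below is a minimal sketch that applies the old pattern from this diff to two made-up inputs; the variable names and sample strings are illustrative assumptions, not code from the repository.

```javascript
// The comment-stripping pattern as it was before this change.
const oldCommentRegex = /<!--.*-->/gi;

// Issue 1: `.` does not match newlines, so a multi-line comment is left in place.
const multiLineInput = "<svg><!-- first line\nsecond line --><path/></svg>";
const multiLineResult = multiLineInput.replace(oldCommentRegex, "");

// Issue 2: `.*` is greedy, so the match runs from the first `<!--`
// to the last `-->` on the same line and swallows everything in between.
const sameLineInput = "<!-- comment --> content <!-- comment -->";
const sameLineResult = sameLineInput.replace(oldCommentRegex, "");

console.log(multiLineResult === multiLineInput); // true: the comment was not removed
console.log(JSON.stringify(sameLineResult));     // "": the content is gone too
```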
 // SVG XML -> HTML5
-[/\<([A-Za-z]+)([^\>]*)\/\>/g, "<$1$2></$1>"], // convert self-closing XML SVG nodes to explicitly closed HTML5 SVG nodes
+[/\<([a-z]+)([^\>]*)\/\>/gi, "<$1$2></$1>"], // convert self-closing XML SVG nodes to explicitly closed HTML5 SVG nodes
In case anyone else is a little rusty on their RegExps, this is a readability improvement: drop `A-Z` and make `a-z` case-insensitive with the `i` flag.
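As a quick sanity check that this is purely a readability change, the sketch below runs both versions of the self-closing-tag pattern over a few made-up tags and compares the results. The sample tags and helper names are assumptions for illustration only.

```javascript
// Old and new versions of the self-closing-tag pattern from the diff.
const oldSelfClosing = /\<([A-Za-z]+)([^\>]*)\/\>/g;
const newSelfClosing = /\<([a-z]+)([^\>]*)\/\>/gi;
const replacement = "<$1$2></$1>";

// Hypothetical sample tags: lowercase, mixed case, and uppercase names.
const sampleTags = ['<path d="M0 0h24v24H0z"/>', '<linearGradient id="a"/>', "<G/>"];

const oldResults = sampleTags.map((tag) => tag.replace(oldSelfClosing, replacement));
const newResults = sampleTags.map((tag) => tag.replace(newSelfClosing, replacement));

console.log(newResults[0]); // <path d="M0 0h24v24H0z"></path>
console.log(oldResults.every((result, i) => result === newResults[i])); // true
```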
@@ -12,10 +12,10 @@ var regexSequences = [
 // Remove XML stuffs and comments
 [/<\?xml[\s\S]*?>/gi, ""],
 [/<!doctype[\s\S]*?>/gi, ""],
-[/<!--.*-->/gi, ""],
+[/<!--[\s\S]*?-->/g, ""],
Again, not sure if anyone else is rusty, but `.` does not match newlines. Newlines can be included with `(.|\n)`, but that adds a capturing group, so `[\s\S]` is used instead: it matches all whitespace and all non-whitespace characters, which together form the set of all characters. No casing specifier is needed, so the `i` flag is dropped. `*?` is like `*` but non-greedy.
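To make the points in this comment concrete, here is a small sketch that applies the new pattern to a made-up input and compares it with the `(.|\n)` alternative. The input string and variable names are illustrative assumptions.

```javascript
// The new pattern from the diff: any character, including newlines, matched lazily.
const fixedCommentRegex = /<!--[\s\S]*?-->/g;

// Hypothetical input with one multi-line comment and two comments on the same line.
const svgInput = "<svg><!-- first\nsecond --><g/><!-- a --> kept <!-- b --></svg>";
const cleaned = svgInput.replace(fixedCommentRegex, "");
console.log(cleaned); // <svg><g/> kept </svg>

// The `(.|\n)` alternative matches the same text but adds a capture group,
// so the match array has one extra entry.
const comment = "<!-- a\nb -->";
const classMatch = comment.match(/<!--[\s\S]*?-->/);
const groupMatch = comment.match(/<!--(.|\n)*?-->/);
console.log(classMatch[0] === groupMatch[0]);     // true
console.log(classMatch.length, groupMatch.length); // 1 2
```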