UDL comment that requires separators
-
I have a language that uses /* and */ to open and close comments, but only when they are not “glued” to anything else, i.e. the comments require separators.
Is there any way to implement this, perhaps using several delimiters? Or, alternatively, is there an implementation that would specifically highlight broken comments for fixing?
-
@G-A,
Sorry, unlike the “Operators” and “Folders” categories (which have variants for “separators required” and not), the “Comments” and “Delimiters” categories (which are the two categories most often used for code-comments) do not have a separators-required variant or option.
The Enhance Any Lexer plugin allows you to provide regular expressions (regex) that add custom foreground coloring. With that, you could write a regex that matches a valid comment and turns it green and, if you wanted, one or more separate regexes that turn invalid comments red:
[MyUDLName]
; color each word, 0x66ad1 is the color used, see the description above for more information on the color coding.
0x00FF00 = (?-s)(^|\s+)/\*\s+.+\s+\*/(\s+|$)
0x0000FF = (?-s)(?:[^\s\A])\K/\*.*?\*/
0x0000CC = (?-s)/\*\S.+\*/
0xCC00CC = (?-s)/\*.+\S\*/
0xFF00FF = (?-s)/\*.+\*/\S
; check in the respective styler xml if the following IDs are valid
excluded_styles = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20,21,22,23
Rather than introducing lots of alternations or getting clever to build a single “red/invalid” regex, I just wrote a separate regex for each of the four borders that could be missing a space/separator, to make it more readable. I also used four different red/purple colors, so you can tell which regex flagged each of the bad comment types that I interpreted from your description.
Here’s what it looked like for me with the above definition:
using this “code”:
/* should be ok at beginning of file */
ValidCode /* Valid Comment */ /* also ok */
ValidCode /*InValid Comment */ /*also wrong */
ValidCode /* InValid Comment*/ /* wrong at end*/
ValidCode/* InValid Comment */ /* wrong after end */BecauseNoSpaceBeforeHere
/* also ok at EOF */
-----
update: to not highlight the B in BecauseNoSpaceBeforeHere, change the last regex to
0xFF00FF = (?-s)/\*.+\*/(?=\S)
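For anyone who wants to see the effect of that change outside Notepad++, here is a minimal sketch using Python's re module. That is an assumption for illustration only: Notepad++ uses the Boost engine, and the (?-s) prefix is dropped here because Python's dot already excludes newlines by default. The variable names are made up for the example.

```python
import re

line = "/* wrong after end */BecauseNoSpaceBeforeHere"

# \S consumes the character after the comment, so it becomes part of the match
# (and would therefore be colored along with the comment).
consuming = re.search(r"/\*.+\*/\S", line).group(0)

# (?=\S) only checks that a non-space character follows, without consuming it.
lookahead = re.search(r"/\*.+\*/(?=\S)", line).group(0)

print(repr(consuming))
print(repr(lookahead))
```

The first match ends with the B, the second stops at the closing */.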
If you aren’t a regex expert:
-
@PeterJones Thank you for your answer and example. Seems to work pretty well.
Not being a regex expert, what would I need to do so that multi-line comments are supported too?
-
@G-A said in UDL comment that requires separators:
Not being a regex expert, what would I need to do so that multi-line comments are supported too?
The (?-s) turns off the dot-matches-newline feature of regex; for multi-line comments to work, you’d need to turn it on instead, with (?s) at the beginning of each of the regexes shown.
edit: but if you do that, you might need to make all the . matches “non-greedy” (not tested):
0x00FF00 = (?s)(^|\s+)/\*\s+.+?\s+\*/(\s+|$)
0x0000FF = (?s)(?:[^\s\A])\K/\*.*?\*/
0x0000CC = (?s)/\*\S.+?\*/
0xCC00CC = (?s)/\*.+?\S\*/
0xFF00FF = (?s)/\*.+?\*/\S
edit 2:
… Hmm … when I don’t make it non-greedy, it doesn’t work; but if I make it non-greedy, it doesn’t work either.
After some experimenting, replace the .+? or .*? with ((?!\*/).)+? or ((?!\*/).)*?; that way, it matches any character unless it’s the start of the comment-ending */
0x00FF00 = (?s)(^|\s+)/\*\s+((?!\*/).)*?\s+\*/(\s+|$)
0x0000FF = (?s)(?:[^\s\A])\K/\*((?!\*/).)*?\*/
0x0000CC = (?s)/\*\S((?!\*/).)*?\*/
0xCC00CC = (?s)/\*((?!\*/).)*?\S\*/
0xFF00FF = (?s)/\*((?!\*/).)*?\*/(?=\S)
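To illustrate why the non-greedy dot alone was not enough, here is a small sketch comparing the two “valid comment” regexes in Python's re module. This is an assumption for illustration: Notepad++ uses Boost, where ^ also matches at line starts by default, while Python's ^ only matches at the start of the string unless MULTILINE is set. The helper name matched_spans and the sample strings are hypothetical.

```python
import re

# "Valid comment" regex with a plain non-greedy dot (the untested first edit).
LAZY_DOT = re.compile(r"(?s)(^|\s+)/\*\s+.+?\s+\*/(\s+|$)")

# Same regex, but each character is only accepted if it does not start "*/".
TEMPERED = re.compile(r"(?s)(^|\s+)/\*\s+((?!\*/).)*?\s+\*/(\s+|$)")


def matched_spans(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


# First comment is invalid (no space before its "*/"), second one is valid.
sample = "/* a*/ x /* b */"

# The lazy dot keeps expanding past the first "*/" until the rest can match,
# so it swallows both comments and the code in between.
print(matched_spans(LAZY_DOT, sample))

# The tempered dot cannot cross "*/", so only the valid comment matches.
print(matched_spans(TEMPERED, sample))

# With (?s), the tempered version also spans line breaks inside a comment.
multiline = "code /* first line\nsecond line */ more"
print(matched_spans(TEMPERED, multiline))
```

Non-greedy only means “try the shortest match first”; the engine will still extend the dot across a */ whenever that is the only way for the rest of the regex to succeed.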
-
@PeterJones said in UDL comment that requires separators:
you might need to make all the . matches “non-greedy” (not tested)
DEFINITELY!
Here’s the difference (regexes are my own, for illustrative purposes; colors are also not intended to match OP desire – again, my own, to illustrate):
Greedy (?s)/\*.+\*/ has a single match (the yellow) on the whole thing:
Non-greedy (?s)/\*.+?\*/ gives separate matches (shown in alternating colors):
-
@Alan-Kilborn said,
DEFINITELY!
Yes, I discovered that when I started my testing after the first edit. I started the testing when I realized that non-greedy alone probably wasn’t sufficient, either. Hence the second edit, with the tested regex and screenshot.
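The greedy versus non-greedy difference discussed above can be reproduced with a short sketch in Python's re module (an assumption for illustration; the thread itself is about the Boost engine in Notepad++, but the two quantifiers behave the same way here). The sample text is made up.

```python
import re

text = "a /* one */ b /* two */ c"

# Greedy: .+ runs to the end of the text and backs up to the LAST "*/".
greedy = re.findall(r"(?s)/\*.+\*/", text)

# Non-greedy: .+? stops at the FIRST "*/" that lets the match succeed.
lazy = re.findall(r"(?s)/\*.+?\*/", text)

print(greedy)
print(lazy)
```

The greedy version returns one match covering both comments and the code between them; the non-greedy version returns the two comments separately.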
-
Since we’re still debating, here’s what I came up with:
[foobar]
; valid comments are green
0x229f22 = (?<!\S)/\*(?:(?!\*/)[\s\S])*\*/(?!\S)
; invalid comments (no space before or after) are orange
0x0088ff = \S/\*(?:(?!\*/)[\s\S])*\*/\S
; invalid comments (no space before) are red
0x0000ff = \S/\*(?:(?!\*/)[\s\S])*\*/(?!\S)
; invalid comments (no space after) are purple
0xff00ff = (?<!\S)/\*(?:(?!\*/)[\s\S])*\*/\S
; check in the respective styler xml if the following IDs are valid
excluded_styles = 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,20,21,22,23
text is:
/*start of file is ok*/
foo /* bar */baz /* no sep after, bad */
foo/* bar */ baz /* no sep before, bad */
foo /* bar */ baz /* both seps, good */
foo /**/ baz /* both seps, empty, good */
foo /**/baz /* no sep after, empty, bad */
foo/**/ baz /* no sep before, empty, bad */
foo/**/baz /* no sep before or after, bad */
/* you need EnhanceAnyLexer plugin to see pretty colors */
/* */ro
/*eof is ok*/
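As a rough way to check the four regexes above without the plugin, here is a sketch that runs them through Python's re module and reports which rule fires on a line. The names BODY, RULES and classify are hypothetical, and this is only the “separators outside the comment” interpretation used in this post; Python's re is assumed to treat these lookarounds the same way Boost does.

```python
import re

# Shared comment body: "/*", then any characters that do not start "*/", then "*/".
BODY = r"/\*(?:(?!\*/)[\s\S])*\*/"

# One (label, regex) pair per color rule from the config above.
RULES = [
    ("valid", re.compile(r"(?<!\S)" + BODY + r"(?!\S)")),
    ("no separator before or after", re.compile(r"\S" + BODY + r"\S")),
    ("no separator before", re.compile(r"\S" + BODY + r"(?!\S)")),
    ("no separator after", re.compile(r"(?<!\S)" + BODY + r"\S")),
]


def classify(line):
    """Return the labels of every rule that matches somewhere in line."""
    return [label for label, rx in RULES if rx.search(line)]


for sample in ["foo /* bar */ baz", "foo/* bar */ baz",
               "foo /* bar */baz", "foo/**/baz", "/*start of file is ok*/"]:
    print(sample, "->", classify(sample))
```

Because the negative lookarounds (?<!\S) and (?!\S) also succeed at the very start and end of the text, a comment at the beginning or end of the file counts as valid without any special casing.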
While we’re at it, find/replace (\S)?(/\*(?:(?!\*/)[\s\S])*\*/)(\S)? with (?1\1\x20)\2(?3\x20\3) will fix all errors.
-
@Mark-Olson said in UDL comment that requires separators:
Since we’re still debating, here’s what I came up with:
I guess it depends on the OP’s language, whether the comment begin/end only need spaces outside the comment (like you interpreted), or whether they also need spaces inside the comment (like I interpreted).
Why do you use [\s\S] instead of . for “match any character”? My guess is that you want to be able to not think about .-matches-newline … But it’s longer ((?s) and . is only 5 characters, compared to [\s\S] at 6 characters), and (IMO) it doesn’t convey the meaning as well, because when looking at a . in a regex, I immediately see “any character”, whereas if I see [\s\S] I have to think to myself “match any character that’s either a space character or isn’t a space character; gee, why not use a . instead?” every time I try to read that idiom.
-
In my specific case, the /* or */ need to be separated both inside and outside the comment from any other characters in order to function as a comment.
In any case, appreciate all the responses. This has been very helpful.
-
I finally came up with a regex-replace that adds buffer spaces internally as well as externally, if you are interested.
Replace
(?s)(\S)?/\*((?!\*/)\S)?((?:(?!\S?\*/).)+?)?((?!\*/)\S)?\*/(\S)?
with
(?1\1\x20)/*(?2\x20\2)(?3\3:\x20)(?4\4\x20)*/(?5\x20\5)
Just for the record, a lot of the complexity of that regex-replace is due to dealing with the silly corner case of replacing /**/ with /* */. You could absolutely come up with something simpler if you didn’t care about that.
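If you would rather do the same repair in a script, here is a hedged sketch in Python. Python's re module does not support the conditional replacement syntax ((?1...), (?3\3:\x20)) used above, so this version swaps in a callback function instead of translating the regex one to one. The names COMMENT and add_separators are hypothetical, and it has only been reasoned through on small inputs like the ones shown.

```python
import re

# "/*", then a captured body that never crosses "*/", then "*/".
COMMENT = re.compile(r"/\*((?:(?!\*/)[\s\S])*)\*/")


def add_separators(text):
    """Pad every /* ... */ comment with spaces inside and outside where missing."""

    def fix(m):
        body = m.group(1)
        if body == "":
            body = " "                      # corner case: /**/ becomes /* */
        else:
            if not body[0].isspace():
                body = " " + body           # space after the opening /*
            if not body[-1].isspace():
                body = body + " "           # space before the closing */
        start, end = m.span()
        # Add an outer space only when a non-space character touches the comment.
        left = " " if start > 0 and not text[start - 1].isspace() else ""
        right = " " if end < len(text) and not text[end].isspace() else ""
        return left + "/*" + body + "*/" + right

    return COMMENT.sub(fix, text)


print(add_separators("foo/**/baz"))
print(add_separators("/*start of file is ok*/"))
```

Checking the neighbouring characters in the callback, rather than capturing them with (\S)?, keeps the match limited to the comment itself; the trade-off is that two comments glued directly together end up with two spaces between them.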