Perl language syntax highlighting troubles (bug or limitation ?)

112 Posts 6 Posters 44.9k Views
  • E
    Ekopalypse
    last edited by Mar 19, 2019, 10:42 PM

    What about using this
    (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\4)

    G 1 Reply Last reply Mar 19, 2019, 10:48 PM Reply Quote 0
    • E
      Ekopalypse
      last edited by Mar 19, 2019, 10:48 PM

      \3 is the boost::regex convention for referring to match group 3,
      and the list after the regex defines which match groups should actually be painted.

      Like if you have:

      r'(word1)(word2)(word3)', [2,3]
      

      would mean that only word2 and word3 would be painted
      whereas if you would specify

      r'(word1)(word2)(word3)', [0]
      

      everything would be colored.
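
To make the group list concrete, here is a minimal sketch. It uses Python's own `re` module rather than the boost::regex engine that Notepad++ provides, and `spans_to_paint` is a hypothetical helper, not part of the PythonScript plugin. It only reports which character ranges each listed group covers, with 0 standing for the whole match.

```python
import re

def spans_to_paint(pattern, text, groups):
    """Return the (start, end) span of each requested group for every match."""
    spans = []
    for m in re.finditer(pattern, text):
        for g in groups:
            # Group 0 is the whole match; 1..n are the parenthesised groups.
            if m.start(g) != -1:  # skip groups that did not take part in the match
                spans.append(m.span(g))
    return spans

text = 'word1word2word3'
pattern = r'(word1)(word2)(word3)'

only_2_and_3 = spans_to_paint(pattern, text, [2, 3])
everything = spans_to_paint(pattern, text, [0])
print(only_2_and_3)
print(everything)
```

With `[2, 3]` only the ranges of word2 and word3 come back; with `[0]` the single range covers the whole string.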

      Does this make sense to you?

      G 1 Reply Last reply Mar 19, 2019, 10:55 PM Reply Quote 0
      • G
        Gilles Maisonneuve @Ekopalypse
        last edited by Mar 19, 2019, 10:48 PM

        @Ekopalypse

        I don’t understand your regexp syntax. Perhaps it is too ‘pythonized’ for me.

        (?s) : what does it mean? Is it ‘s///’? Or is it really a non-capturing group of ‘s’?
        \3 \4 : are they $3 $4? I don’t think so, as I can’t see a 4th capture group.

        1 Reply Last reply Reply Quote 0
        • E
          Ekopalypse
          last edited by Ekopalypse Mar 19, 2019, 10:52 PM Mar 19, 2019, 10:51 PM

          (?s) is a modifier telling the engine that the dot also matches line endings,
          and yes, the engine uses \1 and $1.

          Here is the link to the documentation; maybe it is easier for you.

          1 Reply Last reply Reply Quote 0
          • E
            Ekopalypse
            last edited by Mar 19, 2019, 10:55 PM

            ooppps

            (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\3)

            :-D

            G 1 Reply Last reply Mar 19, 2019, 10:58 PM Reply Quote 0
              • G
                Gilles Maisonneuve @Ekopalypse
                last edited by Mar 19, 2019, 10:58 PM

                @Ekopalypse

                Ok
                another one: in Python you must say ["|'] instead of Perl ["'] (‘either one of the set’) ? Is that what it means ?

                E 1 Reply Last reply Mar 19, 2019, 11:19 PM Reply Quote 0
                • E
                  Ekopalypse
                  last edited by Mar 19, 2019, 10:59 PM

                  No, afaik non-capturing group is (?:pattern)
                  This, (?s), just tells the engine that the dot . is matching
                  EOLs like \r\n - if I’m right.
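
A small sketch of both points, written against Python's `re` module as a stand-in (the script in this thread runs on boost::regex, whose defaults may differ, especially around \r). It shows that `(?s)` changes what the dot matches and that `(?:...)` groups without creating a numbered group.

```python
import re

text = 'start\nend'

# Without (?s) the dot stops at the line break, so there is no match.
without_flag = re.search(r'start.end', text)

# With (?s) the dot also matches the newline.
with_flag = re.search(r'(?s)start.end', text)

# (?:...) groups without capturing: this pattern has zero numbered groups.
non_capturing = re.compile(r'(?:ab)+')

print(without_flag)
print(repr(with_flag.group(0)))
print(non_capturing.groups)
```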

                  1 Reply Last reply Reply Quote 1
                  • E
                    Ekopalypse
                    last edited by Ekopalypse Mar 19, 2019, 11:02 PM Mar 19, 2019, 11:01 PM

                    Just for clarification: the Python script does NOT use the Python regex engine; instead
                    it uses the one Notepad++ offers, boost::regex.
                    Yes, you can write the character list without the pipe, but the pipe sign makes it
                    more readable for me. Or is there a difference between using the pipe sign and leaving it out?

                    1 Reply Last reply Reply Quote 1
                    • E
                      Ekopalypse
                      last edited by Ekopalypse Mar 19, 2019, 11:15 PM Mar 19, 2019, 11:15 PM

                      or maybe this one might be even better
                      (?s)(<<)\h+(["'])(\w+?)\2\h*;.*?\3
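
As a rough check of what the two backreferences do in this version, the sketch below runs it through Python's `re` module. That is an assumption for illustration only: Python's `re` does not know `\h`, so `[ \t]` stands in for it, and the two sample strings are made up. `\2` forces the closing quote to be the same character as the opening one, and `\3` looks for the terminator word again.

```python
import re

# \h (horizontal whitespace) is not supported by Python's re, so [ \t] stands in.
pattern = re.compile(r'(?s)(<<)[ \t]+(["\'])(\w+?)\2[ \t]*;.*?\3')

good = 'print << "EOT";\nsome text\nEOT\n'
bad = 'print << "EOT\';\nsome text\nEOT\n'  # opening and closing quotes differ

m = pattern.search(good)
print(repr(m.group(0)))     # the heredoc from << up to the terminator
print(pattern.search(bad))  # no match: \2 requires the same quote character
```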

                      1 Reply Last reply Reply Quote 1
                      • G
                        Gilles Maisonneuve
                        last edited by Mar 19, 2019, 11:17 PM

                        Can’t reply what I wanted, a robot says I’m spamming…

                        G 1 Reply Last reply Mar 19, 2019, 11:22 PM Reply Quote 0
                        • E
                          Ekopalypse @Gilles Maisonneuve
                          last edited by Mar 19, 2019, 11:19 PM

                          @Gilles-Maisonneuve

                          Can’t reply what I wanted, a robot says I’m spamming…

                          I have no idea why this happens sometimes.

                          By the way, now that you have installed pythonscript plugin would you mind
                          clicking Plugins->Python Script->Scripts->Samples->RegexTester ?

                          I know not everyone is recommending it but, personally, I love it.

                          1 Reply Last reply Reply Quote 1
                          • G
                            Gilles Maisonneuve @Gilles Maisonneuve
                            last edited by Mar 19, 2019, 11:22 PM

                            AFAIK, at least in Perl, ["|'] means double quote OR pipe OR single quote; almost everything between square brackets is literal. This is also true in “awk” and C regexps, I think.
                            I don’t know about Python.
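
For what it is worth, Python's `re` module treats the pipe inside square brackets the same way, as the sketch below shows. It only exercises Python's engine, not boost::regex. It also shows that a second `^` inside a negated class is a literal caret.

```python
import re

with_pipe = re.compile(r'["|\']')      # double quote, pipe, or single quote
without_pipe = re.compile(r'["\']')    # double quote or single quote only
negated = re.compile(r'[^"|^\']')      # anything except " | ^ '

for ch in ['"', "'", '|']:
    print(repr(ch), bool(with_pipe.fullmatch(ch)), bool(without_pipe.fullmatch(ch)))

# The second ^ is literal, so a caret is excluded as well.
print(bool(negated.fullmatch('^')), bool(negated.fullmatch('a')))
```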

                            G 1 Reply Last reply Mar 19, 2019, 11:23 PM Reply Quote 1
                            • G
                              Gilles Maisonneuve @Gilles Maisonneuve
                              last edited by Mar 19, 2019, 11:23 PM

                              @Ekopalypse

                              Now, if I say in Python (attempt to transliterate from Perl):

                              (r'(?s)(\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)', [2])
                              

                              does it mean :

                              1. form REGEXP
                              2. do not match NL with DOT
                              3. matches any horizontal blanks (0 or more), don’t make a group
                              4. matches ‘<<’ make it a group
                              5. matches any horizontal blanks (0 or more), don’t make a group
                              6. matches 0 or 1 text quote (either double or single), no group
                              7. matches a group of any chars not " nor ’ one or more time(s) (in perl it would be [^"'])
                              8. matches 0 or 1 text quote (either double or single), no group
                              9. possible blanks until semi-colon, semi-colon, then possible chars until NL

                              BUT THEN, what does ?\3 mean? I’m lost there.

                              G 1 Reply Last reply Mar 19, 2019, 11:27 PM Reply Quote 0
                              • G
                                Gilles Maisonneuve @Gilles Maisonneuve
                                last edited by Mar 19, 2019, 11:27 PM

                                a slash m

                                E 1 Reply Last reply Mar 19, 2019, 11:44 PM Reply Quote 0
                                • E
                                  Ekopalypse
                                  last edited by Ekopalypse Mar 19, 2019, 11:36 PM Mar 19, 2019, 11:36 PM

                                  The r at the beginning just tells Python that this is a raw string, so
                                  every character is taken literally; otherwise backslashes would be treated
                                  as escapes under some circumstances.
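
A quick way to see the difference in plain Python: without the r prefix, `\3` is read as an octal escape and becomes a single control character, so the regex engine would never see a backreference.

```python
raw = r'\3'     # two characters: a backslash and the digit 3
cooked = '\3'   # one character: the octal escape for code point 3

print(len(raw), len(cooked))
print(repr(raw), repr(cooked))
```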

                                  The regex string is only this part

                                  (?s)(\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)
                                  

                                  and I would say (but as I said, I am not a regex expert at all),

                                  (?s) means Dot matches newline characters
                                  the first matching group is

                                  (\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)
                                  

                                  the second

                                  (<<)
                                  

                                  and the third must be

                                  ([^"|^']+?)
                                  

                                  if I’m right.

                                  \3 should be the same as $3 in perl
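
The group numbering can be checked with a short sketch. It runs on Python's `re` module as an assumption for illustration (the script itself uses boost::regex), with `[ \t]` substituted for `\h`, which Python's `re` does not support, and with a made-up sample text. Groups are numbered by the position of their opening parenthesis.

```python
import re

# Same structure as the regex above, with [ \t] standing in for \h.
pattern = re.compile(r'(?s)([ \t]*(<<)[ \t]*["|\']?([^"|^\']+?)["|\']?[ \t]*;.*?\3)')

text = 'print <<"EOT";\nhello\nEOT\n'
m = pattern.search(text)

print(pattern.groups)  # number of capturing groups in the pattern
for n in (1, 2, 3):
    print(n, repr(m.group(n)))
```

Group 1 is the outer pair of parentheses, group 2 is `(<<)`, and group 3 is the terminator word that `\3` looks for again at the end.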

                                  G 1 Reply Last reply Mar 19, 2019, 11:43 PM Reply Quote 1
                                  • G
                                    Gilles Maisonneuve @Ekopalypse
                                    last edited by Gilles Maisonneuve Mar 19, 2019, 11:44 PM Mar 19, 2019, 11:43 PM

                                    @Ekopalypse

                                    Still confused: ([^"|^']+?) why a ‘?’ after the ‘+’? What is this ‘?’ for?

                                    And then \3 would mean the 3rd matching group (third ‘()’), but in Perl it is used only in substitutions. What is its use here? There are only 2 groups in the regex (only two blocks surrounded by parentheses).

                                    E 1 Reply Last reply Mar 19, 2019, 11:47 PM Reply Quote 0
                                    • E
                                      Ekopalypse @Gilles Maisonneuve
                                      last edited by Mar 19, 2019, 11:44 PM

                                      @Gilles-Maisonneuve

                                      maybe this picture makes it a little bit clearer

                                      1 Reply Last reply Reply Quote 2
                                      • E
                                        Ekopalypse @Gilles Maisonneuve
                                        last edited by Ekopalypse Mar 19, 2019, 11:49 PM Mar 19, 2019, 11:47 PM

                                        @Gilles-Maisonneuve

                                        still confused: ([^"|^']+?) why a ‘?’ after the ‘+’ what’s for this ‘?’

                                        It matches as few characters as possible (non-greedy).

                                        and then \3 would mean the 3rd matching group (third ‘()’) but in Perl is used only in substitutions. What is the use here ? There are only 2 groups in the regex (two blocks surrounded by parentheses only.

                                        It is a placeholder for whatever was found in match group 3; it is used to find the EOT at the end.

                                        And there are 3 match groups, or am I missing something?
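
A minimal sketch of the greedy versus non-greedy difference, using Python's `re` module and a made-up text that contains the word EOT twice after the semicolon. The lazy `.*?` stops at the first EOT it can reach, while the greedy `.*` runs on to the last one.

```python
import re

text = '<<"EOT";\nfirst\nEOT\nsecond\nEOT\n'

lazy = re.search(r'(?s);.*?EOT', text).group(0)    # as few characters as possible
greedy = re.search(r'(?s);.*EOT', text).group(0)   # as many characters as possible

print(repr(lazy))
print(repr(greedy))
```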

                                        G 1 Reply Last reply Mar 20, 2019, 12:03 AM Reply Quote 1
                                        • G
                                          Gilles Maisonneuve @Ekopalypse
                                          last edited by Mar 20, 2019, 12:03 AM

                                          @Ekopalypse

                                          2 sets of parentheses only; where is the third set?
                                          So only 2 match groups.

                                          Can you make this work?

                                          There is no syntax error on the Python console but absolutely no result; where is my bug?

                                          regexes[(3, (255,255,255))] = (r'(?s)(\s*(<<)\s*("{0,1}.+"{0,1})\s*;.*?\3)', [1])
                                          
                                          E 1 Reply Last reply Mar 20, 2019, 12:19 AM Reply Quote 0