
    Perl language syntax highlighting troubles (bug or limitation ?)

    • Gilles Maisonneuve

      I think I found why.
      Your regexp is:
      r'(?s)((<<)"*(\w+?)"*;.*?\3)'
      Wouldn't it be better as:
      r'(?s)(\h*(<<)\h*"*(\w+?)"*\h*;.*?\3)'

      To answer your question:

      Perl allows

      1. <<TEXT
      2. << TEXT
      3. <<'TEXT' / << 'TEXT'
      4. <<"TEXT" / << "TEXT"

      The meaning differs in each case…

      • Ekopalypse

        To be honest - I'm not a regex expert at all :-D
        If you, as a Perl developer, say so, I absolutely believe you :-)

        • Gilles Maisonneuve @Ekopalypse

          @Ekopalypse

          In your Python regexp, what is the meaning of:

          1. "\3"
          2. ", [2]" and "[2,3]"?

          If I can understand that, I think I could translate a Perl regex into Python (for this case at least).

          • Ekopalypse

            What about using this
            (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\4)

            • Ekopalypse

              1. \3 is the boost::regex convention for referring back to match group 3,
                 and
              2. [2] or [2,3] defines which match groups should actually be painted.

              Like if you have:

              r'(word1)(word2)(word3)', [2,3]
              

              would mean that only word2 and word3 would be painted
              whereas if you would specify

              r'(word1)(word2)(word3)', [0]
              

              everything would be colored.
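
A minimal sketch of how the group list selects what gets painted. It assumes Python's own re module as a stand-in for boost::regex (the script itself uses the Notepad++ engine), and spans_to_paint is a hypothetical helper, not part of PythonScript:

```python
import re

# Assumption: Python's re module stands in for boost::regex here.
pattern = re.compile(r'(word1)(word2)(word3)')
text = 'word1word2word3'


def spans_to_paint(match, groups):
    """Return the (start, end) span of each requested match group."""
    return [match.span(g) for g in groups]


m = pattern.search(text)
painted_23 = spans_to_paint(m, [2, 3])  # only word2 and word3
painted_0 = spans_to_paint(m, [0])      # group 0 is the whole match

print(painted_23)
print(painted_0)
```

Group 0 always denotes the complete match, which is why [0] colors everything.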

              Does this make sense to you?

              • Gilles Maisonneuve @Ekopalypse

                @Ekopalypse

                I don't understand your regexp syntax. Perhaps it is too 'pythonized' for me.

                (?s) : what does it mean? Is it 's///', or really a non-capturing group of 's'?
                \3 \4 : are they $3 and $4? I don't think so, as I can't see a 4th capture group.

                • Ekopalypse

                  (?s) is a modifier telling the engine that the dot also matches line endings,
                  and yes, the engine uses both \1 and $1.

                  Here is the link to the documentation - that may be easier for you.

                  • Ekopalypse

                    ooppps

                    (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\3)

                    :-D

                      • Gilles Maisonneuve @Ekopalypse

                        @Ekopalypse

                        OK, another one: in Python, must you write ["|'] instead of Perl's ["'] ('either one of the set')? Is that what it means?

                        • Ekopalypse

                          No, AFAIK a non-capturing group is (?:pattern).
                          (?s) just tells the engine that the dot . also matches
                          EOLs like \r\n - if I'm right.
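
A small sketch of the effect of (?s), again assuming Python's re module in place of boost::regex (the inline (?s) flag has the same "dot matches newline" meaning there). The sample text is made up for illustration:

```python
import re

# Hypothetical sample: a heredoc whose terminator sits two lines below.
text = "<<EOT;\nline one\nEOT"

# Without (?s) the dot stops at "\n", so the match cannot reach the
# terminator on a later line.
without_s = re.search(r'<<(\w+);.*\1', text)

# With (?s) the dot also matches "\n", so the match spans the lines.
with_s = re.search(r'(?s)<<(\w+);.*\1', text)

print(without_s)
print(with_s.group(0) == text)
```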

                          • Ekopalypse

                            Just for clarification: the Python script does NOT use the Python regex engine; instead
                            it uses the one Notepad++ offers, boost::regex.
                            Yes, you can write the character class without the pipe, but the pipe makes it more readable for me.
                            Or is there a difference between using the pipe sign and not?

                            • Ekopalypse

                              or maybe this one might be even better
                              (?s)(<<)\h+(["'])(\w+?)\2\h*;.*?\3
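
A hedged sketch of this last pattern run against a made-up Perl snippet. Python's re module is assumed as a stand-in for boost::regex; since it has no \h, the class [ \t] replaces it here, which only approximates "horizontal whitespace":

```python
import re

# Assumption: [ \t] substitutes for boost::regex's \h, which Python's re lacks.
HEREDOC = re.compile(r'''(?s)(<<)[ \t]+(["'])(\w+?)\2[ \t]*;.*?\3''')

sample = 'print << "EOT";\nHello\nEOT\nprint "done";\n'
m = HEREDOC.search(sample)

# Group 1 is the operator, group 2 the quote, group 3 the terminator word.
print(m.group(1), m.group(2), m.group(3))
print(repr(m.group(0)))

# The pattern requires blanks and a quote, so a bare <<EOT is not matched.
bare = HEREDOC.search('print <<EOT;\nHi\nEOT\n')
print(bare)
```

Here \2 forces the closing quote to be the same character as the opening one, and \3 ends the match at the first later occurrence of the terminator word.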

                              • Gilles Maisonneuve

                                Can’t reply what I wanted, a robot says I’m spamming…

                                • Ekopalypse @Gilles Maisonneuve

                                  @Gilles-Maisonneuve

                                  Can’t reply what I wanted, a robot says I’m spamming…

                                  I have no idea why this happens sometimes.

                                  By the way, now that you have installed the PythonScript plugin, would you mind
                                  clicking Plugins->Python Script->Scripts->Samples->RegexTester ?

                                  I know not everyone recommends it but, personally, I love it.

                                  • Gilles Maisonneuve

                                    AFAIK, at least in Perl, ["|'] means double quote OR pipe OR single quote; everything between square brackets is literal. This is also true in "awk" and C regexps, I think.
                                    I don't know about Python.
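
A quick check of that point in Python's re module (an assumption: it is used here instead of boost::regex, which treats a pipe inside square brackets the same way for this case). The sample string is invented:

```python
import re

with_pipe = re.compile(r'''["|']''')     # class contains ", | and '
without_pipe = re.compile(r'''["']''')   # class contains " and ' only

sample = 'a|b"c\'d'

found_with = with_pipe.findall(sample)
found_without = without_pipe.findall(sample)

print(found_with)
print(found_without)
```

Inside a character class the pipe is an ordinary character, not an alternation operator, so the two classes differ only in whether a literal | is accepted.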

                                    • Gilles Maisonneuve

                                      @Ekopalypse

                                      Now, if I write in Python (an attempt to transliterate from Perl):

                                      (r'(?s)(\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)', [2])
                                      

                                      does it mean :

                                      1. form REGEXP
                                      2. do not match NL with DOT
                                      3. matches any horizontal blanks (0 or more), don’t make a group
                                      4. matches ‘<<’ make it a group
                                      5. matches any horizontal blanks (0 or more), don’t make a group
                                      6. matches 0 or 1 text quote (either double or single), no group
                                      7. matches a group of any chars other than " or ', one or more times (in Perl it would be [^"'])
                                      8. matches 0 or 1 text quote (either double or single), no group
                                      9. possible blanks until semi-colon, semi-colon, then possible chars until NL

                                      BUT THEN, what does ?\3 mean? I'm lost there.

                                      • Gilles Maisonneuve

                                        a slash m

                                        • Ekopalypse

                                          The r at the beginning just tells Python that this is a raw string, in which
                                          every character is taken literally; otherwise backslashes would be treated
                                          as escapes under some circumstances.
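
A short illustration of why the r prefix matters for a pattern containing \3. This is plain Python string behaviour; the regex part assumes Python's re module rather than boost::regex:

```python
import re

cooked = '\3'   # octal escape: one control character, chr(3)
raw = r'\3'     # two characters: a backslash followed by '3'

print(len(cooked), len(raw))

# With the raw string, the regex engine sees \3 as a backreference.
raw_match = re.search(r'(a)(b)(c)\3', 'abcc')

# Without it, the engine is asked to find a literal chr(3) instead.
cooked_match = re.search('(a)(b)(c)\3', 'abcc')

print(raw_match is not None, cooked_match is None)
```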

                                          The regex string is only this part

                                          (?s)(\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)
                                          

                                          and I would say (but as I said, I'm not a regex expert at all):

                                          (?s) means Dot matches newline characters
                                          the first matching group is

                                          (\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)
                                          

                                          the second

                                          (<<)
                                          

                                          and the third must be

                                          ([^"|^']+?)
                                          

                                          if I’m right.

                                          \3 should be the same as $3 in perl
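
The group numbering above can be checked with a small sketch. It assumes Python's re module as a stand-in for boost::regex, with [ \t] in place of \h (which Python's re does not support), and a made-up sample line:

```python
import re

# Assumption: [ \t] substitutes for \h; otherwise the pattern is unchanged.
PATTERN = re.compile(
    r'''(?s)([ \t]*(<<)[ \t]*["|']?([^"|^']+?)["|']?[ \t]*;.*?\3)'''
)

# Groups are numbered by the position of their opening parenthesis.
print(PATTERN.groups)

sample = 'print <<"EOT";\nHello\nEOT\n'
m = PATTERN.search(sample)

print(repr(m.group(1)))  # outermost parentheses: the whole match
print(repr(m.group(2)))  # the << operator
print(repr(m.group(3)))  # the terminator word that \3 refers back to
```

Inside the pattern, \3 means "the same text that group 3 captured", so the match ends where the terminator word appears again.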

                                          • Gilles Maisonneuve @Ekopalypse

                                            @Ekopalypse

                                            Still confused: ([^"|^']+?) - why is there a '?' after the '+'? What is this '?' for?

                                            And then \3 would mean the 3rd matching group (the third '()'), but in Perl that is used only in substitutions. What is its use here? There are only 2 groups in the regex (only two blocks surrounded by parentheses).
