    Perl language syntax highlighting troubles (bug or limitation ?)

    • Gilles MaisonneuveG
      Gilles Maisonneuve
      last edited by

      OOPS, yours :== self.lexer_name, mine :== self_lexer_name. I am really dumb when dealing with OO programming; I didn’t realize that ‘self’ is the current object and is, of course, separated by a dot.

      The colour has changed for the q* keywords and their text (black on dark blue, which I can’t read, but now I just need to adjust the colors).
      No change for here docs, but I don’t know if I set the colors properly; I have to check.

      I’ll send you a screenshot in a few minutes.

      • Gilles MaisonneuveG
        Gilles Maisonneuve @Ekopalypse
        last edited by Gilles Maisonneuve

        @Ekopalypse

        All right, nearly done: with the following regexp in your python code:

        regexes[(1, (255,0,128))] = (r'\bq[rwqx]{0,1}\b([^\h]).*?\1|(\bq[rwqx]{0,1}\b\h+(\w).*?\3)', [0])
        regexes[(2, (255,0,128))] = (r'\bq[rwqx]{0,1}\b\h*(\(.+?\)|\[.+?\]|\{.+?\})', [0])
        regexes[(3, (0,0,0))] = (r'(?s)((<<)"*(\w+?)"*;.*?\3)', [2])
        regexes[(4, (0,0,0))] = (r'(?s)((<<)\h+"(\w+?)";.*?\3)', [2,3])
        

        I get the following colors:

        q* colors OK, here docs no

        The q* colors are good {well, I might have ugly taste in colors, but at least they match ;-)) }

        Would you have any clue why the here docs are still not handled properly? They should be black, I think.

        • EkopalypseE
          Ekopalypse @Gilles Maisonneuve
          last edited by

          @Gilles-Maisonneuve

          The regexes assume double quotes and a semicolon directly attached to EOT.
          Like

          print << "EOT";
          
          --------------------- separation line ------------------
          
          EOT
          

          Is there a rule how this is specified?

          • Gilles MaisonneuveG
            Gilles Maisonneuve
            last edited by Gilles Maisonneuve

            I think I found why.
            Your regexp says :
            r'(?s)((<<)"*(\w+?)"*;.*?\3)'
            Wouldn’t it be better as:
            r'(?s)(\h*(<<)\h*"*(\w+?)"*\h*;.*?\3)'

            ???

            To answer your question:

            Perl allows

            1. <<TEXT
            2. << TEXT
            3. <<'TEXT' / << 'TEXT'
            4. <<"TEXT" / << "TEXT"

            meanings differ in each case…
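To make the four opener forms listed above concrete, here is a minimal sketch that recognises each of them and reports which quote style was used. It runs under Python's own `re` module rather than the boost::regex engine the script uses, so `[ \t]` stands in for `\h`. The names `HEREDOC_START` and `classify_heredoc` are hypothetical and are not part of the original script.

```python
import re

# Hypothetical helper, not part of the original script.
# [ \t] stands in for \h, which Python's re module does not support.
# (?P=quote) forces the closing quote to equal the opening one (or both empty).
HEREDOC_START = re.compile(r"""<<[ \t]*(?P<quote>["']?)(?P<tag>\w+)(?P=quote)""")


def classify_heredoc(line):
    """Return (quote, tag) for a here-doc opener, or None if there is none."""
    m = HEREDOC_START.search(line)
    if m is None:
        return None
    return m.group("quote"), m.group("tag")


for line in ['print <<TEXT;', 'print << TEXT;', "print <<'TEXT';", 'print << "TEXT";']:
    print(line, "->", classify_heredoc(line))
```

This sketch only looks at the opener; it would also fire on a left-shift expression such as `$x << 3`, so a real lexer would need more context than this.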

            • EkopalypseE
              Ekopalypse
              last edited by

              To be honest - I’m not a regex expert at all :-D
              If you, as a perl developer, say so I would absolutely believe it is :-)

              • Gilles MaisonneuveG
                Gilles Maisonneuve @Ekopalypse
                last edited by

                @Ekopalypse

                In your Python regexp, what’s the meaning of:

                1. “\3”
                2. “, [2]” and “[2,3]” ?

                If I can understand that, I think I could translate a Perl regex into Python (for this case at least).

                • EkopalypseE
                  Ekopalypse
                  last edited by

                  What about using this
                  (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\4)

                  • EkopalypseE
                    Ekopalypse
                    last edited by

                    1. is the boost::regex convention for referring back to match group 3,
                      and
                    2. defines which match groups should actually be painted.

                    Like if you have:

                    r'(word1)(word2)(word3)', [2,3]
                    

                    would mean that only word2 and word3 would be painted
                    whereas if you would specify

                    r'(word1)(word2)(word3)', [0]
                    

                    everything would be colored.

                    Does this make sense to you?
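The group-list idea can be tried outside Notepad++ with a small sketch. It assumes Python's own `re` module as a stand-in for boost::regex, and `spans_to_paint` is a hypothetical helper, not a function of the original script; it only computes the character ranges that would be coloured.

```python
import re


def spans_to_paint(pattern, text, groups):
    """Return (start, end) spans for the requested groups of every match."""
    spans = []
    for m in re.finditer(pattern, text):
        for g in groups:
            if m.group(g) is not None:  # skip groups that did not take part
                spans.append(m.span(g))
    return spans


text = "word1word2word3"
pattern = r"(word1)(word2)(word3)"

print(spans_to_paint(pattern, text, [2, 3]))  # spans of word2 and word3 only
print(spans_to_paint(pattern, text, [0]))     # span of the whole match
```

Group 0 always denotes the whole match, which is why `[0]` colours everything.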

                    • Gilles MaisonneuveG
                      Gilles Maisonneuve @Ekopalypse
                      last edited by

                      @Ekopalypse

                      I don’t understand your regexp syntax. Perhaps too ‘pythonized’ for me.

                      (?s) : what does it mean? Is it ‘s///’? Or really a non-capturing group of ‘s’?
                      \3 \4 : are they $3 and $4? I don’t think so, as I can’t see a 4th accumulator.

                      • EkopalypseE
                        Ekopalypse
                        last edited by Ekopalypse

                        (?s) is a modifier telling the engine that the dot matches line endings
                        and yes, the engine uses \1 and $1

                        Here is the link to the documentation - maybe easier for you.

                        • EkopalypseE
                          Ekopalypse
                          last edited by

                          ooppps

                          (?s)((<<)\h+(["|'])(\w+?)\3\h*;.*?\3)

                          :-D

                            • Gilles MaisonneuveG
                              Gilles Maisonneuve @Ekopalypse
                              last edited by

                              @Ekopalypse

                              Ok
                              another one: in Python you must say ["|'] instead of Perl ["'] (‘either one of the set’) ? Is that what it means ?

                              • EkopalypseE
                                Ekopalypse
                                last edited by

                                No, afaik non-capturing group is (?:pattern)
                                This, (?s), just tells the engine that the dot . is matching
                                EOLs like \r\n - if I’m right.
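The difference between the `(?s)` modifier and a non-capturing group `(?:pattern)` can be checked with a short sketch. It uses Python's `re` module, on the assumption that both constructs behave the same way there as described here for boost::regex; the sample text is made up.

```python
import re

text = "first\nsecond"

# Without (?s) the dot stops at a line ending, so this does not match.
no_flag = re.search(r"first.second", text)

# With (?s) the dot also matches the newline between the two words.
with_flag = re.search(r"(?s)first.second", text)

# (?:...) groups without capturing: only "second" ends up in a numbered group.
non_capturing = re.search(r"(?:first)\n(second)", text)

print(no_flag)                 # None
print(with_flag.group(0))      # the whole two-line text
print(non_capturing.groups())  # one captured group only
```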

                                • EkopalypseE
                                  Ekopalypse
                                  last edited by Ekopalypse

                                  Just for clarification: the Python script does NOT use the Python regex engine; instead,
                                  it uses the one Notepad++ offers, boost::regex.
                                  Yes, you can write the character class without the pipe, but the pipe sign makes it
                                  more readable for me. Or is there a difference between using it with and without the pipe sign?

                                  • EkopalypseE
                                    Ekopalypse
                                    last edited by Ekopalypse

                                    or maybe this one might be even better
                                    (?s)(<<)\h+(["'])(\w+?)\2\h*;.*?\3
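A quick way to see what the two backreferences in this last pattern do is to run an equivalent under Python's `re` module, with `[ \t]` substituted for `\h` (an assumption made only so the sketch runs outside Notepad++). `\2` requires the closing quote to be the same character as the opening one, and `\3` finds the terminating tag. The Perl snippets are made-up samples.

```python
import re

# Same structure as the pattern above, with [ \t] in place of \h.
HEREDOC = re.compile(r"""(?s)(<<)[ \t]+(["'])(\w+?)\2[ \t]*;.*?\3""")

good = 'print << "EOT";\nsome text\nEOT\n'
mismatched = "print << \"EOT';\nsome text\nEOT\n"  # opens with " but closes with '

m = HEREDOC.search(good)
print(m.groups())                  # operator, quote character, tag
print(HEREDOC.search(mismatched))  # None: \2 rejects the mismatched quote
```

Note that this variant requires both whitespace after `<<` and a quoted tag, so an unquoted `<<EOT;` would need a separate rule.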

                                    • Gilles MaisonneuveG
                                      Gilles Maisonneuve
                                      last edited by

                                      Can’t reply what I wanted, a robot says I’m spamming…

                                      • EkopalypseE
                                        Ekopalypse @Gilles Maisonneuve
                                        last edited by

                                        @Gilles-Maisonneuve

                                        Can’t reply what I wanted, a robot says I’m spamming…

                                        I have no idea why this happens sometimes.

                                        By the way, now that you have installed pythonscript plugin would you mind
                                        clicking Plugins->Python Script->Scripts->Samples->RegexTester ?

                                        I know not everyone is recommending it but, personally, I love it.

                                        • Gilles MaisonneuveG
                                          Gilles Maisonneuve @Gilles Maisonneuve
                                          last edited by

                                          AFAIK, at least in Perl, ["|'] means double quote OR pipe OR single quote; everything between square brackets is literal. This is also true in “awk” and C regexps, I think.
                                          I don’t know for Python.
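For Python's own `re` module the answer is the same, as this small check suggests: inside square brackets the pipe is just another literal character, so `["|']` also accepts `|`. The variable names are made up for the illustration, and the check says nothing by itself about boost::regex.

```python
import re

with_pipe = re.compile(r"""["|']""")
without_pipe = re.compile(r"""["']""")

for ch in ['"', "'", "|"]:
    # Show, per character, whether each character class accepts it.
    print(repr(ch), bool(with_pipe.fullmatch(ch)), bool(without_pipe.fullmatch(ch)))
```

So the two classes differ only in whether a literal pipe is accepted.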

                                          • Gilles MaisonneuveG
                                            Gilles Maisonneuve @Gilles Maisonneuve
                                            last edited by

                                            @Ekopalypse

                                            Now, if I write in Python (an attempt to transliterate from Perl):

                                            (r'(?s)(\h*(<<)\h*["|']?([^"|^']+?)["|']?\h*;.*?\3)', [2])
                                            

                                            does it mean :

                                            1. form REGEXP
                                            2. do not match NL with DOT
                                            3. matches any horizontal blanks (0 or more), don’t make a group
                                            4. matches ‘<<’ make it a group
                                            5. matches any horizontal blanks (0 or more), don’t make a group
                                            6. matches 0 or 1 text quote (either double or single), no group
                                            7. matches a group of any chars that are neither " nor ', one or more times (in Perl it would be [^"'])
                                            8. matches 0 or 1 text quote (either double or single), no group
                                            9. possible blanks until semi-colon, semi-colon, then possible chars until NL

                                            BUT THEN, what does ?\3 mean? I’m lost there.
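One way to read the tail of that pattern is to split it as `.*?` followed by `\3`: the `?` belongs to `.*` and makes it lazy (match as little as possible), and `\3` is a backreference to whatever group 3 captured, here the here-doc tag. The sketch below runs the same pattern under Python's `re` module, with `[ \t]` replacing `\h` as an assumption so it can be tried outside Notepad++; the Perl sample is made up.

```python
import re

# The pattern from the post, with [ \t] substituted for \h.
PATTERN = re.compile(
    r"""(?s)([ \t]*(<<)[ \t]*["|']?([^"|^']+?)["|']?[ \t]*;.*?\3)"""
)

perl_source = 'print << "EOT";\nsome text\nEOT\nprint 1;\n'

m = PATTERN.search(perl_source)
print(repr(m.group(2)))  # the << operator (the only part painted with [2])
print(repr(m.group(3)))  # the tag that \3 refers back to
print(repr(m.group(1)))  # everything from the opener to the closing tag
```

Because of `(?s)`, the dot does match newlines here, which is what lets `.*?\3` run across lines until the tag appears again.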

                                            The Community of users of the Notepad++ text editor.