Pythonscript search different than N++ search when using \< and leading uppercase

25 Posts 5 Posters 13.3k Views
  • M
    MAPJe71
    last edited by Feb 9, 2017, 10:39 PM

    Wondering if it’s only an issue with editor.research( ... ), as I have been using modifiers with editor.rereplace( ... ) successfully.

    • C
      Claudia Frank @MAPJe71
      last edited by Claudia Frank Feb 9, 2017, 10:49 PM Feb 9, 2017, 10:48 PM

      @MAPJe71

      Yep - same issue.

      New doc with the word Print only

      editor.rereplace('(?i)\<print\>', 'PRINT')
      

      fails where

      editor.rereplace('\<print\>', 'PRINT', 2)
      

      works

      UUhhh - I should avoid magic numbers - 2=re.I
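One way to avoid the magic number is to pass the flag by name. The following is only a minimal sketch: `replace_whole_word` is a hypothetical helper, `editor` is assumed to be the PythonScript editor object handed in by the caller, and the flag is assumed to go in third position, as in the working call above.

```python
import re

# In the standard re module, IGNORECASE (alias re.I) has the integer value 2,
# which is the "magic number" used in the call above.
IGNORE_CASE = re.IGNORECASE


def replace_whole_word(editor, word, replacement):
    """Case-insensitively replace whole-word occurrences of `word`.

    `editor` is assumed to be PythonScript's editor object. The case
    behaviour is set through the flag argument, not through an in-line
    (?i) modifier. `word` is assumed to contain no regex metacharacters.
    """
    pattern = r'\<' + word + r'\>'
    editor.rereplace(pattern, replacement, IGNORE_CASE)
```

Called as `replace_whole_word(editor, 'print', 'PRINT')`, this issues the same call as the working variant above, with the flag spelled out.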

      Cheers
      Claudia

      • M
        MAPJe71
        last edited by Feb 10, 2017, 12:41 AM

        UUhhh - I should avoid magic numbers - 2=re.I

        Damn right!!! LOL

        • G
          guy038
          last edited by guy038 Feb 11, 2017, 7:27 PM Feb 11, 2017, 7:09 PM

          Hi, All,

          I ran some tests, using the classic Find dialog, on the text below:

          xyz
          xYz
          xyZ
          xYZ
          Xyz
          XYz
          XyZ
          XYZ
          -----
          1xyz
          1xYz
          1xyZ
          1xYZ
          1Xyz
          1XYz
          1XyZ
          1XYZ
          -----
          xyz9
          xYz9
          xyZ9
          xYZ9
          Xyz9
          XYz9
          XyZ9
          XYZ9
          -----
          1xyz9
          1xYz9
          1xyZ9
          1xYZ9
          1Xyz9
          1XYz9
          1XyZ9
          1XYZ9
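The 32 test lines above follow a regular pattern, so they can be regenerated instead of typed by hand. A small sketch in plain Python; `build_test_text` is a hypothetical name, not something used in the thread.

```python
def build_test_text():
    """Rebuild the test text: the 8 case variants of "xyz", first bare,
    then with a leading "1", a trailing "9", and both, with a "-----"
    line between the four groups."""
    # Loop order chosen to reproduce the listing: xyz, xYz, xyZ, xYZ, Xyz, ...
    variants = [first + middle + last
                for first in 'xX'
                for last in 'zZ'
                for middle in 'yY']
    groups = []
    for prefix, suffix in [('', ''), ('1', ''), ('', '9'), ('1', '9')]:
        groups.append('\n'.join(prefix + v + suffix for v in variants))
    return '\n-----\n'.join(groups)
```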
          

          Then I tested the different regexes

          • xYZ

          • xYZ\>

          • xYZ\b

          • \<xYZ

          • \bxYZ

          • \<xYZ\>

          • \bxYZ\b

          • (^|(?<=\W))xYZ((?=\W)|$)

          in each of the following six configurations:

          • With the Match case option ON

          • With the Match case option OFF

          • Preceded by (?i) and with the Match case option ON

          • Preceded by (?i) and with the Match case option OFF

          • Preceded by (?-i) and with the Match case option ON

          • Preceded by (?-i) and with the Match case option OFF

          I obtained the following six tables, where:

          • A correct match is indicated by a * character

          • A missing match, where one was expected, is indicated by the letter E ( Error )


          +-------+------------------------------------------------------------------------------------+
          |       |            Option "Match case"  ON           and           Regex, below            |
          | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
          |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz   |     |       |       |       |       |         |         |                          |
          | xYz   |     |       |       |       |       |         |         |                          |
          | xyZ   |     |       |       |       |       |         |         |                          |
          | xYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | Xyz   |     |       |       |       |       |         |         |                          |
          | XYz   |     |       |       |       |       |         |         |                          |
          | XyZ   |     |       |       |       |       |         |         |                          |
          | XYZ   |     |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz  |     |       |       |       |       |         |         |                          |
          | 1xYz  |     |       |       |       |       |         |         |                          |
          | 1xyZ  |     |       |       |       |       |         |         |                          |
          | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1Xyz  |     |       |       |       |       |         |         |                          |
          | 1XYz  |     |       |       |       |       |         |         |                          |
          | 1XyZ  |     |       |       |       |       |         |         |                          |
          | 1XYZ  |     |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz9  |     |       |       |       |       |         |         |                          |
          | xYz9  |     |       |       |       |       |         |         |                          |
          | xyZ9  |     |       |       |       |       |         |         |                          |
          | xYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | Xyz9  |     |       |       |       |       |         |         |                          |
          | XYz9  |     |       |       |       |       |         |         |                          |
          | XyZ9  |     |       |       |       |       |         |         |                          |
          | XYZ9  |     |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz9 |     |       |       |       |       |         |         |                          |
          | 1xYz9 |     |       |       |       |       |         |         |                          |
          | 1xyZ9 |     |       |       |       |       |         |         |                          |
          | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
          | 1Xyz9 |     |       |       |       |       |         |         |                          |
          | 1XYz9 |     |       |       |       |       |         |         |                          |
          | 1XyZ9 |     |       |       |       |       |         |         |                          |
          | 1XYZ9 |     |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          
          
          +-------+------------------------------------------------------------------------------------+
          |       |            Option "Match case"  OFF          and           Regex, below            |
          | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
          |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xYz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xyZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | Xyz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | XYz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | XyZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | XYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xYz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xyZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1Xyz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XYz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XyZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XYZ  |  *  |   *   |   *   |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xYz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xyZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | Xyz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | XYz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | XyZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | XYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz9 |  *  |       |       |       |       |         |         |                          |
          | 1xYz9 |  *  |       |       |       |       |         |         |                          |
          | 1xyZ9 |  *  |       |       |       |       |         |         |                          |
          | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
          | 1Xyz9 |  *  |       |       |       |       |         |         |                          |
          | 1XYz9 |  *  |       |       |       |       |         |         |                          |
          | 1XyZ9 |  *  |       |       |       |       |         |         |                          |
          | 1XYZ9 |  *  |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          
          
          +-------+------------------------------------------------------------------------------------+
          |       |       Option "Match case"  ON       and       Regex, below, PRECEDED by (?i)       |
          | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
          |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xYz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xyZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | xYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
          | Xyz   |  *  |   *   |   *   |   E   |   *   |    E    |    *    |            *             |
          | XYz   |  *  |   *   |   *   |   E   |   *   |    E    |    *    |            *             |
          | XyZ   |  *  |   *   |   *   |   E   |   *   |    E    |    *    |            *             |
          | XYZ   |  *  |   *   |   *   |   E   |   *   |    E    |    *    |            *             |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xYz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xyZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1Xyz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XYz  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XyZ  |  *  |   *   |   *   |       |       |         |         |                          |
          | 1XYZ  |  *  |   *   |   *   |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | xyz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xYz9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xyZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | xYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
          | Xyz9  |  *  |       |       |   E   |   *   |         |         |                          |
          | XYz9  |  *  |       |       |   E   |   *   |         |         |                          |
          | XyZ9  |  *  |       |       |   E   |   *   |         |         |                          |
          | XYZ9  |  *  |       |       |   E   |   *   |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          | 1xyz9 |  *  |       |       |       |       |         |         |                          |
          | 1xYz9 |  *  |       |       |       |       |         |         |                          |
          | 1xyZ9 |  *  |       |       |       |       |         |         |                          |
          | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
          | 1Xyz9 |  *  |       |       |       |       |         |         |                          |
          | 1XYz9 |  *  |       |       |       |       |         |         |                          |
          | 1XyZ9 |  *  |       |       |       |       |         |         |                          |
          | 1XYZ9 |  *  |       |       |       |       |         |         |                          |
          +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
          
          
          
          • G
            guy038
            last edited by guy038 Feb 11, 2017, 7:30 PM Feb 11, 2017, 7:21 PM

            I need to split this post in two parts because it exceeds 16384 characters !

            +-------+------------------------------------------------------------------------------------+
            |       |       Option "Match case"  OFF       and       Regex, below, PRECEDED by (?i)      |
            | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
            |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | xYz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | xyZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | xYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | Xyz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | XYz   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | XyZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | XYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1xYz  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1xyZ  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1Xyz  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1XYz  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1XyZ  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1XYZ  |  *  |   *   |   *   |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz9  |  *  |       |       |   *   |   *   |         |         |                          |
            | xYz9  |  *  |       |       |   *   |   *   |         |         |                          |
            | xyZ9  |  *  |       |       |   *   |   *   |         |         |                          |
            | xYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
            | Xyz9  |  *  |       |       |   *   |   *   |         |         |                          |
            | XYz9  |  *  |       |       |   *   |   *   |         |         |                          |
            | XyZ9  |  *  |       |       |   *   |   *   |         |         |                          |
            | XYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz9 |  *  |       |       |       |       |         |         |                          |
            | 1xYz9 |  *  |       |       |       |       |         |         |                          |
            | 1xyZ9 |  *  |       |       |       |       |         |         |                          |
            | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
            | 1Xyz9 |  *  |       |       |       |       |         |         |                          |
            | 1XYz9 |  *  |       |       |       |       |         |         |                          |
            | 1XyZ9 |  *  |       |       |       |       |         |         |                          |
            | 1XYZ9 |  *  |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            
            
            +-------+------------------------------------------------------------------------------------+
            |       |       Option "Match case"  ON       and       Regex, below, PRECEDED by (?-i)      |
            | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
            |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz   |     |       |       |       |       |         |         |                          |
            | xYz   |     |       |       |       |       |         |         |                          |
            | xyZ   |     |       |       |       |       |         |         |                          |
            | xYZ   |  *  |   *   |   *   |   *   |   *   |    *    |    *    |            *             |
            | Xyz   |     |       |       |       |       |         |         |                          |
            | XYz   |     |       |       |       |       |         |         |                          |
            | XyZ   |     |       |       |       |       |         |         |                          |
            | XYZ   |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz  |     |       |       |       |       |         |         |                          |
            | 1xYz  |     |       |       |       |       |         |         |                          |
            | 1xyZ  |     |       |       |       |       |         |         |                          |
            | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1Xyz  |     |       |       |       |       |         |         |                          |
            | 1XYz  |     |       |       |       |       |         |         |                          |
            | 1XyZ  |     |       |       |       |       |         |         |                          |
            | 1XYZ  |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz9  |     |       |       |       |       |         |         |                          |
            | xYz9  |     |       |       |       |       |         |         |                          |
            | xyZ9  |     |       |       |       |       |         |         |                          |
            | xYZ9  |  *  |       |       |   *   |   *   |         |         |                          |
            | Xyz9  |     |       |       |       |       |         |         |                          |
            | XYz9  |     |       |       |       |       |         |         |                          |
            | XyZ9  |     |       |       |       |       |         |         |                          |
            | XYZ9  |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz9 |     |       |       |       |       |         |         |                          |
            | 1xYz9 |     |       |       |       |       |         |         |                          |
            | 1xyZ9 |     |       |       |       |       |         |         |                          |
            | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
            | 1Xyz9 |     |       |       |       |       |         |         |                          |
            | 1XYz9 |     |       |       |       |       |         |         |                          |
            | 1XyZ9 |     |       |       |       |       |         |         |                          |
            | 1XYZ9 |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            
            
            +-------+------------------------------------------------------------------------------------+
            |       |       Option "Match case"  OFF       and       Regex, below, PRECEDED by (?-i)     |
            | Text  |-----+-------+-------+-------+-------+---------+---------+--------------------------|
            |       | xYZ | xYZ\> | xYZ\b | \<xYZ | \bxYZ | \<xYZ\> | \bxYZ\b | (^|(?<=\W))xYZ((?=\W)|$) |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz   |     |       |       |       |       |         |         |                          |
            | xYz   |     |       |       |       |       |         |         |                          |
            | xyZ   |     |       |       |       |       |         |         |                          |
            | xYZ   |  *  |   *   |   *   |   *   |       |    *    |    *    |            *             |
            | Xyz   |     |       |       |       |       |         |         |                          |
            | XYz   |     |       |       |       |       |         |         |                          |
            | XyZ   |     |       |       |       |       |         |         |                          |
            | XYZ   |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz  |     |       |       |       |       |         |         |                          |
            | 1xYz  |     |       |       |       |       |         |         |                          |
            | 1xyZ  |     |       |       |       |       |         |         |                          |
            | 1xYZ  |  *  |   *   |   *   |       |       |         |         |                          |
            | 1Xyz  |     |       |       |       |       |         |         |                          |
            | 1XYz  |     |       |       |       |       |         |         |                          |
            | 1XyZ  |     |       |       |       |       |         |         |                          |
            | 1XYZ  |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | xyz9  |     |       |       |       |       |         |         |                          |
            | xYz9  |     |       |       |       |       |         |         |                          |
            | xyZ9  |     |       |       |       |       |         |         |                          |
            | xYZ9  |  *  |       |       |   *   |       |         |         |                          |
            | Xyz9  |     |       |       |       |       |         |         |                          |
            | XYz9  |     |       |       |       |       |         |         |                          |
            | XyZ9  |     |       |       |       |       |         |         |                          |
            | XYZ9  |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            | 1xyz9 |     |       |       |       |       |         |         |                          |
            | 1xYz9 |     |       |       |       |       |         |         |                          |
            | 1xyZ9 |     |       |       |       |       |         |         |                          |
            | 1xYZ9 |  *  |       |       |       |       |         |         |                          |
            | 1Xyz9 |     |       |       |       |       |         |         |                          |
            | 1XYz9 |     |       |       |       |       |         |         |                          |
            | 1XyZ9 |     |       |       |       |       |         |         |                          |
            | 1XYZ9 |     |       |       |       |       |         |         |                          |
            +-------+-----+-------+-------+-------+-------+---------+---------+--------------------------+
            

            From these results, we can deduce that the N++ Boost regex engine fails to match 4 ( or 8 ) cases, ONLY IF the four conditions below occur simultaneously:

            • The Match case option, of the Find dialog, is ON

            • A (?i) modifier starts the regex

            • The regex begins with the \< assertion

            • The text to match begins with an UPPERCASE letter


            Luckily :

            • The regex \<xYZ can be replaced by the regex \bxYZ

            • The regex \<xYZ\> can be replaced by the regex \bxYZ\b
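In a script, this substitution could be automated. The helper below is only a sketch under stated assumptions: `use_b_assertions` is a hypothetical name, it only looks at a `\<` at the very start of the pattern (optionally preceded by a `(?i)` or `(?-i)` modifier) and a `\>` at the very end, and it is only safe when the character next to each assertion is a word character, as in `\<xYZ\>`. In general `\b` matches at either end of a word, so it is not a drop-in replacement for `\<` or `\>`.

```python
import re

# An optional leading (?i) or (?-i) modifier, followed by the \< assertion.
_LEADING = re.compile(r'^(\(\?-?i\))?\\<')


def use_b_assertions(pattern):
    r"""Replace a leading \< and a trailing \> with \b.

    Assumption: the pattern text next to each assertion is a word
    character; the caller has to check this.
    """
    # Keep the modifier (if any) and swap \< for \b.
    pattern = _LEADING.sub(lambda m: (m.group(1) or '') + r'\b', pattern)
    # Swap a trailing \> for \b.
    if pattern.endswith(r'\>'):
        pattern = pattern[:-2] + r'\b'
    return pattern
```

For example, `use_b_assertions(r'(?i)\<print\>')` gives `(?i)\bprint\b`, which could then be passed to `editor.research( ... )` or `editor.rereplace( ... )`.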

            And note that :

            • The assertion \< may be replaced by the assertion (^|(?<=\W))

            • The assertion \> may be replaced by the assertion ((?=\W)|$)
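As a cross-check of the lookaround form, here is a sketch using Python's own `re` module. This is a different engine from the Boost one inside N++ and it does not support `\<` or `\>`, so only the `\b` form and the lookaround form are compared. The word list and the name `matched_words` are assumptions made for the illustration.

```python
import re

# Each word is searched on its own, so ^ and $ mean start / end of the word.
WORDS = ['xYZ', '1xYZ', 'xYZ9', '1xYZ9', '(xYZ)']

WITH_B = re.compile(r'\bxYZ\b')
WITH_LOOKAROUND = re.compile(r'(^|(?<=\W))xYZ((?=\W)|$)')


def matched_words(regex, words=WORDS):
    """Return the words in which `regex` finds a match (case-sensitive)."""
    return [w for w in words if regex.search(w)]
```

Both `matched_words(WITH_B)` and `matched_words(WITH_LOOKAROUND)` return `['xYZ', '(xYZ)']`: a neighbouring digit blocks the match and a neighbouring non-word character does not, which agrees with the last two columns of the "Match case" ON table.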

            Cheers

            guy038

            BTW, Claudia, when you said :

            I agree, I assumed too, that using the modifiers overrules the flags but with the regex
            you used we see it isn’t.

            I disagree ! Indeed :

            Considering the text below :

            xyz
            xYz
            xyZ
            xYZ
            Xyz
            XYz
            XyZ
            XYZ
            

            Of course, the two regexes (?i)\<xYZ and (?i)\<xYZ\>, with the Match case option ON, match the first four cases, only

            But, the two regexes (?i)\<XYZ and (?i)\<XYZ\>, with the Match case option ON, ALSO match the first four cases, only !

            So, I do think that the in-line modifiers (?i) and (?-i) ALWAYS have priority over the Match case option

            And, like you, I rather think that it’s just a bug [ in the implementation ] of the Boost regex engine !

            Besides, the two regexes (?i)\bxYZ and (?i)\bxYZ\b, with the Match case option ON, correctly match the eight cases, above, of the string “xyz” :-)) => The (?i) modifier forces the insensitive search !
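For comparison, the behaviour expected here can be sketched with Python's `re` module (again, not the Boost engine, so this shows the intended semantics only, not what N++ does): with the `\b` form, an in-line `(?i)` and the IGNORECASE flag both match all eight variants. `count_matches` is a hypothetical helper name.

```python
import re

VARIANTS = ['xyz', 'xYz', 'xyZ', 'xYZ', 'Xyz', 'XYz', 'XyZ', 'XYZ']

INLINE = re.compile(r'(?i)\bxYZ\b')               # in-line modifier only
FLAGGED = re.compile(r'\bxYZ\b', re.IGNORECASE)   # flag only
SENSITIVE = re.compile(r'\bxYZ\b')                # neither: case-sensitive


def count_matches(regex, words=VARIANTS):
    """Count how many of `words` contain a match for `regex`."""
    return sum(1 for w in words if regex.search(w))
```

Here `count_matches(INLINE)` and `count_matches(FLAGGED)` both give 8, while `count_matches(SENSITIVE)` gives 1.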

            • C
              Claudia Frank
              last edited by Feb 11, 2017, 10:21 PM

              Hi Guy,

              thx for your effort on this but I have to disagree with your disagree ;-D

              From regex execution point of view there are two ways to change
              the case behavior. Either by providing a flag or using the in-line modifiers.
              When providing the flag everything is ok (at least for the moment) - so
              I have to assume that the regex engine works correctly in this case.
              When providing both the in-line modifier and the flag, it isn’t always ok.
              This does mean there is a bug and it must be related to how in-line modifiers
              are handled against flags. And that makes me think that the bug occurs when doing
              this override - which means we can’t rely on it. Maybe other in-line modifiers
              together with some special regex constructs behave wrongly as well.

              Cheers
              Claudia

              • G
                guy038
                last edited by guy038 Feb 12, 2017, 9:09 AM Feb 12, 2017, 9:07 AM

                Hello, Claudia and All,

                Hmm… in the end, Claudia, I think that you’re right :-) Indeed, if we build the general table below, which recaps the main cases, it’s obvious that :

                • Results are OK when ONLY the “Match case” flag is used, WITHOUT any in-line modifier ( Rows 1 and 4 )

                • Results seem OK ( UP TO NOW ) with a starting (?-i) in-line modifier, whether the “Match case” flag is OFF or ON ( Rows 3 and 6 )

                • Results seem OK ( UP TO NOW ) when the “Match case” flag is OFF, with a starting (?i) in-line modifier ( Row 2 )

                • Results are NOT OK when the “Match case” flag is ON, with a starting (?i) in-line modifier and a regex which begins with the \< assertion ( Row 5 )

                Luckily, this LAST case ( Row 5 ) is rather rare and does not occur if we use the \b syntax instead of \< :-))

                +=======+=======================+====================+===========+==================+
                |  Row  |   "Match case" flag   |  In-line modifier  |  Results  |     Remarks      |
                +=======+=======================+====================+===========+==================+
                |   1   |          OFF          |         NO         |  Correct  |  Implicit (?i)   |
                +-------+-----------------------+--------------------+-----------+------------------+
                |   2   |          OFF          |        (?i)        |  Correct  |                  |
                +-------+-----------------------+--------------------+-----------+------------------+
                |   3   |          OFF          |        (?-i)       |  Correct  |                  |
                +=======+=======================+====================+===========+==================+
                |   4   |          ON           |         NO         |  Correct  |  Implicit (?-i)  |
                +-------+-----------------------+--------------------+-----------+------------------+
                |   5   |          ON           |        (?i)        |  PROBLEM  |  IF use of \<    |
                +-------+-----------------------+--------------------+-----------+------------------+
                |   6   |          ON           |        (?-i)       |  Correct  |                  |
                +=======+=======================+====================+===========+==================+
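
As a reference point for what “Correct” means in the six rows above, here is a minimal sketch using Python’s own `re` module, not the N++ Boost engine. Several things here are assumptions made only for illustration: Python has no `\<`, so it is emulated with `\b(?=\w)`; Python has no global `(?-i)`, so the scoped form `(?-i:...)` stands in for it; and “Match case OFF” is modelled as passing `re.IGNORECASE`. The helper name `matched_variants` is hypothetical.

```python
import re

# The eight case variants used in the tests earlier in this thread.
VARIANTS = ["xyz", "xYz", "xyZ", "xYZ", "Xyz", "XYz", "XyZ", "XYZ"]

# Stand-in for \< (start of word): a boundary followed by a word character.
WORD_START = r"\b(?=\w)"


def matched_variants(match_case, inline):
    """Return the variants matched by <inline>\\<xYZ for a given flag state.

    match_case: True models "Match case" ON, False models it OFF.
    inline: None, "(?i)" or "(?-i)".
    """
    body = WORD_START + "xYZ"
    if inline == "(?i)":
        pattern = "(?i)" + body
    elif inline == "(?-i)":
        # Python only accepts the scoped form for switching a flag off.
        pattern = "(?-i:" + body + ")"
    else:
        pattern = body
    flags = 0 if match_case else re.IGNORECASE
    rx = re.compile(pattern, flags)
    return [v for v in VARIANTS if rx.fullmatch(v)]


# Walk the six rows of the table: (row, match_case, inline modifier).
ROWS = [
    (1, False, None), (2, False, "(?i)"), (3, False, "(?-i)"),
    (4, True, None), (5, True, "(?i)"), (6, True, "(?-i)"),
]
for row, match_case, inline in ROWS:
    print(row, matched_variants(match_case, inline))
```

In this sketch the in-line modifier wins over the flag in every row, including row 5, where all eight variants match. That is the behaviour one would expect from N++ too, and it is only a comparison with a different engine, not a statement about how Boost is implemented.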
                

                Cheers,

                guy038

                • A
                  Alan Kilborn
                  last edited by Feb 12, 2017, 1:20 PM

                  First of all, it is great to see such rousing discussion about the issue I discovered! :-) Thanks to all for that.

                  There are lots of things to think about coming out of this discussion, but the most obvious and immediate one is a question for Mr Guy: You keep suggesting to use \b instead of \< , but they are not always equivalent, correct? They may be equivalent for certain examples, but in the most general case I believe they are different. If they weren’t different, there would be no reason for both to exist in the N++/Boost engine…

                  I mean, even I discussed using \b instead in my very first posting in this thread, but that was just as a test, not necessarily a blanket substitution. I guess I don’t want others reading this thread to take away that \b and \< are the exact same thing.

                  Comments? Thoughts?

                  • M
                    MAPJe71
                    last edited by Feb 12, 2017, 1:36 PM

                    See reference on Word Boundaries for

                    • description on differences between \b, \< and \>;
                    • which “engine” supports what.
                    • G
                      guy038
                      last edited by guy038 Feb 12, 2017, 10:47 PM Feb 12, 2017, 10:38 PM

                      Hi Alan and MAPJe71,

                      Thanks, MAPJe71, for the link about Word Boundaries, from the definitive site about regular expressions ! Of course, Alan, I know the differences between the three assertions : \b , \< and \>. I just preferred not to bring them up at first, in order to stay focused on your problem !

                      To be short, the \b assertion acts, either, as a \< assertion OR as a \> assertion. This explains that the regex \<WORD\> can be simply replaced by the regex \bWORD\b.
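
To make the relationship concrete, here is a small sketch in Python’s `re` module (not the Boost engine). Python has no `\<` or `\>`, so they are emulated here, as an assumption for illustration, with `\b(?=\w)` and `\b(?<=\w)`. The helper name `positions` is hypothetical.

```python
import re

WORD_START = r"\b(?=\w)"   # stand-in for \<  : boundary with a word char after it
WORD_END = r"\b(?<=\w)"    # stand-in for \>  : boundary with a word char before it


def positions(pattern, text):
    """Return the offsets at which a zero-width pattern matches."""
    return [m.start() for m in re.finditer(pattern, text)]


text = "foo bar"
any_boundary = positions(r"\b", text)     # every word boundary
starts = positions(WORD_START, text)      # only starts of words
ends = positions(WORD_END, text)          # only ends of words
print(any_boundary, starts, ends)

# Where the substitution is NOT safe: \b can sit in front of a non-word
# character, a start-of-word assertion cannot.
b_before_bang = re.search(r"\b!", "foo!")
start_before_bang = re.search(WORD_START + "!", "foo!")
print(b_before_bang is not None, start_before_bang is not None)
```

So \bWORD\b and \<WORD\> agree as long as WORD itself begins and ends with a word character, which is the case discussed in this thread. As Alan suspected, they are not interchangeable in general.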

                      BTW, in the Word Boundaries table, I noticed the POSIX word boundaries ( [[:<:]] and [[:>:]] ), which have exactly the same meaning as the GNU word boundaries ( \< and \> ). These syntaxes work with the N++ Boost regex engine ! Unfortunately, Alan, the problem that you noticed occurs with the POSIX word boundaries, too :-((.


                      On top of that, the LAST row of the “Word Boundaries” table, named Word Boundaries behaviour, states that word boundaries are not correctly handled in most regex engines :

                      Word boundaries always match at the start of the match attempt if that position is followed by a word character, regardless of the character that precedes the start of the match attempt. (Thus, word boundaries are not handled correctly for the second and following match attempts in the same string.)

                      And it shows an example :

                      \b. matches all of the letters but not the space when iterating over all matches, in the string “abc def”
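
The quoted behaviour can be reproduced outside N++ with a short sketch in Python’s `re` module. The assumption here, for illustration only, is that the flaw amounts to each new match attempt not seeing the character before its starting point. That is modelled by searching a slice of the string. It is not a claim about how Boost or N++ actually iterates, and both helper names are hypothetical.

```python
import re


def iterate_with_context(pattern, text):
    """Normal iteration: the engine can look at what precedes each attempt."""
    return [m.group(0) for m in re.finditer(pattern, text)]


def iterate_without_context(pattern, text):
    """Restart every attempt on a slice, hiding the preceding characters."""
    rx = re.compile(pattern)
    found, pos = [], 0
    while pos <= len(text):
        m = rx.search(text[pos:])
        if m is None:
            break
        found.append(m.group(0))
        # Move past the match; step one extra char after an empty match.
        pos += m.end() if m.end() > m.start() else m.end() + 1
    return found


sample = "abc def"
print(iterate_with_context(r"\b.", sample))     # ['a', ' ', 'd']
print(iterate_without_context(r"\b.", sample))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

With full context, `\b.` finds the first letter of each word plus the space that follows the boundary after “abc”. Without context, every letter looks like the start of a word and the space is skipped, which is exactly the “all of the letters but not the space” result described in the quote.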


                      So, I did some tests ( again !! )

                      • I copied this single sentence, below, part of the license.txt file, in a new tab
                      By contrast, the GNU General Public License is intended to guarantee your freedom...
                      
                      • In the Find dialog, I left the Match case and the . matches newline options UNCHECKED

                      • I selected, of course, the Regular expression search mode

                      • I tested the different regexes, below, against the example text

                      REMARK : In the table, below, each dash character, under the sentence, indicates a match of the corresponding regex(es) !

                      ========================================================================================================================
                      |     REGEXES     |                EXAMPLE text    -     MATCHES noted by a DASH character               |   RESULTS   |
                      ========================================================================================================================
                      |                 |                                                                                      |             |
                      |                 | By contrast, the GNU General Public License is intended to guarantee your freedom... | INCORRECT ! |
                      |  (^|(?<!\w)).   | ------------------------------------------------------------------------------------ |             |
                      |                 |                                                                                      |             |
                      +-----------------+--------------------------------------------------------------------------------------+-------------+
                      |                 |                                                                                      |             |
                      |  \b.            |                                                                                      |             |
                      |  \<.            |                                                                                      |             |
                      |  [[:<:]].       |                                                                                      |             |
                      |                 |                                                                                      |             |
                      |                 | By contrast, the GNU General Public License is intended to guarantee your freedom... | INCORRECT ! |
                      |  \b\w           | -- --------  --- --- ------- ------ ------- -- -------- -- --------- ---- -------    |             |
                      |  \<\w           |                                                                                      |             |
                      |  [[:<:]]\w      |                                                                                      |             |
                      |  (^|(?<!\w))\w  |                                                                                      |             |
                      |                 |                                                                                      |             |
                      +-----------------+--------------------------------------------------------------------------------------+-------------+
                      |                 |                                                                                      |             |
                      |                 | By contrast, the GNU General Public License is intended to guarantee your freedom... |  INCORRECT  |
                      |  (^|(?<=\W)).   | -  -        -    -   -       -      -       -  -        -  -         -    -       -  |             |
                      |                 |                                                                                      |             |
                      +-----------------+--------------------------------------------------------------------------------------+-------------+
                      |                 |                                                                                      | (At last !) |
                      |                 | By contrast, the GNU General Public License is intended to guarantee your freedom... |             |
                      |  (^|(?<=\W))\w  | -  -         -   -   -       -      -       -  -        -  -         -    -          |   CORRECT   |
                      |                 |                                                                                      |             |
                      ==================+======================================================================================+==============
                      |                 |                                                                                      |             |
                      |                 | By contrast, the GNU General Public License is intended to guarantee your freedom... | INCORRECT ! |
                      |  .\b            |  --       - -  --  --      --     --      -- --       -- --        --   --      -    |             |
                      |                 |                                                                                      |             |
                      +-----------------+--------------------------------------------------------------------------------------+-------------+
                      |                 |                                                                                      |             |
                      |                 |                                                                                      |             |
                      |  .((?=\W)|$)    | By contrast, the GNU General Public License is intended to guarantee your freedom... | INCORRECT ! |
                      |  .((?!\w)|$)    |  -        --   -   -       -      -       -  -        -  -         -    -       ---- |             |
                      |                 |                                                                                      |             |
                      |                 |                                                                                      |             |
                      +-----------------+--------------------------------------------------------------------------------------+-------------+
                      |                 |                                                                                      |             |
                      |  .\>            |                                                                                      |             |
                      |  .[[:>:]]       |                                                                                      |             |
                      |                 |                                                                                      |             |
                      |  \w\b           | By contrast, the GNU General Public License is intended to guarantee your freedom... |   CORRECT   |
                      |  \w\>           |  -        -    -   -       -      -       -  -        -  -         -    -       -    |             |
                      |  \w[[:>:]]      |                                                                                      |             |
                      |  \w((?=\W)|$)   |                                                                                      |             |
                      |  \w((?!\w)|$)   |                                                                                      |             |
                      |                 |                                                                                      |             |
                      ========================================================================================================================
                      

                      From that table, it is obvious that the handling of these assertions by the N++ Boost engine seems quite weird !!!

                      To be consistent, only two regexes, with similar syntax, should be used :

                      • The regex (^|(?<=\W))\w, which matches the FIRST character of a word

                      • The regex \w((?=\W)|$), which matches the LAST character of a word

                      => The regex (^|(?<=\W))\w|\w((?=\W)|$) matches the first AND the last characters of a word
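
These three regexes can be cross-checked in another engine. The sketch below runs them, unchanged, through Python’s `re` module against the same sentence. It only shows what the patterns select in Python and does not prove anything about the Boost engine. The helper name `matched_text` is hypothetical.

```python
import re

SENTENCE = ("By contrast, the GNU General Public License is intended "
            "to guarantee your freedom...")

FIRST_CHAR = r"(^|(?<=\W))\w"   # first character of each word
LAST_CHAR = r"\w((?=\W)|$)"     # last character of each word


def matched_text(pattern, text):
    """Concatenate every whole match (group 0) of pattern in text."""
    # finditer + group(0) is used because findall would return the
    # contents of the capturing group instead of the matched character.
    return "".join(m.group(0) for m in re.finditer(pattern, text))


first_letters = matched_text(FIRST_CHAR, SENTENCE)
last_letters = matched_text(LAST_CHAR, SENTENCE)
both = matched_text(FIRST_CHAR + "|" + LAST_CHAR, SENTENCE)

print(first_letters)  # BctGGPLiitgyf
print(last_letters)   # yteUlcesdoerm
print(len(both))      # 26
```

The sentence has 13 words, each at least two letters long, so the combined regex yields 26 single-character matches.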

                      Best Regards,

                      guy038
