
    Potential bug when searching for a string like |CR|?

    • Harry Chen

      Feature Find search string

      • Harry Chen @Harry Chen

        It looks like there is a bug.
        If I search for ||CR, I can locate it.
        Could someone check it?
        Here is a sample line of text:

        Payment Code||CR|+9.0700|1058527||

        Thanks, --Harry

        • guy038

          Hello, Harry,

          I don’t quite understand what the problem is with this simple search ?!

          Actually, I don’t think that you are really searching for the exact string “|CR|” : it would obviously be found, in any search mode !

          After a while, I understood that you would probably like to search for the Pipe character |, at the end of a line, followed by the End of Line character(s), followed, again, by the Pipe character |, at the beginning of the next line, like below, wouldn’t you ?

          Payment Code||
          |+9.0700|1058527||
          

          If so, there are TWO ways to get it :

          • Using the Extended search mode, search for |\r\n|. Note that this works for DOS/Windows files, which have TWO End of Line characters : CARRIAGE RETURN \r ( \x0D ) and LINE FEED \n ( \x0A ). However, if you work with UNIX/OSX files, which have only the LINE FEED character as EOL, search for |\n| instead

          • Using the Regular expression search mode, search for \|\R\|. This time, the PIPE character must be escaped ( \| ), to be taken literally, as this character has a special meaning in regexes. And the \R represents any EOL ( \r\n in DOS/Windows files, \n in Unix/OSX files, or \r in old Mac files )

          I presume that you would like to delete the matched text, in order to get the final text :

          Payment Code|+9.0700|1058527||
          

          If so, just leave the Replace With zone, EMPTY !
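For anyone who wants to try the same replacement outside Notepad++, here is a minimal Python sketch. It is only an illustration, not what Notepad++ does internally: Python's re module does not support \R, so the pattern spells out the three EOL forms, and the name join_split_pipes is made up for this example.

```python
import re

# Python's re has no \R, so the three EOL forms are listed explicitly.
# \r\n comes first so that a Windows EOL is consumed as a single unit.
EOL_BETWEEN_PIPES = re.compile(r"\|(?:\r\n|\n|\r)\|")


def join_split_pipes(text: str) -> str:
    """Delete every 'pipe, EOL, pipe' sequence (empty replacement)."""
    return EOL_BETWEEN_PIPES.sub("", text)


# Windows-style sample: the line break sits between two pipes.
sample = "Payment Code||\r\n|+9.0700|1058527||"
print(join_split_pipes(sample))  # Payment Code|+9.0700|1058527||
```

The empty replacement string plays the role of the empty Replace With zone.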

          Best Regards,

          guy038

          • Harry Chen

            Thanks for checking it. Actually, the issue is with a simple search for the text “||CR|”.
            If you create a sample text file, paste the string “Payment Code||CR|+9.0700|1058527||” into it, save it, open it in Notepad++ and search for “||CR|”, you will see that it cannot be found. It looks like a bug.
            Let me know if you would like a web conference so I can show you. My email is xschen2000 at hotmail.
            Sorry I was not able to write more to make it clearer: as a newbie in this community, I was restricted from adding more comments.
            Thanks, --Harry

            • guy038

              Harry,

              I still don’t understand ! Once I copied your example ( Payment Code||CR|+9.0700|1058527|| ) into a new tab, in my 6.8.1 N++ version, a search for the string ||CR| does match these characters, in either normal or extended search mode :-)

              Which options are checked, in your search window ? Just note that the Match whole word only option must be unchecked, in order to match the string ||CR|

              But, if your example is changed to Payment Code ||CR| +9.0700|1058527||, with spaces surrounding the string ||CR|, this time the string is also matched when the Match whole word only option is set. Logical, as the string ||CR| is, now, surrounded by non-word characters ( spaces ) !

              Cheers,

              guy038

              • Harry Chen

                When the Match whole word only option is unchecked, the issue is gone. When the Match whole word only option is checked, I am not able to find the string ||CR|. My Notepad++ version is 6.8.6.
                Please let me know if you can reproduce it on that version.

                • guy038

                  Hi, Harry,

                  I investigated, a bit, the different possibilities for a string to match, in Normal or Extended search mode, given the characters immediately surrounding it.

                  First of all, as a reminder, if the Match whole word only option is UNCHECKED, any occurrence of the searched string is ALWAYS found, whatever the characters of the string itself and/or the characters surrounding the string are !

                  So we only have to study, the behaviour of the search engine, when the Match whole word only option is CHECKED


                  Let’s consider the THREE classes of character, below :

                  • The Class “W” containing WORD characters \w ( that is to say, the range [0-9_A-Za-z] + all the accented characters and a few others, in the range [\x80-\xFF] )

                  • The Class “C” containing the usual BLANK, EOL and CONTROL characters ( In other words, the range [\x00-\x20] )

                  • The Class “N” containing the NON-WORD characters, different from the CONTROL ones ( that is to say, the list of characters : [!"#$%&'()*+,./:;<=>?@\[\\\]^`{|}~-] and a few other symbols, in the range [\x80-\xFF] )
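As a rough illustration of these three classes, here is a small Python sketch. The function name char_class is hypothetical, and the treatment of characters above \x7F (everything that str.isalnum() accepts counts as Class W) is an assumption made for the example, not a description of how Notepad++ classifies them.

```python
def char_class(ch: str) -> str:
    """Return 'W', 'C' or 'N' for a single character."""
    if ch <= "\x20":
        # Blank, EOL and control characters: the range [\x00-\x20].
        return "C"
    if ch == "_" or ch.isalnum():
        # Digits, letters and underscore; accented letters are assumed
        # to be word characters as well.
        return "W"
    # Everything else: punctuation and symbols such as | + [ ] @ /
    return "N"


for ch in ["e", "|", "+", " ", "\t", "_", "9"]:
    print(repr(ch), char_class(ch))
```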

                  Then, IF the Match whole word only option is CHECKED, the SEARCHED string will be seen as a WHOLE WORD, and its occurrences will be found, IF, and ONLY IF, ONE of the FOUR cases, below, is TRUE :

                  •-----------•-------------------------•-------------------------•---------------------------•---------------------------•
                  |   Cases   |   FIRST Character of    |    LAST Character of    |   Character just BEFORE   |   Character just  AFTER   |
                  |   MATCH   |   the SEARCHED string   |   the SEARCHED string   |    the SEARCHED string    |   the SEARCHED string     |
                  •-----------•-------------------------•-------------------------•---------------------------•---------------------------•
                  |           |                         |                         |                           |                           |
                  |     1     |         Class W         |         Class W         |       Class N or C        |       Class N or C        |
                  |           |                         |                         |                           |                           |
                  |     2     |         Class W         |         Class N         |       Class N or C        |       Class W or C        |
                  |           |                         |                         |                           |                           |
                  |     3     |         Class N         |         Class W         |       Class W or C        |       Class N or C        |
                  |           |                         |                         |                           |                           |
                  |     4     |         Class N         |         Class N         |       Class W or C        |       Class W or C        |
                  |           |                         |                         |                           |                           |
                  •-----------•-------------------------•-------------------------•---------------------------•---------------------------•
                  

                  In ALL the other cases, and, especially, when the searched string starts and/or ends with a character of the Class C, NO occurrence can be EVER found !

                  In other words, we can deduce that a string is MATCHED and seen, from the search engine, as a “whole word”, IF :

                  • The class of the character, just BEFORE the searched string, is DIFFERENT from the class of the FIRST character of the string

                  AND

                  • The class of the character, just AFTER the searched string, is DIFFERENT from the class of the LAST character of the string

                  Therefore, Harry, when you try to search for the string ||CR|, in the text Payment Code||CR|+9.0700|1058527||:

                  • That string ||CR| corresponds to the case #4, above ( The NON-word symbol | starts and ends your string )

                  • The character, just BEFORE the string, is the lowercase letter e, which belongs to the Class W

                  • The character, just AFTER the string, is the + symbol, which, unfortunately belongs to the Class N

                  So, the four conditions of the case 4 can’t all be satisfied at the same time => Your specific string, located in your specific text, can’t be seen as a whole word, regarding the rules above. And it can be found, only, if the Match whole word only option is UNCHECKED !

                  Cheers,

                  guy038

                  P.S. :

                  I just forgot, yesterday, to give some examples !

                  Assume that we use the normal or extended search mode, with the Match whole word only option CHECKED, in the Find dialog !

                  So, given the list of 10 lines, below :

                  X@Y /Z
                  ABX@Y /ZF
                  A\X@Y /Z
                  A X@Y /Z-EF
                  {---X@Y /Z---}
                  [X@Y /Z]
                  AB[X@Y /Z]CD
                  ---[X@Y /Z]CD
                  AB[X@Y /Z]===
                  		[X@Y /Z]
                  

                  Note : the 10th line begins with two tabulation characters

                  Then :

                  • Search for the string X@Y /Z ( Case 1, in the table above ) => The lines 1 and from 3 to 10, included, are matched

                  • Search for the string X@Y /Z] ( Case 2, in the table above ) => The lines 6 to 8, included, and line 10 are matched

                  • Search for the string [X@Y /Z ( Case 3, in the table above ) => The lines 6, 7, 9 and 10 are matched

                  • Search for the string [X@Y /Z] ( Case 4, in the table above ) => The lines 6, 7 and 10 are matched
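The four results above can be reproduced with a short Python sketch of the rule as described in this post. It is a model of the observed behaviour, not the actual Notepad++ search code. The names char_class, is_whole_word and whole_word_lines are made up for the example, and two points are assumptions: the start and end of a line are treated as Class C, and any non-ASCII character accepted by str.isalnum() is treated as Class W.

```python
def char_class(ch):
    """Classify one character as 'C' (control/blank), 'W' (word) or 'N'."""
    if ch <= "\x20":
        return "C"
    return "W" if ch == "_" or ch.isalnum() else "N"


def is_whole_word(line, start, end):
    """Apply the rule: the classes must differ on both sides of the match."""
    first, last = char_class(line[start]), char_class(line[end - 1])
    if first == "C" or last == "C":
        return False  # a string starting or ending with Class C never matches
    # Assumption: a line boundary behaves like an EOL, i.e. Class C.
    before = char_class(line[start - 1]) if start > 0 else "C"
    after = char_class(line[end]) if end < len(line) else "C"
    return before != first and after != last


def whole_word_lines(lines, needle):
    """Return the 1-based numbers of the lines holding a whole-word match."""
    found = []
    for number, line in enumerate(lines, start=1):
        pos = line.find(needle)
        while pos != -1:
            if is_whole_word(line, pos, pos + len(needle)):
                found.append(number)
                break
            pos = line.find(needle, pos + 1)
    return found


LINES = [
    "X@Y /Z", "ABX@Y /ZF", "A\\X@Y /Z", "A X@Y /Z-EF", "{---X@Y /Z---}",
    "[X@Y /Z]", "AB[X@Y /Z]CD", "---[X@Y /Z]CD", "AB[X@Y /Z]===",
    "\t\t[X@Y /Z]",
]

for needle in ["X@Y /Z", "X@Y /Z]", "[X@Y /Z", "[X@Y /Z]"]:
    print(needle, whole_word_lines(LINES, needle))

# Harry's text: the + after the string is Class N, like the final |
print(whole_word_lines(["Payment Code||CR|+9.0700|1058527||"], "||CR|"))  # []
```

With spaces around the string, as in Payment Code ||CR| +9.0700|1058527||, the same function reports a match, because the space belongs to Class C.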

                  • Harry Chen

                    Thanks so much for your detailed reply. Appreciate it.
                    I am not sure what purpose the “Match whole word only” option is designed for, but it looks like it might be working as designed. We just need to remember that the option has to be UNCHECKED.
                    As a light user of Notepad++, I find the “Match whole word only” option misleading. I can only cross my fingers and hope that ordinary Notepad++ users know the impact of that option.
                    Frankly, I do not fully understand why the simple search feature involves so many hidden categories and criteria.
