    "Whole Word Only" Option in Combination with Non-Alphanumeric Characters

    Notepad++ & Plugin Development
    13 Posts 4 Posters 858 Views
    • Thomas Knoefel @Alan Kilborn

      @Alan-Kilborn
      Thanks for the hint. I checked the help documentation, and it seems this behavior is normal and something I’ll need to get used to.

      https://npp-user-manual.org/docs/searching/

      • Alan Kilborn @Thomas Knoefel

        @Thomas-Knoefel

        I don’t know if searching in regex mode works “better” or not:

        e.g. \b\QNotepad++\E\b

        • Thomas Knoefel @Alan Kilborn

          @Alan-Kilborn
          I was just curious about this when I first noticed it in the plugin and thought it was a bug. After realizing that this behavior is normal, I still found it a bit odd. Since it cannot be changed, it is simply part of the plugin's feature set.

          • Coises @Thomas Knoefel

            @Thomas-Knoefel

            Based on the Scintilla documentation:

            https://www.scintilla.org/ScintillaDoc.html#searchFlags

            the plain text search with whole word enabled should be equivalent to:

            (?<!\w)\QNotepad++\E(?!\w)

            The implementation says otherwise:

            https://github.com/notepad-plus-plus/notepad-plus-plus/blob/7a401cfacef20a962bdd0cb1cdc47f8a96c0b85a/scintilla/src/Document.cxx#L2063

            “Whole word” effectively implies a word boundary; so it behaves like @Alan-Kilborn’s suggestion:

            \b\QNotepad++\E\b

            and not like the documentation indicates.
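For anyone who wants to see where the two patterns diverge, here is a small sketch. It uses Python's `re` module rather than the Boost regex engine in Notepad++, so treat it as an approximation that should hold for these plain ASCII samples. Python has no `\Q...\E`, so `re.escape` stands in for it; the helper names `lookaround_hit` and `boundary_hit` are made up for this illustration.

```python
import re

# Python's re does not support \Q...\E, so escape the literal instead.
NEEDLE = re.escape("Notepad++")

# What the Scintilla documentation seems to describe.
LOOKAROUND = re.compile(r"(?<!\w)" + NEEDLE + r"(?!\w)")
# The word-boundary form suggested earlier in the thread.
BOUNDARY = re.compile(r"\b" + NEEDLE + r"\b")


def lookaround_hit(text):
    """True if the lookaround pattern finds Notepad++ anywhere in text."""
    return LOOKAROUND.search(text) is not None


def boundary_hit(text):
    """True if the \\b pattern finds Notepad++ anywhere in text."""
    return BOUNDARY.search(text) is not None


if __name__ == "__main__":
    for sample in ["Notepad,Notepad++,Notepad", "use Notepad++ now", "Notepad++x"]:
        print(sample, lookaround_hit(sample), boundary_hit(sample))
```

The trailing `\b` needs a word character right after the final `+`, so the two patterns disagree whenever `Notepad++` is followed by a comma, a space, or the end of the line.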

            • PeterJones @Coises

              @Coises ,

              I respectfully disagree.

              With text Notepad,Notepad++,Notepad, search for Notepad++ with Whole Word checkmarked. It won’t be found, because there is no character-class difference between the + at the end of the word and the comma (,) after it that’s not part of the search string. Thus, the “Check that the given range is has transitions between character classes at both” comment that was in the source-code link is not fulfilled (+ to , is not a character class transition).

              And the manual says, “If the left of your search string is a word character and the right is not (or vice versa), then the characters to the left and right must be of the opposite type, or be spaces, or be the beginning/ending of a line.” The left of the search string is N, a word character; the right is +, punctuation. So the character to the left of the N would have to be non-word (it is), and the character to the right of the + would have to be a word character or a space (it is a comma, which is punctuation and therefore neither). Thus, the Manual correctly describes that Whole Word will not match in that case.

              The Whole Word search for Notepad++ in the string Notepad,Notepad++,Notepad is behaving as described in both the comments of the source code and in the User Manual.
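The character-class rule described above can be modeled in a few lines of Python. This is only a simplified sketch of the idea (a class transition is required at both ends of the match), not a port of the code in Document.cxx. The three classes, the treatment of line edges as spaces, and the names `char_class`, `is_whole_word`, and `find_whole_word` are all assumptions made for illustration.

```python
def char_class(ch):
    """Hypothetical three-way classification: 'space', 'word', or 'punct'."""
    if ch.isspace():
        return "space"
    if ch.isalnum() or ch == "_":
        return "word"
    return "punct"


def class_at(text, i):
    """Class of text[i]; positions outside the text are treated like spaces."""
    return char_class(text[i]) if 0 <= i < len(text) else "space"


def is_whole_word(text, start, end):
    """True if text[start:end] has a character-class transition at both ends."""
    if start >= end:
        return False
    left_ok = class_at(text, start - 1) != class_at(text, start)
    right_ok = class_at(text, end - 1) != class_at(text, end)
    return left_ok and right_ok


def find_whole_word(text, needle):
    """Start offsets where needle occurs and passes the transition check."""
    hits = []
    pos = text.find(needle)
    while pos != -1:
        if is_whole_word(text, pos, pos + len(needle)):
            hits.append(pos)
        pos = text.find(needle, pos + 1)
    return hits


if __name__ == "__main__":
    print(find_whole_word("Notepad,Notepad++,Notepad", "Notepad++"))
    print(find_whole_word("Notepad Notepad++ Notepad", "Notepad++"))
```

Under this model, the `+` to `,` step is punctuation to punctuation, so there is no transition and the comma-separated text yields no hit, while the space-separated text does.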

              • Coises @PeterJones

                @PeterJones said in "Whole Word Only" Option in Combination with Non-Alphanumeric Characters:

                The Whole Word search for Notepad++ in the string Notepad,Notepad++,Notepad is behaving as described in both the comments of the source code and in the User Manual.

                Indeed, it does.

                The documentation I looked at was the Scintilla documentation for the search flags. (Probably because I was thinking more as a plugin developer than as an end user.)

                The Notepad++ User Manual documentation describes the actual behavior correctly.

                • PeterJones @Coises

                  @Coises said in "Whole Word Only" Option in Combination with Non-Alphanumeric Characters:

                  The documentation I looked at was the Scintilla documentation

                  Ah, okay, I misunderstood which “documentation” you were referring to. I agree that Scintilla’s description doesn’t cover the edge cases, though it probably should. (Who knows if they’ve even bothered to learn their own edge cases; I get the feeling that Notepad++ and its associated plugin authors push Scintilla in ways that the Scintilla developers never expected, though presumably other apps that use Scintilla push things in different directions than we do.)

                  • Thomas Knoefel @PeterJones

                    I’ve been thinking about how the “Match Whole Word Only” search option might function:

                    1. Text Segmentation: The entire text is divided into chunks by separating at non-word characters, ensuring symbols like ‘+’ are not included in these chunks.
                    2. Search Within Chunks: The search then strictly focuses on these separated chunks.

                    Interestingly, spaces seem to act as the primary separators: when a space sits next to other non-word characters, it overrides them, and those adjacent non-word characters are included in the chunk.

                    This chunk preparation, which happens without analyzing the search string, likely makes the search faster, especially if the text is prepared in advance. The search then only has to look at these separated chunks.

                    • Alan Kilborn @Thomas Knoefel

                      @Thomas-Knoefel

                      Interesting idea. If you’re a scripter, maybe mock up a demo with a script and show it here?

                      • Coises @PeterJones

                        @PeterJones said in "Whole Word Only" Option in Combination with Non-Alphanumeric Characters:

                        I agree that Scintilla’s description doesn’t cover the edge cases, though it probably should.

                        Having just discovered — to some horror — that Notepad++ uses a modified version of Scintilla (context here) I am no longer inclined to “blame” Scintilla for anything without first doing a lot of investigation.

                        I somehow just assumed that modifying Scintilla would be “off limits” for the Notepad++ project.

                        • Alan Kilborn @Coises

                          @Coises said in "Whole Word Only" Option in Combination with Non-Alphanumeric Characters:

                          that modifying Scintilla would be “off limits” for the Notepad++ project

                          It mostly is.
                          But I think in some areas it was judged to be something that “had to be done”.
