    Strange behavior of Find All in Current Document with Regular Expression

    • Todd Hoatson

      I am using 32-bit N++ 7.3, and I have run into IMO some very strange behavior, which I would consider a bug. However, I’m open to having someone show me that this is working as designed, or something…

      I have quite a large (116MB) text file containing XML. I was hoping to make it smaller by removing some unnecessary XML elements, using Regular expression search and replace. To test the search part of this, I set up my regular expression query like this:

      36911f67-87df-4bea-94cd-359ee2eb124d-image.png

      When I click on Find Next, it correctly finds the first match.

      But when I cancel the Find dialog, go back to the top of the document, and open Find again, this time clicking Find All in Current Document, it automagically changes the search parameters prior to searching and doesn’t find anything! Here are the changed search parameters:

      73415e68-2e2a-4b06-bcf4-9dfa5cde8e68-image.png

      The search string now has \n inserted into it, and the Search Mode has been changed! This, of course, is why there are no matches.

      I tried the same thing where the file size is only 3.6MB and contains only 1 match, but the same problem occurs.

      Is this working as designed??? Is this somehow a result of working in a large file? (If so, is there a known limit to the size of file that N++ can reliably handle?)

      • Alan Kilborn @Todd Hoatson

        @Todd-Hoatson

        Notepad++ 7.3 was released on the first day of 2017, making it fairly “old” by today’s measurement standards.
        I would suggest upgrading to the latest release and trying your scenario again.

        • Todd Hoatson

          Sorry, I didn’t realize that there is no automatic update feature in N++, so I didn’t think my version was so back-level. (Is it because I am using N++ portable?)

          I tried again, and this time I found that clicking on Find Next no longer works, so there has been some kind of change. Even though I have checked ". matches newline", the \w option no longer seems to match the CR or LF characters.

          So I changed my regular expression to: </text>[\w\n\r]*</tei.2>

          This now works. Or, at least it seems to give me what I would expect.

          • Alan Kilborn @Todd Hoatson

            @Todd-Hoatson said in Strange behavior of Find All in Current Document with Regular Expression:

            the \w option no longer seems to match the CR or LF characters.

            Hmm… \w has always matched a “word” character, and since line-ending characters are not word characters, they don’t match.
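The point about \w can be checked outside Notepad++. The sketch below uses Python's `re` module purely as a stand-in (Notepad++ uses a different regex engine, so treat this as an illustration of the character classes, not of Notepad++ itself), and the `sample` string is made-up data, since the thread never shows the real file contents.

```python
import re

# Hypothetical sample: the two tags separated only by a Windows line ending.
sample = "</text>\r\n</tei.2>"

# \w alone cannot cross the CR/LF between the tags, so this finds nothing.
word_only = re.search(r"</text>\w*</tei.2>", sample)

# Adding \n and \r to the character class lets the match span the line break.
with_eols = re.search(r"</text>[\w\n\r]*</tei.2>", sample)

# A line feed on its own is not a "word" character.
word_matches_newline = re.fullmatch(r"\w", "\n") is not None

print(word_only, with_eols, word_matches_newline)
```

Note that the unescaped `.` in `tei.2` matches any single character, so `tei\.2` would be the stricter spelling if only a literal dot should match.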

            You don’t really discuss the specifics of the data you are trying to match.
            I might take a “wild guess” that the following regular expression is what you want:

            (?s)</text>.*?</tei.2>

            but as I said, it is a wild guess.

            BTW, the usage of (?s) at the start of the regex allows one to ignore the . matches newline checkbox (something we do in solutions presented here, because it keeps things simpler).

            (?s) is equivalent to . matches newline ticked
            (?-s) is equivalent to . matches newline unticked
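A rough illustration of what the (?s) switch changes, again using Python's `re` as a stand-in engine with invented sample data. One assumption to flag: Python accepts a leading `(?s)` but has no pattern-wide `(?-s)`, so the "unticked" case is shown simply by leaving the flag off.

```python
import re

# Hypothetical sample: some content on its own line between the two tags.
sample = "</text>\n<back>notes</back>\n</tei.2>"

# (?s) lets "." match line endings, like ticking ". matches newline".
dot_all = re.search(r"(?s)</text>.*?</tei.2>", sample)

# Without the flag, "." stops at the first line ending, so nothing is found.
dot_line = re.search(r"</text>.*?</tei.2>", sample)

print(dot_all is not None, dot_line is not None)
```

The lazy `.*?` keeps each match as short as possible, which matters in a large file where the closing tag may appear more than once.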

            • PeterJones @Todd Hoatson

              @Todd-Hoatson said in Strange behavior of Find All in Current Document with Regular Expression:

              I didn’t realize that there is no automatic update feature in N++, so I didn’t think my version was so back-level. (Is it because I am using N++ portable?)

              The portable instance does not enable the auto-updater, because the auto-updater downloads the installer, which would really confuse users.

              If you’re unsure about the version, you can always look here in the Announcements thread, at the official downloads page, or at the latest GitHub release for the truly most-recent release. Or you can look at the same URL the updater uses (https://notepad-plus-plus.org/update/getDownloadUrl.php) to see what version is being pushed to the updater as “most recent” (the XML at that URL only gets updated once the developer is convinced a release is stable enough to go to auto-update).
