    Strange behavior of Find All in Current Document with Regular Expression

    General Discussion
    • Todd Hoatson
      Todd Hoatson last edited by

      I am using 32-bit N++ 7.3, and I have run into IMO some very strange behavior, which I would consider a bug. However, I’m open to having someone show me that this is working as designed, or something…

      I have quite a large (116MB) text file containing XML. I was hoping to make it smaller by removing some unnecessary XML elements, using Regular expression search and replace. To test the search part of this, I set up my regular expression query like this:

      [Screenshot of the Find dialog settings: 36911f67-87df-4bea-94cd-359ee2eb124d-image.png]

      When I click on Find Next, it correctly finds the first match.

      But when I Cancel the Find dialog, go back to the top of the document, then do Find again, this time clicking on Find All in Current Document, it automagically changes the search parameters prior to searching and doesn’t find anything! Here are the changed search parameters:

      [Screenshot of the changed Find dialog settings: 73415e68-2e2a-4b06-bcf4-9dfa5cde8e68-image.png]

      The search string now has \n inserted into it, and the Search Mode has been changed! This, of course, is why there are no matches.

      I tried the same thing where the file size is only 3.6MB and contains only 1 match, but the same problem occurs.

      Is this working as designed??? Is this somehow a result of working in a large file? (If so, is there a known limit to the size of file that N++ can reliably handle?)

      • Alan Kilborn
        Alan Kilborn @Todd Hoatson last edited by

        @Todd-Hoatson

        Notepad++ 7.3 was released on the first day of 2017, making it fairly “old” by today’s measurement standards.
        I would suggest upgrading to the latest release and trying your scenario again.

        • Todd Hoatson
          Todd Hoatson last edited by

          Sorry, I didn’t realize that there is no automatic update feature in N++, so I didn’t think my version was so back-level. (Is it because I am using N++ portable?)

          I tried again, and this time I found that clicking on Find Next no longer works. So there has been some kind of change. Even though I have checked . matches newline, the \w option no longer seems to match the CR or LF characters.

          So I changed my RegEx expression to: </text>[\w\n\r]*</tei.2>

          This now works. Or, at least it seems to give me what I would expect.

          • Alan Kilborn
            Alan Kilborn @Todd Hoatson last edited by

            @Todd-Hoatson said in Strange behavior of Find All in Current Document with Regular Expression:

            the \w option no longer seems to match the CR or LF characters.

            Hmm… \w has always matched a “word” character, and since line-ending characters are not word characters, they don’t match.
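
            As a rough illustration of that point, here is a minimal sketch using Python's re module. This is an assumption-laden stand-in: Python's engine is not the one Notepad++ uses, and the sample text is made up, but \w behaves the same way with respect to line endings in both.

```python
import re

# Hypothetical sample: two closing tags separated by a Windows line ending (CR LF).
sample = "</text>\r\n</tei.2>"

# \w covers letters, digits, and underscore only, so it cannot consume CR or LF.
word_only = re.search(r"</text>\w*</tei.2>", sample)

# Adding \n and \r to the character class lets the match span the line break.
with_newlines = re.search(r"</text>[\w\n\r]*</tei.2>", sample)

print(word_only is None)          # the \w-only pattern finds nothing
print(with_newlines is not None)  # the widened class finds the pair of tags
```

            Note that the character class only bridges word characters and line endings; any spaces, tabs, or other markup between the two tags would still prevent a match.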

            You don’t really discuss the specifics of the data you are trying to match.
            I might take a “wild guess” that the following regular expression is what you want:

            (?s)</text>.*?</tei.2>

            but as I said, it is a wild guess.

            BTW, the usage of (?s) at the start of the regex allows one to ignore the . matches newline checkbox (something we do in solutions presented here, because it keeps things simpler).

            (?s) is equivalent to . matches newline ticked
            (?-s) is equivalent to . matches newline unticked
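
            The effect of (?s) can be sketched outside Notepad++ as well. The snippet below uses Python's re module, which is a different engine, and a made-up sample string, so treat it as an illustration of the idea rather than of Notepad++ itself.

```python
import re

# Hypothetical sample: two records, each with the closing tags on separate lines.
sample = "<text>a</text>\n<x/>\n</tei.2>\n<text>b</text>\n</tei.2>"

pattern = r"</text>.*?</tei.2>"

# Without (?s), "." does not match a newline, so nothing is found here.
no_dotall = re.findall(pattern, sample)

# With (?s) at the start, "." also matches newlines. The lazy ".*?" stops at
# the first "</tei.2>" instead of running on to the last one.
dotall = re.findall("(?s)" + pattern, sample)

print(len(no_dotall))  # number of matches without (?s)
print(len(dotall))     # number of matches with (?s)
```

            Because .*? is lazy, each match ends at the nearest closing tag; a greedy .* with (?s) would instead swallow everything up to the last </tei.2> in the text.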

            • PeterJones
              PeterJones @Todd Hoatson last edited by

              @Todd-Hoatson said in Strange behavior of Find All in Current Document with Regular Expression:

              I didn’t realize that there is no automatic update feature in N++, so I didn’t think my version was so back-level. (Is it because I am using N++ portable?)

              The portable instance does not enable the auto-updater, because the auto-updater downloads the installer, which would really confuse users.

              If you’re unsure about the version, you can always look here in the Announcements thread, at the official downloads page, or at the latest GitHub release for the truly most-recent release. Or you can look at the same URL that the updater uses (https://notepad-plus-plus.org/update/getDownloadUrl.php) to see which version is being pushed to the updater as “most recent” (the XML at that URL only gets updated once the developer is convinced a release is stable enough to go to auto-update).
