    What regex to create a search limit boundary?

    • Aporosa

      I need some kind of ‘stop search here / start again’ expression, as in the problem here:

      The examples here are from downloaded metadata in html, from the naturalis.nl biodiversity database. Species names are split up by markup code, which I need to strip out.

      Here’s a simple example:
      <span class="scientific">Piper</span> L.

      So, stripping the markup out is a simple search and replace, resulting in the scientific genus Piper L.
      Find: <span class="scientific">(.*)</span>
      Replace: $1
      Result: Piper L.

      But that strategy does not work on this more common binomial name example:
      Desired result: Ficus septica Burm. fil.
      Original HTML: <span class="scientific">Ficus</span> <span class="scientific">septica</span> Burm. fil.
      Find: <span class="scientific">(.*)</span>
      Replace: $1
      Result: Ficus</span> <span class="scientific">septica Burm. fil.

      I seem to need a way to stop the regex after the first <span ...></span> pair and start afresh for the next one. Instead, the match runs on to the last </span> occurrence.

      How can I improve my regex code?

      • Terry R

        @Aporosa said in What regex to create a search limit boundary?:

        How can I improve my regex code?

        My small test shows this seems to work:
        Find What: <([^>]+)?>
        Replace With: (leave this field empty)

        Terry

        PS: this will also find any empty <> sequences; since they make no sense being there, it removes them as well. Note that it strips every tag it finds, not only the span tags.
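
For anyone who wants to try this outside Notepad++, here is a minimal sketch using Python's re module (an assumption; the thread itself is about Notepad++ search and replace). The names TAG and strip_tags are made up for illustration. It applies the same pattern with an empty replacement to the two examples from the question.

```python
import re

# Same pattern as above: "<", optionally one or more characters
# that are not ">", then ">". The optional group also matches an empty "<>".
TAG = re.compile(r"<([^>]+)?>")


def strip_tags(html: str) -> str:
    """Remove every <...> sequence by replacing it with nothing."""
    return TAG.sub("", html)


binomial = '<span class="scientific">Ficus</span> <span class="scientific">septica</span> Burm. fil.'
genus = '<span class="scientific">Piper</span> L.'

print(strip_tags(binomial))  # Ficus septica Burm. fil.
print(strip_tags(genus))     # Piper L.
```

Because the character class [^>] cannot run past a closing ">", each match stops at the end of its own tag, so greediness is not an issue here.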

        • Terry R

          @Aporosa said in What regex to create a search limit boundary?:

          How can I improve my regex code?

          I looked more closely at your regex. It didn't work as planned because the (.*) is greedy. If you change your regex to <span class="scientific">(.*?)</span>, it should behave as you want: the added ? changes the quantifier from greedy to lazy (non-greedy).

          The difference is that yours matches the longest sequence for which the regex can be true, while the adjusted regex matches the shortest.
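
The greedy versus lazy difference can be reproduced with a small sketch in Python's re module (an assumption; Notepad++ uses its own regex engine, and Python writes the group reference as \1 rather than $1). The variable names are illustrative only.

```python
import re

html = '<span class="scientific">Ficus</span> <span class="scientific">septica</span> Burm. fil.'

# Greedy: (.*) grabs as much as possible, so the match runs to the LAST </span>.
GREEDY = r'<span class="scientific">(.*)</span>'
# Lazy: (.*?) grabs as little as possible, so each match ends at the NEXT </span>.
LAZY = r'<span class="scientific">(.*?)</span>'

greedy_result = re.sub(GREEDY, r"\1", html)
lazy_result = re.sub(LAZY, r"\1", html)

print(greedy_result)  # Ficus</span> <span class="scientific">septica Burm. fil.
print(lazy_result)    # Ficus septica Burm. fil.
```

The greedy pattern matches once across both spans, while the lazy pattern matches twice, once per span, which is the "stop here / start again" behaviour asked for.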

          Terry

          • Aporosa

            @Terry-R said in What regex to create a search limit boundary?:

            note the inclusion of a ? changes the regex from being greedy to lazy (not-greedy)

            Thank you, Terry R. Not only was this a very clearly explained and effective solution, but now I understand the idea of a 'greedy' regex, something I didn't get from the textbook examples that I'd read.

            The Community of users of the Notepad++ text editor.