
Find (+n) and replace

  • A
    alexarda
    last edited by Jun 3, 2020, 3:36 AM

    Hi all

    I’m trying to find the best way to insert a character 14 characters after the end of a known string with Notepad++ and PythonScript. My data looks like this:

    <p><a href="http://www.webaddress1">Website 1</a>, Bob's Brilliant Blog 1 Jun 2020</p>
    <p><a href="https://www.webaddress2">Website 2</a>, Rachel's Raucous Readings 30 May 2020</p>
    <p><a href="https://www.webaddress3">Website 3</a>, Alex's Awful Arias 29 May 2020</p>
    <p><a href="http://www.webaddress4">Website 4</a>, Bob's Brilliant Blog 28 May 2020</p>
    

    The date will always be changing, but will only ever use three letter abbreviations (Apr, Jul, Sep etc).

    The end goal is to find all examples where “Bob’s Brilliant Blog” appears and add a single character (“1”) 14 characters after the end of “Blog”, but before </p>.

    <p><a href="http://www.webaddress1">Website 1</a>, Bob's Brilliant Blog 1 Jun 2020  1</p>
    <p><a href="https://www.webaddress2">Website 2</a>, Rachel's Raucous Readings 30 May 2020</p>
    <p><a href="https://www.webaddress3">Website 3</a>, Alex's Awful Arias 29 May 2020</p>
    <p><a href="http://www.webaddress4">Website 4</a>, Bob's Brilliant Blog 28 May 2020 1</p>
    

    Can anyone point me in the right direction?

    • E
      Ekopalypse @alexarda
      last edited by Jun 3, 2020, 7:57 AM

      @alexarda

      If I understand your question correctly, then I think this can do it

      search_string = r"Bob's Brilliant Blog"    
      re_search_for = r"(?<={0}).*(?=</p>)".format(search_string)
      editor.rereplace(re_search_for, lambda m: '{0:<13}1'.format(m.group()))
      

      Let me know if you need a description of the code.

      • A
        alexarda @Ekopalypse
        last edited by Jun 3, 2020, 11:17 PM

        @Ekopalypse

        Really, really appreciate this. It works! Thank you!

        For my own education do you mind stepping through what’s happening with the code?

        • E
          Ekopalypse @alexarda
          last edited by Jun 4, 2020, 9:50 AM

          @alexarda

          search_string = r"Bob's Brilliant Blog"
          re_search_for = r"(?<={0}).*(?=</p>)".format(search_string)
          editor.rereplace(re_search_for, lambda m: '{0:<13}1'.format(m.group()))

          search_string only exists to make editing easier if you also want to search for other strings that follow the same pattern.

          re_search_for is then the actual search pattern: a regex string assembled from 3 parts

          1. (?<={0}) is a lookbehind; the placeholder {0} is filled in by the format function,
            so the result is (?<=Bob's Brilliant Blog)
          2. .* matches any run of characters, including none at all, and is greedy
          3. (?=</p>) is a lookahead, which means the preceding part only matches if </p> follows.

          All in all, that means: look for everything that starts right after Bob's Brilliant Blog
          and ends right before </p>.
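To see what the assembled pattern captures, here is a minimal sketch that runs outside Notepad++. It assumes Python's standard `re` module rather than the PythonScript `editor` object, and the `re.escape` call is an addition that is not in the original script.

```python
import re

# Standalone sketch: Python's re module stands in for Notepad++'s regex engine.
search_string = "Bob's Brilliant Blog"

# re.escape is an extra safeguard (not in the original script) in case the
# search text ever contains regex metacharacters.
pattern = r"(?<={0}).*(?=</p>)".format(re.escape(search_string))

bob_line = '<p><a href="http://www.webaddress1">Website 1</a>, Bob\'s Brilliant Blog 1 Jun 2020</p>'
other_line = '<p><a href="https://www.webaddress2">Website 2</a>, Rachel\'s Raucous Readings 30 May 2020</p>'

# Only the text between the search string and </p> is part of the match.
matched_text = re.search(pattern, bob_line).group()
no_match = re.search(pattern, other_line)

print(repr(matched_text))  # ' 1 Jun 2020'
print(no_match)            # None
```

The lookbehind and lookahead only assert their text; neither becomes part of the match, which is why the replacement leaves the blog name and `</p>` untouched.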

          editor.rereplace now searches via the regex pattern and forwards every match to
          a function that expects a match object as a parameter and is defined here with lambda.
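The callback mechanism can be tried outside the editor as well. The sketch below assumes that `re.sub` from Python's standard library is a reasonable stand-in for `editor.rereplace`: both accept a function that receives each match object and returns the replacement text. The sample data is taken from the question.

```python
import re

# Sample lines from the question, joined into one text buffer.
text = "\n".join([
    '<p><a href="http://www.webaddress1">Website 1</a>, Bob\'s Brilliant Blog 1 Jun 2020</p>',
    '<p><a href="https://www.webaddress2">Website 2</a>, Rachel\'s Raucous Readings 30 May 2020</p>',
    '<p><a href="http://www.webaddress4">Website 4</a>, Bob\'s Brilliant Blog 28 May 2020</p>',
])

pattern = r"(?<={0}).*(?=</p>)".format(re.escape("Bob's Brilliant Blog"))

# re.sub calls the lambda once per match and inserts whatever it returns.
result = re.sub(pattern, lambda m: '{0:<13}1'.format(m.group()), text)
result_lines = result.split("\n")

for line in result_lines:
    print(line)
```

Only the two Bob lines change; the Rachel line passes through unmodified because the lookbehind never succeeds on it.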

          '{0:<13}1'.format(m.group()) finally means:
          0 refers to what is contained in m.group(), and :<13 means that the string from m.group()
          is left-aligned and padded with blanks to a width of 13 characters if it is shorter.
          The 1 after it is simply appended.
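The padding step works on its own in plain Python. In this small sketch the helper name `pad_then_append` is made up for illustration; the format string is the one from the script.

```python
def pad_then_append(text):
    # '<13' left-aligns text and pads it with spaces to 13 characters;
    # text that is already 13 characters or longer is left as it is.
    return '{0:<13}1'.format(text)

short_date = pad_then_append(" 1 Jun 2020")    # 11 characters, gets 2 spaces
long_date = pad_then_append(" 28 May 2020")    # 12 characters, gets 1 space

print(repr(short_date))  # ' 1 Jun 2020  1'
print(repr(long_date))   # ' 28 May 2020 1'
```

In both cases the appended `1` lands in the 14th position after "Blog", which is the placement asked for in the question.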

          That’s it. :-)

          Now that I've written it out, I think a non-greedy search would be safer.

          replace this

          r"(?<={0}).*(?=</p>)"
          

          with that

          r"(?<={0}).+?(?=</p>)"
          

          The change means that there must be at least one character, and the regex matches as little as possible to meet the requirement.
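The difference only shows up when a line contains more than one `</p>`. The sketch below uses Python's `re` module and an invented sample line with two paragraphs on it, so it illustrates the behavior rather than Notepad++ itself.

```python
import re

prefix = re.escape("Bob's Brilliant Blog")
greedy = re.compile(r"(?<={0}).*(?=</p>)".format(prefix))
lazy = re.compile(r"(?<={0}).+?(?=</p>)".format(prefix))

# Hypothetical line with two paragraphs on it, not from the original data.
line = "<p>Bob's Brilliant Blog 1 Jun 2020</p><p>more text</p>"

greedy_match = greedy.search(line).group()
lazy_match = lazy.search(line).group()

print(repr(greedy_match))  # ' 1 Jun 2020</p><p>more text'
print(repr(lazy_match))    # ' 1 Jun 2020'
```

The greedy version runs on to the last `</p>` on the line, so the `1` would be appended in the wrong place; the non-greedy version stops at the first one.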
