    Replace specific text that only exists outside of quotation marks / double-quotes.

    • Brent Parker

      Fellow Notepad++ Users,

      Could you please help me with the following search-and-replace problem I am having?

      I would like to replace specific words (such as I, me, my, our, us) with alternative text, but only where they appear outside of double quotation marks (" ").

      Here is an example of some text prior to being edited:

      Just as I was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, I asked Master Palaemon who Vodalus of the Wood was.
      
      "How often have I explained that nothing said by a client under questioning is heard by you?"
      
      "Many times, Master." I said.
      

      Here is how I would like the text to look after applying the edit:

      Just as Severian was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, Severian asked Master Palaemon who Vodalus of the Wood was.
      
      "How often have I explained that nothing said by a client under questioning is heard by you?"
      
      "Many times, Master." Severian said.
      

      Essentially, I’m trying to figure out a RegEx code, or possibly be pointed towards an available Python Script, that will allow me to find specific text that only exists outside of double-quotation marks.

      To accomplish this, I have tried using the following Find/Replace expressions and settings

      Find What = (I )(?=(?:[^"]*"[^"]*")*[^"]*$)
      Replace With = Replacement 
      Search Mode = REGULAR EXPRESSION
      Dot Matches Newline = Attempted both CHECKED and NOT CHECKED
      

      HERE IS WHY YOU THOUGHT YOUR EXPRESSION WOULD WORK: Hours of searching and trying out different things. I do try my best before asking for help, but I’ve been unable to find this exact scenario on the different forums that are available. Any help would be appreciated.

      Unfortunately, this did not produce the output I desired, and I’m not sure why. Could you please help me understand what went wrong and help me find the solution?

      Thank you.
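
For anyone who prefers the Python route mentioned above, here is a minimal sketch of the same "even number of quotes ahead" lookahead using Python's `re` module. The function name and the shortened sample text are assumptions for illustration, and the approach only holds if every straight double quote in the text is properly paired.

```python
import re

# "I" as a whole word, followed by an even number of double quotes
# up to the very end of the text (\Z), i.e. not inside a quoted span.
OUTSIDE_QUOTES = re.compile(r'\bI\b(?=(?:[^"]*"[^"]*")*[^"]*\Z)')


def replace_outside_quotes(text, replacement="Severian"):
    """Hypothetical helper: replace the word I only outside "..." pairs."""
    return OUTSIDE_QUOTES.sub(replacement, text)


sample = (
    'Just as I was about to leave the woman said, "I don\'t know." '
    'Outside, I asked.\n\n"Many times, Master." I said.'
)
result = replace_outside_quotes(sample)
print(result)
```

The lookahead rescans the rest of the text for every candidate, so this can be slow on very long files.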

      • Brent Parker

        @Brent-Parker

        I believe I figured out how to make it work. I converted the straight punctuation in the text over to dedicated left-right punctuation marks and updated the RegEx, and it seems to work so far.

        Updated code : I (?=(?:[^“]*“[^”]*”)*[^”]*$)
        
        • Terry R

          @Brent-Parker

          If you are still having issues we have a FAQ section which happens to have a post referring to some generic solutions, one of which could help you. The referred to post is here.

          As a suggestion you could change the existing quoted text by changing the beginning and ending pairs of quotes to something like @" and "@ which can then be used in the generic regex to identify those start and end sections. Your start and ends will actually be the end and start of 2 different quoted sections, leaving just the very start and end sections of the file to manually adjust. After completing your changes you would restore those quotes.

          Sometimes breaking down the problem can help identify sections of the process which are easy to work on. If you are lucky it can sometimes make an unworkable problem solvable.

          Another idea was to cut all quoted sections out of the file prior to editing, then insert them back. This would involve adding line numbers and using markers to identify where sections were removed.
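
The cut-and-restore idea can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name is hypothetical, the NUL character is assumed not to occur in the text, and the search pattern is assumed not to match digits or NUL (otherwise it could damage the placeholders).

```python
import re

QUOTED = re.compile(r'"[^"]*"')               # one "..." section
PLACEHOLDER = re.compile(r"\x00(\d+)\x00")    # marker left where a section was cut


def edit_outside_quotes(text, pattern, replacement):
    """Hypothetical helper: cut quoted sections out, edit, then put them back."""
    stash = []

    def park(match):
        # Remember the quoted section and leave a numbered marker behind.
        stash.append(match.group(0))
        return f"\x00{len(stash) - 1}\x00"

    masked = QUOTED.sub(park, text)
    edited = re.sub(pattern, replacement, masked)
    # Restore each quoted section at its marker.
    return PLACEHOLDER.sub(lambda m: stash[int(m.group(1))], edited)


result = edit_outside_quotes(
    'Outside, I asked, "Am I late?" I waited.', r"\bI\b", "Severian"
)
print(result)
```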

          Good luck
          Terry

          • Neil Schipper

            @Brent-Parker

            Housekeeping:

            • need to accommodate the case where I appears at the start of a line
            • your recent post included undesired fancy quotes, likely due to copying text from web page

            I took an approach to match and capture as follows:
            <space or linestart (as CG2)><I><space> (all of this as CG1)
            logical-OR
            <"><any text><"> (as CG3)

            If you search on ((^|\h)I\h)|(".*?") you’ll see how it works.

            On a match we’re assured: <both CG1 & CG2 are defined> (exclusive)OR <CG3 is defined>.

            Replace expression will do: if CG1 is defined, write back replacement text with appropriate spacing; otherwise write back CG3

            ?1\2BORIS\x20:\3

            For testing I added a line at start to demonstrate when there’s an I at the start:

            I decided I had to leave for Gorzof.
            
            Just as I was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, I asked Master Palaemon who Vodalus of the Wood was.
            
            "How often have I explained that nothing said by a client under questioning is heard by you?"
            
            "Many times, Master." I said.
            
            BORIS decided BORIS had to leave for Gorzof.
            
            Just as BORIS was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, BORIS asked Master Palaemon who Vodalus of the Wood was.
            
            "How often have I explained that nothing said by a client under questioning is heard by you?"
            
            "Many times, Master." BORIS said.
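
The same match-and-capture logic can be tried outside Notepad++ with a Python sketch. Python's `re` has no `\h` and no conditional replacement string, so this version assumes `[ \t]` as a stand-in for `\h` and uses a callback function (a hypothetical name) in place of the `?1...:...` replacement.

```python
import re

# Group 1: (line start or blank as group 2) + I + blank.
# Group 3: a double-quoted section, matched only so it can be written back.
PATTERN = re.compile(r'((^|[ \t])I[ \t])|(".*?")', re.MULTILINE)


def write_back(match):
    """Hypothetical callback mirroring the conditional replacement."""
    if match.group(1) is not None:
        # Keep the leading blank (if any), then the new name and a space.
        return match.group(2) + "BORIS "
    return match.group(3)


sample = (
    "I decided I had to leave for Gorzof.\n\n"
    '"How often have I explained?"\n\n'
    '"Many times, Master." I said.'
)
result = PATTERN.sub(write_back, sample)
print(result)
```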
            
            • guy038

              Hello, @brent-parker, @terry-r, @neil-schipper and All,

              So, @brent-parker, given this INPUT text :

              I run this regex test " but I suppose it will be OK ". However, I must verify this assertion.
              
              We went out for a walk in the country-side. " On the way back, we can take some cakes ". We often come to this store.
              
              They are going to the movies tonight. " They hope it will be interesting ". They had, indeed, been disappointed the previous time!
              

              Use this regex S/R :

              SEARCH (?x-is) " .+? " (*SKIP) (*F) | \b (?: ( I ) | ( We ) | ( They ) ) \b

              REPLACE (?1Guy)(?2Bob and me)(?3The group)

              And you’ll get your expected OUTPUT text :

              Guy run this regex test " but I suppose it will be OK ". However, Guy must verify this assertion.
              
              Bob and me went out for a walk in the country-side. " On the way back, we can take some cakes ". Bob and me often come to this store.
              
              The group are going to the movies tonight. " They hope it will be interesting ". The group had, indeed, been disappointed the previous time!
              

              Notes :

              • The search regex uses an uncommon modifier, (?x), which enables free-spacing mode so the parts of the regex are easier to identify

              • The search regex uses a special structure, called Backtracking Control Verbs. This specific form (*SKIP) (*F) can be understood as :

              What I don’t want (*SKIP) (*F) | What I want

              • We do not want to touch anything between double quotes, and we do want to replace certain words with alternative text

              • So, any word I is changed into Guy, any word We is changed into Bob and me and any word They is changed into The group, provided they are not found within double-quoted zones ;-))
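
For readers working in Python: the standard `re` module has no backtracking control verbs, so (*SKIP) (*F) is not available there. A rough stand-in, shown here as a sketch with hypothetical names, keeps the "what I don't want | what I want" shape but lets the quoted zone match and writes it back unchanged, while named groups select the replacement for each wanted word.

```python
import re

# Replacement text per named group, taken from the REPLACE expression above.
REPLACEMENTS = {"I": "Guy", "We": "Bob and me", "They": "The group"}

PATTERN = re.compile(
    r'''
      " .+? "        # what we do not want: a double-quoted zone
    | \b (?: (?P<I> I ) | (?P<We> We ) | (?P<They> They ) ) \b
    ''',
    re.VERBOSE,      # free-spacing mode, like (?x)
)


def substitute(match):
    """Hypothetical callback: swap wanted words, keep quoted zones as they are."""
    name = match.lastgroup          # None when the quoted zone matched
    return REPLACEMENTS[name] if name else match.group(0)


sample = (
    'I run this regex test " but I suppose it will be OK ". However, I must verify this assertion.\n'
    'They are going to the movies tonight. " They hope it will be interesting ". They had been disappointed!'
)
result = PATTERN.sub(substitute, sample)
print(result)
```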

              Best Regards,

              guy038

              The Community of users of the Notepad++ text editor.