Replace specific text that only exists outside of quotation marks / double-quotes.
-
Fellow Notepad++ Users,
Could you please help me with the following search-and-replace problem I am having?
I would like to replace specific words (such as I, me, my, our, us) that appear only outside of double quotation marks (" ") with alternative text.
Here is an example of some text prior to being edited:
Just as I was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, I asked Master Palaemon who Vodalus of the Wood was. "How often have I explained that nothing said by a client under questioning is heard by you?" "Many times, Master." I said.
Here is how I would like the text to look after applying the edit:
Just as Severian was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, Severian asked Master Palaemon who Vodalus of the Wood was. "How often have I explained that nothing said by a client under questioning is heard by you?" "Many times, Master." Severian said.
Essentially, I’m trying to figure out a RegEx code, or possibly be pointed towards an available Python Script, that will allow me to find specific text that only exists outside of double-quotation marks.
To accomplish this, I have tried using the following Find/Replace expressions and settings:
Find What = (I )(?=(?:[^"]*"[^"]*")*[^"]*$)
Replace With = Replacement
Search Mode = REGULAR EXPRESSION
Dot Matches Newline = Attempted both CHECKED and NOT CHECKED
HERE IS WHY YOU THOUGHT YOUR EXPRESSION WOULD WORK: Hours of searching and trying out different things. I do try my best before asking for help, but I’ve been unable to find this exact scenario on the different forums that are available. Any help would be appreciated.
Unfortunately, this did not produce the output I desired, and I’m not sure why. Could you please help me understand what went wrong and help me find the solution?
Thank you.
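For reference, here is a minimal Python sketch of the lookahead idea behind the expression above. It assumes the text uses straight double quotes only and that they are balanced across the whole text. The names `OUTSIDE_QUOTES_I` and `replace_outside_quotes` are made up for illustration, and the sketch uses `\bI\b` instead of `(I )` so that the trailing space is not consumed by the match.

```python
import re

# Match a standalone "I" only when an even number of straight double quotes
# remains between it and the end of the text, i.e. when it sits outside quotes.
OUTSIDE_QUOTES_I = re.compile(r'\bI\b(?=(?:[^"]*"[^"]*")*[^"]*\Z)')


def replace_outside_quotes(text, replacement):
    """Replace every unquoted standalone "I" with `replacement`."""
    return OUTSIDE_QUOTES_I.sub(replacement, text)


sample = 'Outside, I asked him. "How often have I explained?" "Many times." I said.'
print(replace_outside_quotes(sample, "Severian"))
```

Because the lookahead counts quotes all the way to the end of the text, a single unpaired or curly quote anywhere after a match changes the result, so this only behaves as intended on clean input.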
-
I believe I figured out how to make it work. I converted the straight punctuation in the text over to dedicated left-right punctuation marks, and updated the RegEx, and it seems to work so far.
Updated code : I (?=(?:[^“]*“[^”]*”)*[^”]*$)
-
If you are still having issues we have a FAQ section which happens to have a post referring to some generic solutions, one of which could help you. The referred to post is here.
As a suggestion you could change the existing quoted text by changing the beginning and ending pairs of quotes to something like @" and "@ which can then be used in the generic regex to identify those start and end sections. Your start and ends will actually be the end and start of 2 different quoted sections, leaving just the very start and end sections of the file to manually adjust. After completing your changes you would restore those quotes.
Sometimes breaking down the problem can help identify sections of the process which are easy to work on. If you are lucky it can sometimes make an unworkable problem solvable.
Another idea was to cut all quoted sections out of the file prior to editing, then insert them back. This would involve adding line numbers and using markers to identify where sections were removed.
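If a script is an option, the cut-out-and-reinsert idea could be sketched in Python roughly as follows. This is only an illustration under the assumption of straight, balanced double quotes. The function name `edit_outside_quotes` is hypothetical, and no line numbers or markers are needed here because the pieces stay in their original order.

```python
import re


def edit_outside_quotes(text, pattern, replacement):
    """Apply a regex replacement to the unquoted parts of `text` only."""
    # re.split with a capturing group keeps the quoted sections in the result:
    # even indexes hold unquoted text, odd indexes hold the quoted sections.
    pieces = re.split(r'("[^"]*")', text)
    for i in range(0, len(pieces), 2):
        pieces[i] = re.sub(pattern, replacement, pieces[i])
    return "".join(pieces)


sample = 'Just as I was about to leave she said, "I don\'t know." I asked why.'
print(edit_outside_quotes(sample, r"\bI\b", "Severian"))
```

Splitting first means the replacement pattern itself can stay simple, since it never sees quoted text.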
Good luck
Terry
-
Housekeeping:
- need to accommodate case where I appears at start of line
- your recent post included undesired fancy quotes, likely due to copying text from web page
I took an approach to match and capture as follows:
<space or linestart (as CG2)><I><space> (all of this as CG1)
logical-OR
<"><any text><"> (as CG3)
If you search on
((^|\h)I\h)|(".*?")
you’ll see how it works.
On a match we’re assured: <both CG1 & CG2 are defined> (exclusive)OR <CG3 is defined>.
Replace expression will do: if CG1 is defined, write back replacement text with appropriate spacing; otherwise write back CG3
?1\2BORIS\x20:\3
For testing I added a line at start to demonstrate when there’s an I at the start:
I decided I had to leave for Gorzof. Just as I was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, I asked Master Palaemon who Vodalus of the Wood was. "How often have I explained that nothing said by a client under questioning is heard by you?" "Many times, Master." I said.
BORIS decided BORIS had to leave for Gorzof. Just as BORIS was about to leave the woman said, "I don't know. Only, oh, can't you believe I wouldn't tell you if I did? She's gone with Vodalus of the Wood, I don't know where." Outside, feigning ignorance, BORIS asked Master Palaemon who Vodalus of the Wood was. "How often have I explained that nothing said by a client under questioning is heard by you?" "Many times, Master." BORIS said.
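The same capture-group logic can be tried outside Notepad++ with a short Python sketch. It is an approximation, not a drop-in equivalent: Python's `re` module has no `\h`, so `[ \t]` stands in for horizontal whitespace, and a replacement function plays the role of the conditional replace expression. The names `PATTERN`, `swap` and `replace_i` are made up for illustration.

```python
import re

# CG1 = the whole "I" match, CG2 = leading space or line start,
# CG3 = a quoted section. [ \t] approximates \h (an assumption of this sketch).
PATTERN = re.compile(r'((^|[ \t])I[ \t])|(".*?")', re.MULTILINE)


def swap(match):
    # If CG1 is defined, write back CG2 plus the replacement and a space;
    # otherwise write the quoted section (CG3) back unchanged.
    if match.group(1) is not None:
        return match.group(2) + "BORIS "
    return match.group(3)


def replace_i(text):
    return PATTERN.sub(swap, text)


sample = 'I decided I had to leave. "I don\'t know." I said.'
print(replace_i(sample))
```

Matching the quoted sections and writing them back untouched is what keeps the scan from ever looking for `I` inside them.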
-
Hello, @brent-parker, @terry-r, @neil-schipper and All,
So, @brent-parker, given this INPUT text :
I run this regex test " but I suppose it will be OK ". However, I must verify this assertion. We went out for a walk in the country-side. " On the way back, we can take some cakes ". We often come to this store. They are going to the movies tonight. " They hope it will be interesting ". They had, indeed, been disappointed the previous time!
Use this regex S/R :
SEARCH
(?x-is) " .+? " (*SKIP) (*F) | \b (?: ( I ) | ( We ) | ( They ) ) \b
REPLACE
(?1Guy)(?2Bob and me)(?3The group)
And you’ll get your expected OUTPUT text :
Guy run this regex test " but I suppose it will be OK ". However, Guy must verify this assertion. Bob and me went out for a walk in the country-side. " On the way back, we can take some cakes ". Bob and me often come to this store. The group are going to the movies tonight. " They hope it will be interesting ". The group had, indeed, been disappointed the previous time!
Notes :
- The search regex uses a non-common modifier (?x) which enables the free-space mode for a better identification of the regex parts
- The search regex uses a special structure, called Backtracking Control Verbs. This specific form (*SKIP) (*F) can be understood as :
What I don’t want (*SKIP) (*F) | What I want
- We do not want everything between double-quotes and we do want to replace some words with an alternative text, accordingly !
- So, any word I is changed by Guy, any word We is changed into Bob and me and any word They is replaced by The group, when they all are not found within double-quotes zones ;-))
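For anyone who wants to try the "What I don’t want | What I want" idea in a script: as far as I know, Python's standard `re` module does not support backtracking control verbs, so the sketch below swaps in a different technique. The quoted zones are matched by the first alternative and simply written back unchanged. The names `REPLACEMENTS`, `PATTERN`, `swap` and `replace_words` are made up for illustration.

```python
import re

# Word-to-replacement table taken from the S/R above.
REPLACEMENTS = {"I": "Guy", "We": "Bob and me", "They": "The group"}

# First alternative: a quoted zone (what we don't want), kept as-is.
# Second alternative: the whole words to replace (what we want), captured.
PATTERN = re.compile(r'".+?"|\b(I|We|They)\b')


def swap(match):
    word = match.group(1)
    # group(1) is None when the quoted-zone alternative matched.
    return REPLACEMENTS[word] if word is not None else match.group(0)


def replace_words(text):
    return PATTERN.sub(swap, text)


sample = 'I run this test " but I suppose it is OK ". We often come here. " They hope so ". They had left!'
print(replace_words(sample))
```

The search stays case-sensitive, like `(?-i)` in the regex above, so the lowercase `we` is never touched.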
Best Regards,
guy038
-