HELP! Replace spaces in text ONLY between quotes?
-
@PeterJones (Quick clarification for anyone reading this in the future:
"folderA/folderB/“file name with spaces”;
should be:
"folderA/folderB/file name with spaces”;
The extra quote was a typing mistake.)
-
Hello, @mugsys-rapSheet, @PeterJones and All,
I’ve been thinking about the general problem of doing a replacement ONLY inside zones delimited by double quotes ( "......" ). The difficulty comes from the fact that the 2 delimiters are identical. So, to make this work, we must first temporarily change the ending delimiter " into another character, not yet used in the current file. I chose the # character, but any other character, even a control char, would be fine as long as it is not already present in the current file.
So, assuming the test sample below :
This is a " C:\aaa zzz \file with spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt " works correctly
" C:\aaa zzz \file with spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt " works correctly
This is a " C:\aaa zzz \file with spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt "
" C:\aaa zzz \file with spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt "
This is a " file with spaces" test to verify if the regex "file with spaces.txt " works correctly
" file with spaces" test to verify if the regex "file with spaces.txt " works correctly
This is a " file with spaces" test to verify if the regex "file with spaces.txt "
" file with spaces" test to verify if the regex "file with spaces.txt "
This is a " C:\aaa zzz \file with spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt " works correctly
The first regex S/R is, obviously :
SEARCH "(.+?)"
REPLACE "\1#
And we would end with :
This is a " C:\aaa zzz \file with spaces# test to verify if the regex "C:\ aaa zzz\file with spaces.txt # works correctly
" C:\aaa zzz \file with spaces# test to verify if the regex "C:\ aaa zzz\file with spaces.txt # works correctly
This is a " C:\aaa zzz \file with spaces# test to verify if the regex "C:\ aaa zzz\file with spaces.txt #
" C:\aaa zzz \file with spaces# test to verify if the regex "C:\ aaa zzz\file with spaces.txt #
This is a " file with spaces# test to verify if the regex "file with spaces.txt # works correctly
" file with spaces# test to verify if the regex "file with spaces.txt # works correctly
This is a " file with spaces# test to verify if the regex "file with spaces.txt #
" file with spaces# test to verify if the regex "file with spaces.txt #
This is a " C:\aaa zzz \file with spaces# test to verify if the regex "C:\ aaa zzz\file with spaces.txt works correctly
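For anyone who wants to try this first step outside Notepad++, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for the editor's regex engine; the function name `orient_quotes` and the sample line are illustrative only and are not part of the original post.

```python
import re

def orient_quotes(text: str, end_marker: str = "#") -> str:
    """Rewrite every "...." pair as "....# so each zone has distinct ends."""
    if end_marker in text:
        # Assumption carried over from the post: the marker must be unused.
        raise ValueError("end marker already occurs in the text")
    # "(.+?)" lazily pairs each opening quote with the next quote on the line.
    return re.sub(r'"(.+?)"', lambda m: '"' + m.group(1) + end_marker, text)

line = 'This is a " C:\\aaa zzz \\file with spaces" test'
print(orient_quotes(line))  # This is a " C:\aaa zzz \file with spaces# test
```

An unpaired trailing quote is simply left alone, because the lazy group needs a closing quote on the same line.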
Now, here is the regex S/R which changes any space char into an underscore character ( _ ) in zones ".......# and which changes the temporary # character back into the usual double-quote ( " ) :
SEARCH (?-s)([^"#\r\n]*")\x20*|\x20*(#)|(\x20)+(?=.+#)
REPLACE (?1\1)(?2")(?3_)
We get our expected text :
This is a "C:\aaa_zzz_\file_with_spaces" test to verify if the regex "C:\_aaa_zzz\file_with_spaces.txt" works correctly
"C:\aaa_zzz_\file_with_spaces" test to verify if the regex "C:\_aaa_zzz\file_with_spaces.txt" works correctly
This is a "C:\aaa_zzz_\file_with_spaces" test to verify if the regex "C:\_aaa_zzz\file_with_spaces.txt"
"C:\aaa_zzz_\file_with_spaces" test to verify if the regex "C:\_aaa_zzz\file_with_spaces.txt"
This is a "file_with_spaces" test to verify if the regex "file_with_spaces.txt" works correctly
"file_with_spaces" test to verify if the regex "file_with_spaces.txt" works correctly
This is a "file_with_spaces" test to verify if the regex "file_with_spaces.txt"
"file_with_spaces" test to verify if the regex "file_with_spaces.txt"
This is a "C:\aaa_zzz_\file_with_spaces" test to verify if the regex "C:\ aaa zzz\file with spaces.txt works correctly
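The second S/R relies on the conditional replacement syntax (?1...)(?2...)(?3...), which Python's `re` does not have. A rough Python equivalent, offered only as a sketch, uses a replacement function that checks which group took part in the match. The names `ZONE_RE` and `fix_zones` are made up for this illustration.

```python
import re

# Same three alternatives as the SEARCH field. Python's dot does not match
# newlines by default, so the (?-s) prefix is dropped.
ZONE_RE = re.compile(r'([^"#\r\n]*")\x20*|\x20*(#)|(\x20)+(?=.+#)')

def fix_zones(text: str) -> str:
    def repl(m):
        if m.group(1) is not None:   # text up to an opening quote: keep it,
            return m.group(1)        # the spaces right after the quote are dropped
        if m.group(2) is not None:   # temporary # (plus spaces before it)
            return '"'
        return "_"                   # run of spaces inside a zone
    return ZONE_RE.sub(repl, text)

line = ('This is a " C:\\aaa zzz \\file with spaces# test to verify if the '
        'regex "C:\\ aaa zzz\\file with spaces.txt # works correctly')
print(fix_zones(line))
```

Spaces after an opening quote that never gets its # are left untouched, since the lookahead `(?=.+#)` finds nothing on the rest of the line.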
Note that, in the last test line, the last # ending delimiter is missing. So, all space chars after the last unpaired " character are, obviously, not modified !
Best Regards
guy038
P.S. :
Since, when a file is renamed, any leading space char, or any trailing space char after the extension, is omitted from the final name, the regex changes the text :
" C:\xxx\name with spaces.txt "
into :
"C:\xxx\name_with_spaces.txt"
-
@guy038 said:
I’ve been thinking about the general problem of doing a replacement, ONLY between zones delimited with…
So this is great.
I started archiving it in my special “notes” file as a general-purpose solution.
But…
Then I realized it “goes too far” and perhaps its general-purposeness is destroyed.
I’m talking about the P.S. part above.
I’d much rather have seen "____C:\xxx\name with spaces.txt____" as the result (keeps the general-purposeness).
But…
I’m also curious as to why you would solve a problem that the OP didn’t seem to have.
OP said nothing about leading/trailing spaces needing to be removed.
Of course, the OP was really light on sample data, but…
Or am I missing something?
-
Hi, @mugsys-rapSheet, @PeterJones, @alan-kilborn and All,
Well Alan, I just built up the regex that way because it seems to be the default Windows rename behavior ;-) Indeed, on my Win XP laptop, if I try to rename, for instance, the Test.txt file, hitting the F2 key, into " Test.txt ", Windows does not allow this name change !
Now, I realized that, in a DOS console window, the command :
dir > " Test.txt "
does create the file Test.txt with 3 spaces before the word Test, but without any space char after the extension .txt !
So, could you verify on your newer Windows OS ? And, of course, an alternate regex is always possible ;-))
BR
guy038
-
@guy038 ,
I see the same behavior on Win10 command (with quotes around the spaces, the leading spaces are preserved, but trailing spaces are eliminated)
-
How about “Replace text1 with text2 only between delimiter1 and delimiter2”, where delimiter1 and delimiter2 could be the same.
No other assumptions. Meaning, for one thing, that the data doesn’t have to all be on one line.
Maybe that is too much for regex?
(Oooh, a challenge for regex…)
-
Hi, @mugsys-rapSheet, @alan-kilborn and All,
First, I’m just adapting my previous regex S/R to @mugsys-rapSheet's needs, which concern absolute / relative paths to files :
- Now, each single space char is replaced with a single _ char ( instead of several spaces into 1 underscore )
- A filename can contain leading spaces only ( no trailing ones ) and an absolute path, obviously, cannot begin with space characters
This leads to this regex S/R :
SEARCH (?-s)([^"#\r\n]*")(?:\x20*(?=\u:))?|\x20*(#)|(\x20)(?=.+#)
REPLACE (?1\1)(?2")(?3_)
So, from an initial text :
A test to " C:\aaa zzz\ file with spaces .txt # verify the regex " C:\aaa zzz\ file with spaces .txt # behavior
A test to " file with spaces .txt # verify the regex " file with spaces .txt # behavior
we would end with :
A test to "C:\aaa______zzz\_____file_with____spaces__.txt" verify the regex "C:\aaa______zzz\_____file_with____spaces__.txt" behavior
A test to "_____file_with____spaces__.txt" verify the regex "_____file_with____spaces__.txt" behavior
Alan, do you find this replacement more acceptable ?
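As a hedged illustration of the adapted S/R, the sketch below ports it to Python. Two assumptions are mine, not the post's: Python's `re` has no `\u` (uppercase letter) class, so `[A-Z]` stands in for it, and the sample strings with explicit runs of spaces are invented for the demonstration. `PATH_RE` and `fix_paths` are hypothetical names.

```python
import re

# [A-Z] replaces the editor's \u class; (?-s) is dropped because Python's dot
# does not match newlines by default.
PATH_RE = re.compile(
    r'([^"#\r\n]*")(?:\x20*(?=[A-Z]:))?|\x20*(#)|(\x20)(?=.+#)'
)

def fix_paths(text: str) -> str:
    def repl(m):
        if m.group(1) is not None:   # prefix + opening quote; spaces before
            return m.group(1)        # a drive letter are dropped
        if m.group(2) is not None:   # temporary # and the spaces before it
            return '"'
        return "_"                   # one underscore per space
    return PATH_RE.sub(repl, text)

absolute = 'A test to "   C:\\aaa  zzz\\ file .txt  # end'
relative = '"  file name.txt # x'
print(fix_paths(absolute))  # A test to "C:\aaa__zzz\_file_.txt" end
print(fix_paths(relative))  # "__file_name.txt" x
```

Leading spaces survive as underscores only when no drive letter follows them, which mirrors the distinction made between file names and absolute paths.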
Now, regarding the regex challenge, I don’t think that a multi-line search, with the (?s) modifier, would be a problem !
But, of course, we would assume that :
- Any range with identical opening and ending delimiters ( mostly the ' and " characters ) would have been previously changed into an oriented range, like, for instance, ".........# and '.........# !
- The search would not simultaneously look for two or more kinds of ranges ( for instance, ranges ".........# and {..........} ). Too tricky ;-))
Let me get to the bottom of this ;-)) I just hope not to be long !
Best regards,
guy038
-
@guy038 said in HELP! Replace spaces in text ONLY between quotes?:
But, of course, we would assume that…
Yes, those two assumption points are fine!
One thing I forgot to mention earlier is that perhaps text1, text2, delimiter1 and delimiter2 could all be multiple-characters in length (maybe not so obvious for the delimiters?)
-
@guy038 said in HELP! Replace spaces in text ONLY between quotes?:
Alan, do you find this replacement more acceptable ?
I didn’t really mean to “downgrade” the original solution.
It probably helped the OP.
I was just disappointed that it “went too far” so that the general problem solution (that we are now discussing) may have been obscured in the special-purpose result.
But…we’re on the right track now. :-)
-
Hi, @alan-kilborn and All,
Well, here it is ! I found a regex structure and some variants, which do the trick very nicely ;-))
Let’s define these 4 variables :
- Srch_Expr = Char | String | Regex
- Repl_Text = Char | String | Regex
- Opn_Del = Char
- End_Del = Char
Then the Search_Replacement_A, which matches any Srch_Expr ONLY WHEN located between the Opn_Del and the End_Del delimiter(s), is :
- SEARCH Srch_Expr(?=[^Opn_Del]*?End_Del)
- REPLACE Repl_Text
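To make the structure of Search_Replacement_A concrete, here is a small Python sketch that assembles the pattern from the four variables. It is an illustration under assumptions, not the author's tool: `replace_in_zones` is a hypothetical helper, and `repl_text` follows Python's replacement-template rules rather than the editor's.

```python
import re

def replace_in_zones(text, srch_expr, repl_text, opn_del, end_del):
    """Apply srch_expr -> repl_text only between opn_del and end_del."""
    if len(opn_del) != 1 or len(end_del) != 1 or opn_del == end_del:
        raise ValueError("delimiters must be two different single characters")
    opn, end = re.escape(opn_del), re.escape(end_del)
    # Srch_Expr(?=[^Opn_Del]*?End_Del): a match counts only if an End_Del
    # can be reached without crossing another Opn_Del.
    pattern = "(?:" + srch_expr + ")(?=[^" + opn + "]*?" + end + ")"
    return re.sub(pattern, repl_text, text)

print(replace_in_zones("ab &cd ef# gh &ij# kl", r"\w+", "X", "&", "#"))
# ab &X X# gh &X# kl
```

Because a negated character class also matches line breaks, the lookahead can cross lines, so zones that span several lines are handled the same way.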
Rules and Remarks :
- The Opn_Del and End_Del delimiters both represent a single char, which may be :
  - Characters already present in the current file
  - Characters presently absent from the current file and manually added by the user ( Verify with the Count feature that they are new chars ! )
- In case of added delimiters, use, preferably, the Search_Replacement_B, below, which deletes these temporary delimiters at the same time :
  - SEARCH Opn_Del|(Srch_Expr)(?=[^Opn_Del]*?End_Del)|End_Del
  - REPLACE ?1Repl_Text
- The Opn_Del and End_Del delimiters must not be identical. If they are ( usually, cases "...." or '....' ), perform the following regex S/R, first :
  - SEARCH Delim(.+?)Delim
  - REPLACE Opn_Del\1End_Del
  Then, execute the Search_Replacement_C, which rewrites, as well, the initial double delimiters Delim :
  - SEARCH Opn_Del|(Srch_Expr)(?=[^Opn_Del]*?End_Del)|End_Del
  - REPLACE ?1Repl_Text:Delim
- In case the last zone Opn_Del......End_Del ends at the very end of the current file, the End_Del delimiter of the last zone can be omitted and you’ll use the Search_Replacement_D, below :
  - SEARCH Srch_Expr(?=[^Opn_Del]*?(End_Del|\z))
  - REPLACE Repl_Text
- The different zones, between delimiters, must be complete, with their matched different delimiters ( Opn_Del......End_Del )
- The different zones, between delimiters, may be juxtaposed but NOT nested
- A zone can lie on a single line or be split over several lines
- Srch_Expr must not match the Opn_Del delimiter, or part of it
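Search_Replacement_B and Search_Replacement_C differ only in what is written in place of the temporary delimiters: nothing for B, the original Delim for C. The Python sketch below captures both with one hypothetical helper, `replace_and_rewrite_delims`. As a simplification of mine, `repl_text` is treated as a literal string, and the conditional ?1...:... replacement is emulated with a function.

```python
import re

def replace_and_rewrite_delims(text, srch_expr, repl_text,
                               opn_del, end_del, new_delim=""):
    """Replace srch_expr inside zones and rewrite both delimiters.

    new_delim=""  behaves like Search_Replacement_B (delimiters deleted);
    new_delim='"' behaves like Search_Replacement_C (Delim restored).
    """
    opn, end = re.escape(opn_del), re.escape(end_del)
    pattern = re.compile(
        opn + "|(" + srch_expr + ")(?=[^" + opn + "]*?" + end + ")|" + end
    )

    def repl(m):
        # Group 1 is set only when Srch_Expr matched inside a zone;
        # otherwise one of the two delimiters matched.
        return repl_text if m.group(1) is not None else new_delim

    return pattern.sub(repl, text)

sample = "a b &c d# e f"
print(replace_and_rewrite_delims(sample, r"\x20", "_", "&", "#"))       # a b c_d e f
print(replace_and_rewrite_delims(sample, r"\x20", "_", "&", "#", '"'))  # a b "c_d" e f
```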
As an example, let’s use the license.txt file and define :
- Opn_Del = &
- End_Del = #
- Srch_Expr = \w+
I previously verified that these two symbols do not exist in the license.txt file
- Place one or several zones &......# anywhere in this file. The zones may be split over several lines, with the Opn_Del & in a line and an End_Del # on a further line
So, the appropriate search Regex_A is :
SEARCH \w+(?=[^&]*?#)
And imagine that we want to surround any word with the strings [-- and --], in zones &............#, ONLY. Hence, the replace Regex_A is :
REPLACE [--$0--]
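The same word-wrapping S/R can be tried in Python, with one notation change: the editor's $0 (whole match) is spelled \g<0> in Python replacement templates. The sample string below is a made-up stand-in for a passage of license.txt with a single added &......# zone that spans two lines.

```python
import re

# Hypothetical stand-in for a few words of license.txt with one added zone.
sample = "GNU &General Public\nLicense# version 2"

# \w+(?=[^&]*?#) with the replacement [--$0--], written the Python way.
marked = re.sub(r"\w+(?=[^&]*?#)", r"[--\g<0>--]", sample)
print(marked)
```

Only the three words inside the zone are wrapped, including the one on the second line; `GNU`, `version` and `2` stay as they are.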
With the same delimiters, let’s suppose that we want, this time, to find any single standard char, so the regex (?-s). and to replace it with an underscore _ char
However, regarding the rules above, this regex should not match the Opn_Del delimiter. This can be achieved with the regex (?-s)(?!&). , giving the following Search_Replacement_A :
SEARCH (?-s)(?!&).(?=[^&]*?#)
REPLACE _
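A quick Python check, given here only as a sketch on an invented sample, shows why the (?!&) guard matters: without it the dot also matches the opening & itself, because an # is still reachable from there.

```python
import re

GUARDED = re.compile(r"(?!&).(?=[^&]*?#)")   # the S/R from the post
UNGUARDED = re.compile(r".(?=[^&]*?#)")      # same, without the guard

sample = "ab &cd e# fg"
print(GUARDED.sub("_", sample))    # ab &____# fg
print(UNGUARDED.sub("_", sample))  # ab _____# fg
```

In both cases the dot never matches a line break, so a zone split over several lines keeps its line structure.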
Now, Alan, I also found a regex which works when delimiters are simple words, as, for instance, STT and END, and not single characters :
Regex_E = Srch_Expr(?=(?s-i:(?:(?!Opn_Del).)*?End_Del))
Unfortunately, due to the negative look-ahead syntax, (?:(?!Opn_Del).) , this regex fails, even with the light license.txt file :-(( You know : the final wrong “all contents” match ! But it works nicely when few lines are involved !
The safe solution, in that case, is to replace word delimiters with char delimiters, first ( for instance STT -> & and END -> # )
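Both routes can be compared on a small input with the Python sketch below. It assumes Srch_Expr is a single space and uses an invented sample; the function names are hypothetical. Python's engine is a different implementation from the editor's, so the failure described above on larger files may not reproduce here.

```python
import re

# Regex_E with Srch_Expr = \x20, Opn_Del = STT, End_Del = END.
REGEX_E = re.compile(r"\x20(?=(?s-i:(?:(?!STT).)*?END))")

def underscore_tempered(text: str) -> str:
    """Word delimiters handled directly with the tempered dot."""
    return REGEX_E.sub("_", text)

def underscore_via_chars(text: str) -> str:
    """Safe route: swap the words for single chars, replace, swap back."""
    if "&" in text or "#" in text:
        # Assumption: & and # are free to use as temporary delimiters.
        raise ValueError("temporary delimiters already occur in the text")
    tmp = text.replace("STT", "&").replace("END", "#")
    tmp = re.sub(r"\x20(?=[^&]*?#)", "_", tmp)
    return tmp.replace("&", "STT").replace("#", "END")

sample = "a b STT c d\ne END f g STT h END"
print(underscore_tempered(sample))
print(underscore_via_chars(sample))
```

On this sample the two functions give the same result, with underscores only between each STT and its END.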
Finally, a possible, much shorter, solution for @mugsys-rapSheet, using the generic Search_Replacement_C, could be :
SEARCH &|(\x20)(?=[^&]*?#)|#
REPLACE ?1_:"
Of course, assuming this text :
& C:\xxx\name with spaces.txt #
Then, the leading and trailing spaces, around the absolute paths, would not be deleted anymore but just replaced with _ characters. Thus, the final text would be :
"____C:\xxx\name_with_spaces.txt____"
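Putting the two passes together for the quoted-path case gives the following Python sketch. It assumes that & and # do not occur in the input, uses a sample with four spaces on each side of the path, and emulates the ?1_:" replacement with a function. The name `underscore_quoted_spaces` is made up for this example.

```python
import re

def underscore_quoted_spaces(text: str) -> str:
    """Replace every space between paired double quotes with an underscore."""
    if "&" in text or "#" in text:
        raise ValueError("temporary delimiters & and # must not occur in the text")
    # Pass 1: "...."  ->  &....#   (Delim(.+?)Delim -> Opn_Del\1End_Del)
    oriented = re.sub(r'"(.+?)"', r"&\1#", text)
    # Pass 2: &|(\x20)(?=[^&]*?#)|#  with the conditional replacement ?1_:"
    return re.sub(
        r"&|(\x20)(?=[^&]*?#)|#",
        lambda m: "_" if m.group(1) is not None else '"',
        oriented,
    )

line = 'copy "    C:\\xxx\\name with spaces.txt    " now'
print(underscore_quoted_spaces(line))
# copy "____C:\xxx\name_with_spaces.txt____" now
```

Spaces outside the quotes are never touched, because no # can be reached from them without first crossing an &.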
Best regards,
guy038
-