    HELP! Replace spaces in text ONLY between quotes?

    • Mugsys RapSheetM
      Mugsys RapSheet @PeterJones
      last edited by

      @PeterJones Quick clarification for anyone reading this in the future:

      "folderA/folderB/“file name with spaces”;

      Should be:

      "folderA/folderB/file name with spaces”;

      (The extra quote was a typing mistake.)

      • guy038G
        guy038
        last edited by guy038

        Hello, @mugsys-rapSheet, @PeterJones and All,

        I’ve been thinking about the general problem of doing a replacement ONLY within zones delimited by double quotes ( "......" )

        The difficulty comes from the fact that the two delimiters are identical. So, as a first step, we temporarily change the ending " delimiter into another character that is not yet used in the current file. I chose the # character, but any other character, even a control char, would be fine as long as it is not already present in the current file.

        So, assuming the test sample, below :

            This is a "     C:\aaa zzz \file with spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    "   works correctly
        "     C:\aaa zzz \file with spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    "   works correctly
        This is a "     C:\aaa zzz \file with spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    "
        "     C:\aaa zzz \file with spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    "
        
        This is a "     file with spaces" test to verify    if the regex "file with spaces.txt    "   works correctly        
        "     file with spaces" test to verify    if the regex "file with spaces.txt    "   works correctly
        This is a "     file with spaces" test to verify    if the regex "file with spaces.txt    "
        "     file with spaces" test to verify    if the regex "file with spaces.txt    "
        
        This is a "     C:\aaa zzz \file with spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    "   works correctly
        

        The first regex S/R is, obviously :

        SEARCH "(.+?)"

        REPLACE "\1#

        And we would end with :

            This is a "     C:\aaa zzz \file with spaces# test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    #   works correctly
        "     C:\aaa zzz \file with spaces# test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    #   works correctly
        This is a "     C:\aaa zzz \file with spaces# test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    #
        "     C:\aaa zzz \file with spaces# test to verify    if the regex "C:\ aaa zzz\file with spaces.txt    #
        
        This is a "     file with spaces# test to verify    if the regex "file with spaces.txt    #   works correctly        
        "     file with spaces# test to verify    if the regex "file with spaces.txt    #   works correctly
        This is a "     file with spaces# test to verify    if the regex "file with spaces.txt    #
        "     file with spaces# test to verify    if the regex "file with spaces.txt    #
        
        This is a "     C:\aaa zzz \file with spaces# test to verify    if the regex "C:\ aaa zzz\file with spaces.txt      works correctly
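For readers who want to try this first step outside Notepad++, here is a minimal Python sketch. It is only an approximation under stated assumptions: Python's `re` module stands in for the Notepad++ regex engine, and `orient_quotes` is a hypothetical helper name, not something from the thread.

```python
import re

def orient_quotes(text, marker="#"):
    """Turn the closing quote of each "..." pair into a temporary marker."""
    if marker in text:
        # The marker must be a character that is not already used in the text.
        raise ValueError(f"marker {marker!r} is already present in the text")
    # Same idea as SEARCH "(.+?)" / REPLACE "\1#
    # (in Python, the dot does not match line breaks by default).
    return re.sub(r'"(.+?)"', lambda m: '"' + m.group(1) + marker, text)

sample = 'This is a "  file with spaces" test "file 2.txt  "   ok'
print(orient_quotes(sample))
```

The explicit check on the marker mirrors the advice above to pick a character that is absent from the file.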
        

        Now, here is the regex S/R which, in zones ".......# , changes any run of space chars into a single underscore character ( _ ), drops the spaces right after the opening " and right before the # , and changes the temporary # character back into the usual double-quote ( " ) :

        SEARCH (?-s)([^"#\r\n]*")\x20*|\x20*(#)|(\x20)+(?=.+#)

        REPLACE (?1\1)(?2")(?3_)

        We get our expected text :

            This is a "C:\aaa_zzz_\file_with_spaces" test to verify    if the regex "C:\_aaa_zzz\file_with_spaces.txt"   works correctly
        "C:\aaa_zzz_\file_with_spaces" test to verify    if the regex "C:\_aaa_zzz\file_with_spaces.txt"   works correctly
        This is a "C:\aaa_zzz_\file_with_spaces" test to verify    if the regex "C:\_aaa_zzz\file_with_spaces.txt"
        "C:\aaa_zzz_\file_with_spaces" test to verify    if the regex "C:\_aaa_zzz\file_with_spaces.txt"

        This is a "file_with_spaces" test to verify    if the regex "file_with_spaces.txt"   works correctly
        "file_with_spaces" test to verify    if the regex "file_with_spaces.txt"   works correctly
        This is a "file_with_spaces" test to verify    if the regex "file_with_spaces.txt"
        "file_with_spaces" test to verify    if the regex "file_with_spaces.txt"

        This is a "C:\aaa_zzz_\file_with_spaces" test to verify    if the regex "C:\ aaa zzz\file with spaces.txt      works correctly
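The second S/R relies on the conditional replacement syntax `(?1\1)(?2")(?3_)`, which Python's `re` module does not offer. A rough Python equivalent, assuming a callback is an acceptable stand-in, decides the replacement from whichever group took part in the match. The leading `(?-s)` is dropped because the Python dot does not match line breaks by default; `underscore_zones` is a hypothetical name.

```python
import re

# Same three alternatives as the SEARCH regex above, without the (?-s) prefix.
PATTERN = re.compile(r'([^"#\r\n]*")\x20*|\x20*(#)|(\x20)+(?=.+#)')

def _pick(match):
    if match.group(1) is not None:   # text up to an opening quote: keep it, drop spaces after
        return match.group(1)
    if match.group(2) is not None:   # temporary # (and spaces before it): back to a quote
        return '"'
    return "_"                       # run of spaces inside a zone: one underscore

def underscore_zones(text):
    return PATTERN.sub(_pick, text)

line = 'a "  C:\\aa bb \\file x# and "y  z.txt  #  end'
print(underscore_zones(line))
```

With this input, the spaces outside the zones are left alone, while each run of spaces inside a zone collapses to one underscore.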
        

        Note that, in the last test line, the last # ending delimiter is missing. So all the space chars located after the last, unpaired, " character are, as expected, not modified !

        Best Regards

        guy038

        P.S. :

        As any leading space char, and any trailing space char after the extension, is dropped from the final name when a file is renamed, the regex changes the text :

        "    C:\xxx\name with spaces.txt    "
        

        as :

        "C:\xxx\name_with_spaces.txt"
        
        • Alan KilbornA
          Alan Kilborn @guy038
          last edited by

          @guy038 said:

          I’ve been thinking about the general problem of doing a replacement, ONLY between zones delimited with…

          So this is great.
          I started archiving it in my special “notes” file as a general-purpose solution.

          But…
          Then I realized it “goes too far” and perhaps its general-purposeness is destroyed.
          I’m talking about the P.S. part above.
          I’d much rather have seen "____C:\xxx\name with spaces.txt____" as the result (keeps the general-purposeness).

          But…
          I’m also curious as to why you would solve a problem that the OP didn’t seem to have.
          OP said nothing about leading/trailing spaces needing to be removed.
          Of course, the OP was really light on sample data, but…

          Or am I missing something?

          • guy038G
            guy038
            last edited by guy038

            Hi, @mugsys-rapSheet, @PeterJones, @alan-kilborn and All,

            Well Alan, I just built up the regex that way, because it seems to be the default Windows rename behavior ;-) Indeed, on my Win XP laptop, if I try to rename, for instance, the Test.txt file, hitting the F2 key, into " Test.txt ", Windows does not allow this name change !

            Now, I realized that, in a DOS console window, the command :

            dir > "   Test.txt   "
            

            does create the file :

               Test.txt
            

            with 3 spaces before the word Test, but without any space char after the extension .txt !

            So, could you verify this on your newer Windows OS ? And, of course, an alternate regex is always possible ;-))

            BR

            guy038

            • PeterJonesP
              PeterJones @guy038
              last edited by

              @guy038 ,

              I see the same behavior on Win10 command (with quotes around the spaces, the leading spaces are preserved, but trailing spaces are eliminated)

              • Alan KilbornA
                Alan Kilborn @guy038
                last edited by

                @guy038

                How about “Replace text1 with text2 only between delimiter1 and delimiter2”, where delimiter1 and delimiter2 could be the same.

                No other assumptions. Meaning, for one thing, that the data doesn’t have to all be on one line.

                Maybe that is too much for regex?
                (Oooh, a challenge for regex…)

                • guy038G
                  guy038
                  last edited by

                  Hi, @mugsys-rapSheet, @alan-kilborn and All,

                  First, I’m just adapting my previous regex S/R to the needs of @mugsys-rapSheet, which concern absolute / relative paths to files :

                  • Now, each single space char is replaced with a single _ char ( instead of several spaces being replaced with 1 underscore )

                  • As a filename can contain leading spaces, but not trailing ones, leading spaces are now kept ( as _ chars ). However, as an absolute path obviously cannot begin with space characters, any space before a drive letter is still deleted

                  This leads to this regex S/R :

                  SEARCH (?-s)([^"#\r\n]*")(?:\x20*(?=\u:))?|\x20*(#)|(\x20)(?=.+#)

                  REPLACE (?1\1)(?2")(?3_)

                  So from an initial text :

                  A test to "    C:\aaa      zzz\     file with    spaces  .txt     #  verify the regex   "    C:\aaa      zzz\     file with    spaces  .txt     #    behavior
                  
                  A test to "     file with    spaces  .txt     #  verify the regex   "     file with    spaces  .txt     #    behavior
                  

                  we would end with :

                  A test to "C:\aaa______zzz\_____file_with____spaces__.txt"  verify the regex   "C:\aaa______zzz\_____file_with____spaces__.txt"    behavior
                  
                  A test to "_____file_with____spaces__.txt"  verify the regex   "_____file_with____spaces__.txt"    behavior
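A hedged Python sketch of this path-aware variant, for experimenting outside Notepad++. Two assumptions are made: `[A-Z]` stands in for `\u` (read here as "an uppercase letter", i.e. a drive letter before the colon), and a callback replaces the conditional replacement string. `underscore_paths` is a hypothetical name.

```python
import re

# ([^"#\r\n]*")(?:\x20*(?=\u:))?  |  \x20*(#)  |  (\x20)(?=.+#)
PATTERN = re.compile(
    r'([^"#\r\n]*")(?:\x20*(?=[A-Z]:))?'  # opening quote; eat spaces only before a drive letter
    r'|\x20*(#)'                          # trailing spaces plus the temporary # delimiter
    r'|(\x20)(?=.+#)'                     # any single space inside a zone
)

def underscore_paths(text):
    def pick(m):
        if m.group(1) is not None:
            return m.group(1)
        return '"' if m.group(2) is not None else "_"
    return PATTERN.sub(pick, text)

print(underscore_paths('A test to "     file with    spaces  .txt     #  verify'))
print(underscore_paths('"    C:\\aaa  zzz\\ f.txt   #'))
```

The first call keeps the five leading spaces as underscores (bare filename), while the second drops the spaces before `C:`.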
                  

                  Alan, do you find this replacement more acceptable ?


                  Now, regarding the regex challenge, I don’t think that multi-lines search, with the (?s) modifier, would be a problem !

                  But, of course, we would assume that :

                  • Any range with identical opening and ending delimiter ( mostly the ' and " characters ) would have been previously changed into an oriented range, like, for instance, ".........# and '.........# !

                  • The search would not simultaneously search for two or more ranges ( for instance, ranges ".........# and {..........} ) Too tricky ;-))

                  Let me get to the bottom of this ;-)) I just hope not to be long !

                  Best regards,

                  guy038

                  • Alan KilbornA
                    Alan Kilborn @guy038
                    last edited by

                    @guy038 said in HELP! Replace spaces in text ONLY between quotes?:

                    But, of course, we would assume that…

                    Yes, those two assumption points are fine!

                    One thing I forgot to mention earlier is that perhaps text1, text2, delimiter1 and delimiter2 could all be multiple-characters in length (maybe not so obvious for the delimiters?)

                    • Alan KilbornA
                      Alan Kilborn @guy038
                      last edited by

                      @guy038 said in HELP! Replace spaces in text ONLY between quotes?:

                      Alan, do you find this replacement more acceptable ?

                      I didn’t really mean to “downgrade” the original solution.
                      It probably helped the OP.
                      I was just disappointed that it “went too far” so that the general problem solution (that we are now discussing) may have been obscured in the special-purpose result.
                      But…we’re on the right track now. :-)

                      • guy038G
                        guy038
                        last edited by guy038

                        Hi, @alan-kilborn and All,

                        Well, here it is ! I found a regex structure and some variants, which do the trick very nicely ;-))

                        Let’s define these 4 variables :

                        • Srch_Expr = Char|String|Regex

                        • Repl_Text = Char|String|Regex

                        • Opn_Del = Char

                        • End_Del = Char

                        Then the Search_Replacement_A, which matches any Srch_Expr, ONLY WHEN located between the Opn_Del and the End_Del delimiter(s), is :

                        • SEARCH Srch_Expr(?=[^Opn_Del]*?End_Del)

                        • REPLACE Repl_Text
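To make the structure of Search_Replacement_A concrete, here is a small Python sketch that assembles the regex from the four variables. It is an illustration only: `replace_in_zones` is a hypothetical helper, Python's `re` is used instead of the Notepad++ engine, and the delimiters are escaped with `re.escape` so that characters such as `^` or `]` stay literal inside the character class.

```python
import re

def replace_in_zones(text, srch_expr, repl_text, opn_del, end_del):
    """Replace srch_expr only where it is followed by end_del with no opn_del in between."""
    if len(opn_del) != 1 or len(end_del) != 1 or opn_del == end_del:
        raise ValueError("delimiters must be two different single characters")
    # Srch_Expr(?=[^Opn_Del]*?End_Del)
    lookahead = f"(?=[^{re.escape(opn_del)}]*?{re.escape(end_del)})"
    return re.sub(f"(?:{srch_expr}){lookahead}", repl_text, text)

text = "a b &c d\ne f# g h"
print(replace_in_zones(text, r"\x20", "_", "&", "#"))
```

In this sample the zone spans two lines, and only the two spaces between `&` and `#` are replaced.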


                        Rules and Remarks :

                        • The Opn_Del and End_Del delimiters represent, both, a single char, which may be :

                          • Characters already present in the current file

                          • Characters presently absent in current file and manually added by the user ( Verify with the Count feature they are new chars ! )


                        • In case of added delimiters, preferably use the Search_Replacement_B, below, which deletes these temporary delimiters at the same time :

                          • SEARCH Opn_Del|(Srch_Expr)(?=[^Opn_Del]*?End_Del)|End_Del

                          • REPLACE ?1Repl_Text


                        • The Opn_Del and End_Del delimiters must not be identical. If they are ( usually, cases "...." or '....' ), perform the following regex S/R, first, where Delim stands for the identical delimiter :

                          • SEARCH Delim(.+?)Delim

                          • REPLACE Opn_Del\1End_Del

                          • Then, execute the Search_Replacement_C, which also restores the initial identical delimiters Delim :

                            • SEARCH Opn_Del|(Srch_Expr)(?=[^Opn_Del]*?End_Del)|End_Del

                            • REPLACE ?1Repl_Text:Delim


                        • In case that the last zone Opn_Del......End_Del ends at the very end of current file, the End_Del delimiter, of the last zone, can be omitted and you’ll use the Search_Replacement_D, below :

                          • SEARCH Srch_Expr(?=[^Opn_Del]*?(End_Del|\z))

                          • REPLACE Repl_Text


                        • The different zones, between delimiters, must be complete, each with its own pair of matching delimiters ( Opn_Del......End_Del )

                        • The different zones, between delimiters, may be juxtaposed but NOT nested

                        • A zone can lie in a single line or split over several lines

                        • Srch_Expr must not match the Opn_Del delimiter, or part of it
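The B and D variants from the rules above can be sketched in Python as follows. This is only an approximation under stated assumptions: the function names are hypothetical, a callback emulates the `?1Repl_Text` conditional replacement, `\Z` (Python's end-of-string anchor) stands in for `\z`, and `Repl_Text` is treated as a literal string in variant B.

```python
import re

def replace_and_strip(text, srch_expr, repl_text, opn_del="&", end_del="#"):
    """Variant B: replace inside zones and delete the temporary delimiters."""
    o, e = re.escape(opn_del), re.escape(end_del)
    pattern = f"{o}|(?P<hit>{srch_expr})(?=[^{o}]*?{e})|{e}"
    # A hit gets the replacement text; a bare delimiter is replaced with nothing.
    return re.sub(pattern,
                  lambda m: repl_text if m.group("hit") is not None else "",
                  text)

def replace_open_ended(text, srch_expr, repl_text, opn_del="&", end_del="#"):
    """Variant D: the last zone may run up to the very end of the text."""
    o, e = re.escape(opn_del), re.escape(end_del)
    return re.sub(f"(?:{srch_expr})(?=[^{o}]*?(?:{e}|\\Z))", repl_text, text)

print(replace_and_strip("x y &a b# z w", r"\x20", "_"))
print(replace_open_ended("x y &a b# z &c d", r"\x20", "_"))
```

One consequence of the D form, as written, is that any text following the last closing delimiter is also treated as part of a zone, so it only fits texts whose last zone really runs to the end.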


                        As an example, let’s use the license.txt file and define :

                        • Opn_Del &

                        • End_Del #

                        • Srch_Expr \w+

                        I previously verified that neither of these two symbols exists in the license.txt file

                        • Place one or several zones &......#, anywhere in this file. The zones may be split over several lines, with the Opn_Del & in one line and the End_Del # in a later line

                        So, the appropriate search regex_A is :

                        SEARCH \w+(?=[^&]*?#)

                        And imagine that we want to surround any word with the strings [-- and --], in zones &............#, ONLY. Hence, the replacement part of Search_Replacement_A is :

                        REPLACE [--$0--]
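A quick way to check this word-wrapping example without the license.txt file is the Python sketch below. The sample text is made up for illustration, and `\g<0>` is the Python spelling of the whole-match reference written `$0` above.

```python
import re

# Made-up sample: one zone that starts on the first line and ends on the second.
text = "alpha beta &gamma\ndelta# epsilon"

# SEARCH \w+(?=[^&]*?#)   REPLACE [--$0--]
result = re.sub(r"\w+(?=[^&]*?#)", r"[--\g<0>--]", text)
print(result)
```

Only `gamma` and `delta`, the two words located between `&` and `#`, are surrounded.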


                        With the same delimiters, let’s suppose that we want, this time, to find any single standard char, with the regex (?-s). and replace it with an underscore _ char

                        However, according to the rules above, this regex should not match the Opn_Del delimiter. This can be achieved with the regex (?-s)(?!&)., giving the following Search_Replacement_A :

                        SEARCH (?-s)(?!&).(?=[^&]*?#)

                        REPLACE _
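The need for the `(?!&)` guard is easy to see on a tiny made-up sample. In this Python sketch (an illustration only; `(?-s)` is omitted because the Python dot does not match line breaks by default), the unguarded search also replaces the opening delimiter itself:

```python
import re

text = "ab &cd# ef"

# Without the guard, the dot also matches the & that opens the zone.
unguarded = re.sub(r".(?=[^&]*?#)", "_", text)

# With the guard, the opening delimiter is left alone.
guarded = re.sub(r"(?!&).(?=[^&]*?#)", "_", text)

print(unguarded)
print(guarded)
```

In both cases the closing `#` survives, since no further `#` follows it.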


                        Now, Alan, I also found a regex which works when the delimiters are simple words, as for instance STT and END, and not single characters :

                        • Regex_E = Srch_Expr(?=(?s-i:(?:(?!Opn_Del).)*?End_Del))

                        Unfortunately, due to the negative look-ahead syntax (?!Opn_Del). , this regex fails, even with the light license.txt file :-(( You know : the final wrong “all contents” match ! But it works nicely when only a few lines are involved !

                        The safe solution, in that case, is to replace word delimiters with char delimiters, first ( for instance STT -> & and END -> # )
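Both routes can be compared on a small sample with the Python sketch below. It is only an illustration: the function names are hypothetical, the search expression is fixed to a single space, and the Notepad++ failure described above is specific to that engine and large inputs, so it is not reproduced here.

```python
import re

def via_regex_e(text):
    # Regex_E with Srch_Expr = \x20, Opn_Del = STT and End_Del = END.
    return re.sub(r"\x20(?=(?s:(?:(?!STT).)*?END))", "_", text)

def via_char_delimiters(text, opn_chr="&", end_chr="#"):
    # Safe route: swap the word delimiters for unused single chars first.
    if opn_chr in text or end_chr in text:
        raise ValueError("temporary delimiter already present in the text")
    work = text.replace("STT", opn_chr).replace("END", end_chr)
    o, e = re.escape(opn_chr), re.escape(end_chr)
    work = re.sub(f"\\x20(?=[^{o}]*?{e})", "_", work)
    # Put the word delimiters back once the replacement is done.
    return work.replace(opn_chr, "STT").replace(end_chr, "END")

sample = "a b STT c d\ne END f g"
print(via_regex_e(sample))
print(via_char_delimiters(sample))
```

On this short sample the two functions give the same text, with underscores only between STT and END.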


                        Finally, a possible, and much shorter, solution for @mugsys-rapSheet, using the generic Search_Replacement_C, could be :

                        SEARCH &|(\x20)(?=[^&]*?#)|#

                        REPLACE ?1_:"

                        Of course, assuming this text :

                        &    C:\xxx\name with spaces.txt    #
                        

                        Then, the leading and trailing spaces, around the absolute paths, would not be deleted anymore but just replaced with _ characters. Thus, the final text would be :

                        "____C:\xxx\name_with_spaces.txt____"
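Here is a hedged Python sketch of this final, shorter solution, including the preliminary step that turns `"...."` into `&....#`. It assumes that neither `&` nor `#` occurs in the text, and uses a callback in place of the `?1_:"` conditional replacement; `quote_zones_to_underscores` is a hypothetical name.

```python
import re

def quote_zones_to_underscores(text):
    if "&" in text or "#" in text:
        raise ValueError("the temporary delimiters & and # must not occur in the text")
    # Step 1: "...."  ->  &....#
    oriented = re.sub(r'"(.+?)"', r"&\1#", text)
    # Step 2: SEARCH &|(\x20)(?=[^&]*?#)|#   REPLACE ?1_:"
    return re.sub(r"&|(\x20)(?=[^&]*?#)|#",
                  lambda m: "_" if m.group(1) is not None else '"',
                  oriented)

print(quote_zones_to_underscores('say "a b" and "c  d" end'))
print(quote_zones_to_underscores('"    C:\\xxx\\name with spaces.txt    "'))
```

The second call reproduces the result shown above, with the leading and trailing spaces kept as underscores.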
                        

                        Best regards,

                        guy038

                        The Community of users of the Notepad++ text editor.