    Help replacing spaces between wildcards between quotes

    • LookMAAAnohandsL
      LookMAAAnohands
      last edited by

      Fellow Notepad++ Users,

      Could you please help me with the following search-and-replace problem I am having?

      I have a large dataset of names that I’m trying to organize, and the middle step is the most problematic. I’m trying to replace every space between the (wildcard) words inside each pair of quotes with something else, so that I can later replace them again. The trouble is that the quoted strings contain a large variety of combinations of spaces, dashes, and numbers of words.

      Here is the data I currently have (“before” data):

      "text text"
      "text text" "text-text" "text text" "text-text"
       "text text-text"  "text-text text" 
      "text text-text text-text"
      "text-text text text-text"
      "text text-text text-text text"
      ...
      etc
      

      Here is how I would like that data to look (“after” data):

      "text=text"
      "text=text" "text-text" "text=text" "text-text"
       "text=text-text"  "text-text text" 
      "text=text-text=text-text"
      "text-text=text=text-text"
      "text=text-text=text-text text"
      

      To accomplish this, I have tried using the following Find/Replace expressions and settings

      Find What = `"(\w+)\h(\w+)"`
      Replace With = `"(\1)=(\2)"`
      Search Mode = REGULAR EXPRESSION
      Dot Matches Newline = CHECKED
      

      This does work, except I then have to search and replace every possible combinations to get them all.

      Find What = "(\w+)-(\w+)\h(\w+)"
      Replace With = "(\1)-(\2)=(\3)"
      Find What = "(\w+)\h(\w+)-(\w+)"
      Replace With = "(\1)=(\2)-(\3)"
      Find What = "(\w+)\h(\w+)\h(\w+)-(\w+)"
      Replace With = "(\1)=(\2)=(\3)-(\4)"
      Find What = "(\w+)\h(\w+)-(\w+)-(\w+)"
      Replace With = "(\1)=(\2)-(\3)-(\4)"
      Find What = "(\w+)-(\w+)\h(\w+)-(\w+)"
      Replace With = "(\1)-(\2)=(\3)-(\4)"
      ...
      etc
      
      

      Unfortunately, I couldn’t figure out how to express the logic as a single regular expression: search for and replace any number of spaces between any number of words, but only when they are between quotes, and without replacing the dashes…

      Any help will be immensely helpful.

      Cheers.

      • PeterJonesP
        PeterJones @LookMAAAnohands
        last edited by

        @lookmaaanohands

        Thanks for showing what you already tried, and showing before/after data. That’s helpful.

        FIND="([\w=-]+)\h([\w\h-]+)"
        Same replacement as you used (or, as I would phrase it, "${1}=${2}")

        You may have to do Replace All multiple times to get them all, but you don’t have to change the expression.

        In the first “term”, I search for word characters (alpha, numeric, underscore), the equals-sign (to allow the multiple joins you showed in the later examples), or the minus-sign (to allow text-text). In the second term, I allow only word characters, hyphens, or horizontal spaces (that way, it handles the 3-or-more-word quoted terms correctly).
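For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch of the "run Replace All until nothing changes" loop. It is only an approximation of the expression above: Python's `re` module has no `\h`, so `[ \t]` stands in for it, and the function name `join_spaces` is made up for illustration.

```python
import re

# The FIND expression from above, with [ \t] standing in for \h
# (an assumption: Python's re module does not support \h).
PATTERN = re.compile(r'"([\w=-]+)[ \t]([\w \t-]+)"')


def join_spaces(text: str) -> str:
    """Repeat the replacement until the text stops changing.

    Each pass plays the role of one Replace All in Notepad++.
    Every replacement removes one space, so the loop must terminate.
    """
    while True:
        new_text = PATTERN.sub(r'"\1=\2"', text)
        if new_text == text:
            return text
        text = new_text


print(join_spaces('"text text-text text-text text"'))
# prints "text=text-text=text-text=text"
```

Each pass joins only the first space in a quoted term, which is why more than one pass is needed for the terms with three or more words.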

        This also assumes that your last line in the final result

        "text=text-text=text-text text"
        

        was actually intended to be

        "text=text-text=text-text=text"
        

        Because you didn’t give a rule that would have left any spaces in your problem description (which sounded like you wanted all spaces between the quotes to become equals).

        You could also use the generic regex for “Replacing in a specific zone of text”, which is linked in our generic regex FAQ and could accomplish the same thing in a single Replace All. The beginning and ending expressions (BSR and ESR) would just be simple quote marks, the find-expression (FR) would be a space or the \h equivalent, and the replacement (RR) would be the equals-sign. (I would have suggested that first, but since you’d already put in the work for your custom expression, I thought I’d give you the tailored version.)
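If you would rather script the "replace only inside a zone" idea, here is a rough Python sketch. To be clear, this is not the generic regex from the FAQ (Python's `re` module has neither `\G` nor `\K`); it is a swapped-in technique that matches each quoted span and rewrites the spaces inside it with a callback. The name `replace_in_quotes` is hypothetical, `[ \t]` stands in for `\h`, and the sketch assumes the quotes are balanced on each line.

```python
import re


def replace_in_quotes(text: str, replacement: str = "=") -> str:
    """Replace spaces/tabs with `replacement`, but only inside "..." pairs."""

    def fix_zone(match: re.Match) -> str:
        # match.group(0) is one whole quoted span, quotes included.
        return re.sub(r"[ \t]", replacement, match.group(0))

    # "[^"\n]*" pairs quotes left to right and never crosses a line break
    # (assumes balanced quotes on every line).
    return re.sub(r'"[^"\n]*"', fix_zone, text)
```

Because the outer pattern consumes both the opening and the closing quote of each pair, the space between one closing quote and the next opening quote is never treated as being inside a zone.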

        • LookMAAAnohandsL
          LookMAAAnohands @PeterJones
          last edited by

          @peterjones you know… I’ve never used Notepad++'s regex before, and I felt like I just dived into the deep end totally by accident. I spent a good few days trying to learn coding from scratch, and still couldn’t figure it out. Thanks a bunch man, this works, and helps me a lot with my work. Cheers.
          (I will keep at learning it, so hopefully I can help someone else next time)

          • Alan KilbornA
            Alan Kilborn
            last edited by

            I’m wondering if the typical replace-but-only-between-delimiters technique can be adapted for this case, i.e., when the beginning and ending delimiter is the same?

            • Alan KilbornA
              Alan Kilborn @Alan Kilborn
              last edited by

              @alan-kilborn said in Help replacing spaces between wildcards between quotes:

              I’m wondering if the typical replace-but-only-between-delimiters technique can be adapted for this case

              A simple trial of the referenced technique:

              Find: (?-si:"|(?!\A)\G)(?s-i:(?!").)*?\K(?-si:\h)
              Repl: =

              Yields this result on the OP’s original text:

              "text=text"
              "text=text"="text-text"="text=text"="text-text"
              ="text=text-text"=="text-text=text"=
              "text=text-text=text-text"
              "text-text=text=text-text"
              "text=text-text=text-text=text"
              

              Which is not what was wanted; compare of desired (left side) and actual result:

              [Image: side-by-side comparison of the desired and actual results]

              • PeterJonesP
                PeterJones @Alan Kilborn
                last edited by

                @alan-kilborn ,

                Interesting. Yeah, the generic regex doesn’t currently work when the BSR and ESR are the same string. Something would need to be done to the expression to make sure it’s always between balanced pairs, rather than between any two instances of the single wrapping character.

                Changing to single-line matching for the ESR (…(?-si:(?!").)*?…) will fix the equals-sign at the beginning of line 3… but the other differences on lines 2-3 will not be fixed. (The difference in line 6 is, I assume, a mistake on the part of the OP, as I said in my earlier reply, because the rules defined say to replace all the spaces between the pairs of quotes, and line 6 in the example missed one.)

                For this specific instance, I would be tempted to do a three-step regex sequence: on the first, change pairs of "..." to “…”, then use “ as BSR and ” as ESR, then convert “…” back to "...". That wouldn’t work in the general case, because of course the BSR==ESR might not always be ASCII quotes. But for this specific instance, my tailored regex is conceptually easier for me to understand, so that’s what I’d actually use.
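The three-step sequence can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name `replace_spaces_three_step` is made up, `[ \t]` stands in for `\h`, the text is assumed to contain no curly quotes of its own, and the quotes are assumed to be balanced on each line. Step 2 uses a lookahead rather than the BSR/ESR generic regex, since Python's `re` module has neither `\G` nor `\K`.

```python
import re


def replace_spaces_three_step(text: str) -> str:
    # Step 1: turn each balanced "..." pair into “...” so that the opening
    # and closing delimiters are no longer the same character.
    text = re.sub(r'"([^"\n]*)"', r"“\1”", text)

    # Step 2: replace a space/tab only when a closing ” appears later on the
    # line before any other curly quote, i.e. the space is inside a pair.
    text = re.sub(r"[ \t](?=[^“”\n]*”)", "=", text)

    # Step 3: convert the curly quotes back to plain ASCII quotes.
    return text.replace("“", '"').replace("”", '"')
```

Step 2 only works because the delimiters differ: with plain ASCII quotes on both sides, a lookahead cannot tell whether the next quote closes the current pair or opens a new one.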

                I think @guy038’s homework should be to come up with the equivalent generic syntax for when BSR==ESR. (I cannot easily think of a way to “consume” the ESR after all the FR have been replaced between the previous BSR and that ESR). Once it’s been vetted, he can add it as a follow on to the official page ;-)

                • Alan KilbornA
                  Alan Kilborn @PeterJones
                  last edited by

                  @peterjones said in Help replacing spaces between wildcards between quotes:

                  I think @guy038’s homework should be to come up with the equivalent generic syntax for when BSR==ESR. (I cannot easily think of a way to “consume” the ESR after all the FR have been replaced between the previous BSR and that ESR). Once it’s been vetted, he can add it as a follow on to the official page

                  Well, yes, of course that is the ideal thing to happen next. :-)

                  • Alan KilbornA
                    Alan Kilborn
                    last edited by

                    But perhaps it is not reasonable to try to wedge a problem like this into the BSR/ESR solution. It has come up before; I just found THIS, and that solution also uses what Peter suggests (change the identical delimiters so that the opening and closing ones differ).
