    How can I change all the words in a given structure?

    • darkenb @guy038

      @guy038
      My sentence layout is B1, but I want to switch to A2 and then back to B1. Although I wrote the A2 layout, the system puts it in A1 order, so it lifts the lines. I don’t understand it.

      • darkenb @Alan Kilborn

        @Alan-Kilborn I understand you, but I am seriously a newbie. (? -i: bs (_ | (?! \ A) \ G) (? s: (?! _ bs)).) *? \ K (? - i :) this may be simple for you, but I’m seriously confused. I couldn’t delete the underscores and replace them with space characters without breaking the sentence. Please just write the regexes and let me apply them.

        • Alan Kilborn @darkenb

          @darkenb said in How can I change all the words in a given structure?:

          this may be simple for you

          Actually, it is NOT simple for me, meaning that I couldn’t retype it from nothing and have a chance at getting it correct. But that’s the benefit of a plug and play recipe.

          • guy038

            Hi, @darkenb,

            OK, I’ve worked out all the regexes needed to cover both changes: from the B1 style to the A2 style and then from A2 back to B1!

            Just allow me about an hour to write an informative reply (for you!)

            BR

            guy038

            • darkenb @guy038

              @guy038

              Thank you. I’m waiting …

              • Alan Kilborn @guy038

                @guy038 said in How can I change all the words in a given structure?:

                Just allow me about an hour to write an informative reply (for you!)

                @darkenb

                Thank you. I’m waiting …

                LOL, all the information was there, a day and a half ago.
                What problem is left to be solved?

                • guy038

                  Hello, @darkenb, @alan-kilborn, @peterjones, @ekopalypse and All,

                  @darkenb, before speaking about your problem, I would like to give you some basic information about generic regexes !

                  Let’s imagine that you want to change a certain class of characters, surrounded by double quotes, into the same characters surrounded by other symbols.

                  I could write this generic regex S/R:

                  SEARCH "FR"

                  REPLACE SS$0ES

                  So,

                  • If you have a lot of digits between double quotes, the Find Regex FR is \d+. If you want to surround them with square brackets, the Start Separator SS is the [ symbol and the End Separator ES is the ] symbol. So, you would use the regex S/R below:

                    • SEARCH "\d+"

                    • REPLACE [$0]

                  • Now, if you have a lot of uppercase letters between double quotes, the Find Regex FR is [A-Z]+. If you want to surround them with braces, themselves surrounded by the -- string, the Start Separator SS is the --{ string and the End Separator ES is the }-- string. You would use the regex S/R below:

                    • SEARCH "[A-Z]+"

                    • REPLACE --{$0}--

                  • Finally, if you have a lot of word characters between double quotes, the Find Regex FR is \w+. If you want to surround them with one space char, itself surrounded by single quotes, the Start Separator SS is the '\x20 string and the End Separator ES is the \x20' string. You would use the regex S/R below:

                    • SEARCH "\w+"

                    • REPLACE '\x20$0\x20'

                  Note that the $0 syntax, in replacement, represents the complete search match (here, the double quotes included) and \x20 represents a single space char.

                  So, you can see that, whatever the real example chosen, this generic regex remains valid and means:

                  After the replacement, any double-quoted range of characters is rewritten unchanged, preceded by the SS separator and followed by the ES separator.

                  Of course, this example is very basic and may be wrong in some particular cases, but it gives you a general idea! The goal is to replace the generic names, such as FR, SS and ES, with their true regex values, according to your own needs and what you want to achieve ;-))
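
For readers who would like to try this generic S/R outside Notepad++, here is a minimal Python sketch. The helper name `wrap_matches` is hypothetical, and Python's `m.group(0)` plays the role of `$0`, so the double quotes stay part of the rewritten text.

```python
import re

def wrap_matches(text, find_regex, start_sep, end_sep):
    # Build the search part: the FR regex between two double quotes.
    pattern = '"(?:' + find_regex + ')"'
    # m.group(0) is the whole match (the equivalent of $0), quotes included.
    # Using a function keeps the separators literal, whatever they contain.
    return re.sub(pattern, lambda m: start_sep + m.group(0) + end_sep, text)

print(wrap_matches('id "123" and "45"', r'\d+', '[', ']'))
print(wrap_matches('code "ABC"', r'[A-Z]+', '--{', '}--'))
print(wrap_matches('say "hello"', r'\w+', "' ", " '"))
```

The three calls mirror the three examples above: only the FR, SS and ES values change, while the shape of the search and replacement stays the same.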


                  Let’s go back to your problem ! For all the regexes, provided below, the process is :

                  • Open or switch to your file, in N++

                  • Open the Replace dialog ( Ctrl + H )

                    • Fill in the Find what: and Replace with: fields with the appropriate regexes

                    • Untick all the option boxes first

                    • Tick the Wrap around option ( IMPORTANT : this ensures that current file is scanned from its very beginning to its very end, whatever the current position of the caret )

                    • Select the Regular expression search mode

                    • Click on the Replace All button ( Do not use the Replace button, due to the possible \K syntax in regexes )

                  • In addition, note that square brackets are special regex characters and need to be escaped when you want to search for them as literals. Unfortunately, this escape syntax is not properly displayed on our NodeBB forum. So I’m going to use the usual \x## syntax, where ## represents the hexadecimal code of a character. So, in regexes, I will refer to the [ as the \x5b char and to the ] as the \x5d char!
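
As a side illustration, Python's `re` module accepts the same `\x##` notation, so the equivalence between the escaped and the hexadecimal spellings can be checked quickly. This is only a sketch in a different regex engine than the Boost one used by Notepad++; the sample text is made up.

```python
import re

text = "keep [this part] intact"

# Three spellings of the same pattern: a literal "[" followed by word characters.
escaped = re.findall(r"\[\w+", text)              # backslash escape
hex_form = re.findall(r"\x5b\w+", text)           # hexadecimal code of "["
auto = re.findall(re.escape("[") + r"\w+", text)  # escape built by the library

print(escaped, hex_form, auto)
```

All three searches find the same text, which is why the `\x5b` and `\x5d` forms can safely stand in for the escaped brackets.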

                  First, I will provide the method and the different regexes needed. Then, I’ll give you some explanations of them. However, I strongly advise you to study the basic regex documentation from here ;-))


                  • A) This first regex S/R will add an underscore right after any [ character and right before any ] character

                    • SEARCH (\x5b)|\x5d

                    • REPLACE ?1\x5b_:_\x5d

                  • B) Then, this second regex S/R will replace any space char, within square brackets only, with an underscore char:

                    • SEARCH (?-s)(?:\x5b_|(?!\A)\G)(?:(?!_\x5d).)*?\K\x20

                    • REPLACE _

                  • Now, just translate all your text with Google Translate

                  ... ... ....
                  ... ... ....
                  ... ... ....
                  
                  • C) Once this translation task is over, this third regex S/R will remove the underscore char located after the [ character and before the ] character

                    • SEARCH (\x5b)_|_\x5d

                    • REPLACE ?1\x5b:\x5d

                  • D) Finally, this fourth regex S/R, below, will change any underscore character, within square brackets only, back into a space char:

                    • SEARCH (?-s)(?:\x5b|(?!\A)\G)(?:(?!\x5d).)*?\K_

                    • REPLACE \x20

                  Et voilà !
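
The four steps can also be rehearsed in Python before touching a real file. This is a rough sketch, not the Notepad++ method itself: Python's `re` module has neither `\K` nor `\G`, so steps B and D are emulated by matching each whole bracketed zone and editing it in a function. The names `protect` and `restore` and the sample sentence are hypothetical, and the sketch assumes every zone is closed on the same line.

```python
import re

def protect(text):
    # A) "[" -> "[_" and "]" -> "_]"; group 1 is set only when "[" matched.
    text = re.sub(r"(\[)|\]", lambda m: "[_" if m.group(1) else "_]", text)
    # B) Inside each "[_ ... _]" zone, turn spaces into underscores.
    return re.sub(r"\[_.*?_\]", lambda m: m.group(0).replace(" ", "_"), text)

def restore(text):
    # C) "[_" -> "[" and "_]" -> "]".
    text = re.sub(r"(\[)_|_\]", lambda m: "[" if m.group(1) else "]", text)
    # D) Inside each "[ ... ]" zone, turn underscores back into spaces.
    return re.sub(r"\[.*?\]", lambda m: m.group(0).replace("_", " "), text)

original = "Press [Save as] then [OK] to confirm my_file"
protected = protect(original)
print(protected)           # text to hand to the translator
print(restore(protected))  # back to the original layout
```

Note that, as with the regexes above, an underscore that was already inside square brackets before step A would come back as a space after step D, while underscores outside the brackets are left alone.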


                  Notes :

                  • Regarding the S/R A and C :

                    • The search part is rather obvious and searches two different expressions, separated with the alternation symbol |

                    • Note that the \x5b character is surrounded with parentheses and so, defines a group 1 which is re-used in replacement

                    • The replacement uses the conditional syntax (?1True:False), written here without its outer parentheses, which means:

                      • If group 1 exists (so, when the first alternative, containing \x5b, matched), write the True part

                      • If group 1 does not exist (so, when the second alternative, containing \x5d, matched), write the False part

                  • Regarding the S/R B and D :

                    • They are, both, built up from the generic S/R regex :

                      • SEARCH (?s-i:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?s-i:FR)

                      • REPLACE RR

                    • Note that the different syntaxes (?s-i:•••••) are non-capturing groups (i.e. groups which do not store the contents between parentheses), which carry the leading modifiers s and -i

                      • The s modifier means that the dot regex char ( . ) matches any char, even EOL characters

                      • The -i modifier means that the search is case-sensitive

                    • However, as we do not search for any letter and as I suppose that each of your zones [•••••••] stands on a single line, this generic regex can be simplified as below, with a leading -s modifier, meaning that a . matches a single standard character only, never an EOL character

                      • SEARCH (?-s)(?:BSR|(?!\A)\G)(?:(?!ESR).)*?\KFR

                      • REPLACE RR

                    • Overall, the generic S/R above replaces any match of the FR regex with the RR replacement, but only in the regions between a BSR match and the next ESR match, both excluded!

                    • So, for the regex S/R B :

                      • BSR, Beginning Search-region Regex, is the regex \x5b_

                      • ESR, Ending Search-region Regex, is the regex _\x5d

                      • FR, Find Regex, is the regex \x20

                      • RR, Replacement Regex is the regex _

                    • And, if we replace the generic names with the real regex values, we get exactly the search regex (?-s)(?:\x5b_|(?!\A)\G)(?:(?!_\x5d).)*?\K\x20 and the replacement regex _

                    • Now, regarding the regex S/R D, note that we have already removed the underscores next to the square brackets, with the regex S/R C. So, this time:

                      • BSR, Beginning Search-region Regex, is the regex \x5b

                      • ESR, Ending Search-region Regex, is the regex \x5d

                      • FR, Find Regex, is the regex _

                      • RR, Replacement Regex is the regex \x20

                    • And, again, if we replace the generic names with the real regex values, we get exactly the search regex (?-s)(?:\x5b|(?!\A)\G)(?:(?!\x5d).)*?\K_ and the replacement regex \x20!
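
To make the roles of BSR, ESR, FR and RR more concrete, here is a minimal Python sketch of the same idea. It is an emulation under stated assumptions, not a translation of the regex: Python's `re` has no `\K` or `\G`, so the function first isolates each region and then replaces inside it. The name `replace_in_regions` is hypothetical, RR is inserted as literal text, and a region only counts if its ESR is found on the same line.

```python
import re

def replace_in_regions(text, bsr, esr, fr, rr):
    # BSR opens a region; the lookahead on ESR closes it without consuming it.
    # "." does not match a line break here, like the (?-s) modifier above.
    region = re.compile("(?P<begin>" + bsr + ")(?P<body>.*?)(?=" + esr + ")")

    def fix_region(match):
        # Replace FR by RR inside the region body only (RR is literal text).
        body = re.sub(fr, lambda _: rr, match.group("body"))
        return match.group("begin") + body

    return region.sub(fix_region, text)

# Regex S/R B: spaces -> underscores between "[_" and "_]"
print(replace_in_regions("a b [_c d e_] f g [_h i_]", r"\x5b_", r"_\x5d", r"\x20", "_"))

# Regex S/R D: underscores -> spaces between "[" and "]"
print(replace_in_regions("x_y [p_q r_s] z_w", r"\x5b", r"\x5d", "_", " "))
```

Changing only the four arguments is enough to switch from the S/R B to the S/R D, which is the whole point of the generic form.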

                  Best Regards,

                  guy038

                  • darkenb @guy038

                    @guy038
                    I swear you are the king. :D Thanks to you, my problem is solved, and I can handle it myself to a certain extent when something comes up.

                    Thank you very, very much…

                    • guy038

                      Hello, @darkenb, @alan-kilborn, @peterjones, @ekopalypse and All,

                      @darkenb :

                      • Regarding the B and D regex S/R, note that I didn’t fully explain how they work. I do think that you need to learn basic regex concepts first, before trying to understand these complicated syntaxes, which would just confuse you ;-)

                      To All,

                      • Regarding the A and C regex S/R, they can be simplified, and we do not need conditional replacements! Indeed:

                        • Regex S/R A :

                          • SEARCH (\x5b)|(\x5d)

                          • REPLACE \1_\2

                        • Regex S/R C :

                          • SEARCH (\x5b)_|_(\x5d)

                          • REPLACE \1\2

                      • As you can see, the opening square bracket \x5b is stored as group 1 and the closing square bracket \x5d is stored as group 2. And, as the two alternatives are mutually exclusive, we can write both \1 and \2 in the replacement zone (or $1 and $2): when one group is defined, the other one is undefined and equivalent to an empty string ;-))
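
The same simplification can be checked in Python, which also expands a group that did not take part in the match to an empty string (this behaviour assumes Python 3.5 or later; older versions raise an error instead). The sample text is made up.

```python
import re

text = "see [Save as] and [OK]"

# Regex S/R A: whichever group did not take part in the match expands to "".
step_a = re.sub(r"(\x5b)|(\x5d)", r"\1_\2", text)
print(step_a)

# Regex S/R C: the reverse operation, removing the two marker underscores.
step_c = re.sub(r"(\x5b)_|_(\x5d)", r"\1\2", step_a)
print(step_c)
```

Running A and then C gives back the starting text, which is a quick way to confirm that the two replacements really are inverses of each other.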

                      Best Regards,

                      guy038
