    find and replace help

      Joe Grant

      hi all this is my first post.

      i have a text file with something like this

      IF NAME==“C15_x_33.9”
      DEPTH=115
      ENDIF
      IF NAME==“C12_x_30”
      DEPTH=1
      12
      ENDIF

      and i need to do two things

      1. place the name in front of all the lines
      2. if the name has a “.” like 33.9, it needs to be replaced with an underscore: 33_9

      so the final format would look like this:

      IF NAME==“C15_x_33.9”
      C15_x_33_9_DEPTH=115
      ENDIF
      IF NAME==“C12_x_30”
      C12_x_30_DEPTH=1
      12
      ENDIF

      so i have thousands of these, and i'm just about to give up. i thought i would give this community a shot before i quit.

      i have been trying for hours and hours using regular expressions and i can't get it to work.
      any help would be greatly appreciated.

      thank you,
      Jg

        Claudia Frank

        Hello Joe Grant,

        I’m not particularly good at regex, but I hope I have a solution for you.
        From the given example, I would solve it like so:

        Find the C… part of the IF line, the line break, and the DEPTH… line. Then rewrite the IF line unchanged, and put the C… name, followed by an underscore, in front of the DEPTH… line.

        Find what: (C.*)(")\r\n(DEPTH.*)
        Replace with: \1\2\r\n\1_\3
        

        Next is to replace the dot with underscore

        Find what: (C.*)(\.)(.*DEPTH.*)
        Replace with: \1_\3
        

        So, as you can see, \1 refers to the first captured group, \2 to the second, and so on …
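For anyone who would rather script these two steps outside Notepad++, here is a minimal Python sketch of the same two replacements. It assumes straight double quotes and Windows (\r\n) line endings, as the patterns above do. The function names and the second DEPTH value are made up for illustration.

```python
import re

# Sample modelled on the question: straight quotes, Windows (\r\n) line endings.
# The second DEPTH value (98) is made up for illustration.
SAMPLE = (
    'IF NAME=="C15_x_33.9"\r\n'
    'DEPTH=115\r\n'
    'ENDIF\r\n'
    'IF NAME=="C12_x_30"\r\n'
    'DEPTH=98\r\n'
    'ENDIF\r\n'
)

def prefix_depth_lines(text):
    """Step 1: copy the C... name, plus an underscore, in front of the DEPTH line."""
    return re.sub(r'(C.*)(")\r\n(DEPTH.*)', r'\1\2\r\n\1_\3', text)

def dot_to_underscore(text):
    """Step 2: replace the dot in the prefixed DEPTH line with an underscore."""
    return re.sub(r'(C.*)(\.)(.*DEPTH.*)', r'\1_\3', text)

result = dot_to_underscore(prefix_depth_lines(SAMPLE))
print(result)
```

The order matters: step 2 only finds the dot once step 1 has put the name on the same line as DEPTH.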

        This regex works on the provided example, but real data
        might be affected differently if my assumptions aren’t valid.
        For instance, I assume that, in the IF… line, the C char only appears
        in the name part, never before or afterwards, and so on …

        Cheers
        Claudia

          guy038

          Hello Joe,

          Many modifications can be done with the help of regular expressions :-))

          First of all, some points about your file are unclear :

          • Can the names contain more than one dot as, for instance, C15.x.33.9 ? I suppose NOT, as the final number seems to be either an integer or a floating-point number, doesn’t it ?

          • Can the IF - ENDIF structure contain more than one line ?

          For instance :

          IF NAME==“C15_x_33.9”
          DEPTH=115
          LENGTH=70
          WIDTH=30
          ENDIF
          

          which should be, therefore, replaced with :

          IF NAME==“C15_x_33.9”
          C15_x_33_9_DEPTH=115
          C15_x_33_9_LENGTH=70
          C15_x_33_9_WIDTH=30
          ENDIF
          

          I just relied on your present example, with an IF - ENDIF structure which contains ONE line only !

          • When I copied your text, in a new tab, with CTRL-C / CTRL-V, the two standard double quotes ( " ) were changed into the LEFT DOUBLE QUOTATION MARK “ ( \x{201c} ) and the RIGHT DOUBLE QUOTATION MARK ” ( \x{201d} ). I will assume that your file actually uses the standard QUOTATION MARK, doesn’t it ?
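If the answers to these three questions turn out to be "yes", a line-by-line script may be simpler than a single regex. The Python sketch below is only an illustration under assumptions: it accepts either straight or curly quotes around the name, replaces every dot in the name, and prefixes every line between IF and ENDIF. The function name prefix_blocks is hypothetical.

```python
import re

# Accept either straight quotes or the curly pair U+201C / U+201D around the name.
HEADER = re.compile('^IF NAME==["\u201c](.+)["\u201d]$')

def prefix_blocks(text):
    """Prefix every line inside an IF ... ENDIF block with the block's name."""
    lines_out = []
    prefix = None  # None while we are outside a block
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # Every dot of the name becomes an underscore, however many there are.
            prefix = header.group(1).replace('.', '_') + '_'
            lines_out.append(line)
        elif line.strip() == 'ENDIF':
            prefix = None
            lines_out.append(line)
        elif prefix is not None:
            lines_out.append(prefix + line)
        else:
            lines_out.append(line)
    return '\n'.join(lines_out)

block = 'IF NAME=="C15_x_33.9"\nDEPTH=115\nLENGTH=70\nWIDTH=30\nENDIF'
print(prefix_blocks(block))
```

Note that this sketch rewrites the text with \n line endings and drops a trailing newline, so it would need adjusting for a file that must keep \r\n.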

          Well, with these hypotheses, and in addition to Claudia’s solution, I would suggest the following S/R :

          SEARCH (?-s)^IF NAME=="(?|(.+)(\.)(.*)|(.+))"\R\K

          REPLACE \1_(?2\3_)

          • Don’t forget to select the Regular expression search mode !

          • Click on the Replace All button ONLY ( Due to the \K syntax, you must NOT use the Replace button !! )


          At first sight, that regex seems difficult, but it’s a nice opportunity to explore :

          • The internal modifiers (?s)

          • The branch reset alternative pattern (?|...|...|...|...)

          • The line ending escape sequence \R

          • The kept back form \K

          • The conditional replacement pattern (?n.....), where n is a group number


          So :

          • The (?-s) form is a modifier which means that the dot matches standard characters only, never end-of-line characters. The opposite form (?s) means that the dot can match any character, even end-of-line characters.

          • If your condition IF NAME may occur in lowercase, just add the case-insensitive modifier i => your regex will, then, begin with (?i-s)

          • Note that these modifiers take priority over the corresponding options of the Replace dialog ( Match case and . matches newline options )

          • The part ^IF NAME==" just tries to match the literal string IF NAME==", at the beginning of a line

          • The part (?|(.+)(\.)(.*)|(.+))" is an alternative, that looks :

            • For any non null range of characters, followed with a literal dot, then followed with any range, possibly null, of characters
              OR
            • For any non null range of characters
          • In that piece of the regex :

            • The literal dot has to be escaped, as it’s a special character in regexes

            • The dot, and the parts before and after it, are each surrounded by parentheses, in order to make them separate groups, which can be re-used in the replacement regex

            • Due to the ?| syntax at the beginning of the alternative (....|....), the group numbering is reset, for each branch of the alternative :

              • If the first alternative is chosen ( case where the name contains a dot ), the part before the dot is group 1, the dot represents the group 2 and the part after the dot is the group 3

              • If the second alternative matched ( case where the name does NOT contain a dot ), the single group (.+) is considered, again, to be the group 1

            • Whatever alternative matches the name, it must match the ending quote character

          • The \R exactly represents the atomic group (?>\x0d\x0a?|[\x0a-\x0c\x85\x{2028}\x{2029}]), but, practically, we just have to remember that it matches any standard EOL : \r\n, for Windows files, \n, for Unix files or \r for old Mac files

          • Finally, in the search regex, due to the \K syntax, everything already matched ( that is to say, the complete line with its EOL characters ) is “forgotten”, so the final regex matched is, only, the null string, located between the EOL character \n and the first letter D, of the word DEPTH
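Several of the search-side points above can be tried out in Python, with caveats. Python's standard re module supports the (?s) and (?m) modifiers, but it has no \R, no branch reset (?|...) and no \K. So the sketch below spells the line endings out by hand and shows what happens to the group numbers without branch reset. It also differs from Notepad++ in that Python's dot and ^ treat only \n as a line break. The names EOL, NAME and name_parts are made up for illustration.

```python
import re

# (?s) lets the dot cross line endings; without it (the (?-s) behaviour,
# which is also Python's default) the dot stops at \n.
dot_default = re.search(r'A.B', 'A\nB')
dot_all = re.search(r'(?s)A.B', 'A\nB')

# Python's re has no \R, so the usual line endings are spelled out by hand.
EOL = r'(?:\r\n|\n|\r)'

# Python's re has no branch reset (?|...) either: with a plain (?:...|...)
# the groups keep counting, so the second alternative is group 4, not group 1.
NAME = re.compile(r'(?m)^IF NAME=="(?:(.+)(\.)(.*)|(.+))"' + EOL)

def name_parts(text):
    """Return (part before the last dot, part after it or None) for each IF line."""
    parts = []
    for m in NAME.finditer(text):
        if m.group(2) is not None:   # first alternative: the name contains a dot
            parts.append((m.group(1), m.group(3)))
        else:                        # second alternative: no dot in the name
            parts.append((m.group(4), None))
    return parts

# Mixed \r\n and \n endings, to exercise the hand-written EOL pattern.
SAMPLE = 'IF NAME=="C15_x_33.9"\r\nDEPTH=115\r\nENDIF\r\nIF NAME=="C12_x_30"\nDEPTH=98\nENDIF\n'
print(name_parts(SAMPLE))
```

Having to test group 2 and then pick between groups 1 and 4 is exactly the bookkeeping that the branch reset form removes.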

          This null string is, then, replaced with :

          • The group 1 ( part of the name before the dot OR the entire name ), followed by an underscore => \1_

          • If a dot has been found in the name ( if group 2 exists ), we must re-write the part of the name after the dot ( group 3 ), followed, again, by an underscore => (?2\3_). Note that the general form of a conditional replacement is (?n....:....), where n is a group number. For instance, (?4abc:xyz) means that the string abc is written if group 4 EXISTS, and that the string xyz is written if group 4 could NOT be defined
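The same single-pass replacement can be approximated in Python, again as a sketch under assumptions. Since Python's re has no \K, the whole IF line is captured and written back. Since it has no conditional replacement syntax either, a small function decides which prefix to build. The names HEADER, add_prefix and prefix_first_line, and the DEPTH value 98, are made up for illustration.

```python
import re

# Group 1: the whole IF line with its line ending (stands in for \K).
# Groups 2 and 3: name parts around the last dot. Group 4: a name without a dot.
HEADER = re.compile(r'(?m)^(IF NAME=="(?:(.+)\.(.*)|(.+))"(?:\r\n|\n|\r))')

def add_prefix(match):
    header, before, after, whole = match.group(1, 2, 3, 4)
    if before is not None:
        # Same effect as \1_(?2\3_) when the name contains a dot.
        prefix = f'{before}_{after}_'
    else:
        # Same effect as \1_ when the name has no dot.
        prefix = f'{whole}_'
    # Re-emit the untouched IF line, then the prefix for the following line.
    return header + prefix

def prefix_first_line(text):
    """Insert the name prefix at the start of the line that follows each IF line."""
    return HEADER.sub(add_prefix, text)

SAMPLE = 'IF NAME=="C15_x_33.9"\r\nDEPTH=115\r\nENDIF\r\nIF NAME=="C12_x_30"\r\nDEPTH=98\r\nENDIF\r\n'
print(prefix_first_line(SAMPLE))
```

Like the S/R above, this only prefixes the first line after each IF line.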

          Best Regards,

          guy038

          P.S. :

          You’ll find good documentation about the Boost C++ Regex library ( similar to PCRE, the Perl Compatible Regular Expressions ), used by Notepad++ since the 6.0 version, at the TWO addresses below :

          http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

          http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html

          • The FIRST link explains the syntax, of regular expressions, in the SEARCH part

          • The SECOND link explains the syntax, of regular expressions, in the REPLACEMENT part

            Claudia Frank @guy038

            Hi guy038,

            AGAIN, a nice one and a very good description too, even I understood it.
            But, there is always a but, did you notice that your regex seems to break
            the Replace function (I don’t know how to say it in other words)?
            What I mean is, if you use your regex and press the Find Next button,
            it selects the DEPTH… line, and if you press the Replace button nothing
            gets changed, whereas if you press the Replace All button, it is replaced.
            Do you think this is a bug, or is it because of the complex regex?

            Tested with npp6.8.7 and 6.8.8 on windows 7 x64.

            Cheers
            Claudia

              guy038

              Hi Claudia,

              No, It’s not related to the complexity of the regex ! It’s just that the step-by-step replace doesn’t work at all, as soon as the search regex contains, at least, one \K form :-(( Though I don’t know exactly why !?

              Consider the subject string below :

              abc
              abcdef
              abcdefghi
              abcdefghidefjkl
              

              With the simple S/R SEARCH abc\Kdef and REPLACE 123, if I click on the Replace All button, we get the right text :

              abc
              abc123
              abc123ghi
              abc123ghidefjkl
              

              Note that the second string def has not been changed, because it wasn’t just after an abc string. That’s correct !

              On the contrary, if I click several times on the Replace button, nothing changes !!!

              Cheers,

              guy038

              P.S.:

              I’ve just realized that the bug exists too, if we use a look-behind, instead of the \K form !

              So, the S/R SEARCH (?<=abc)def and REPLACE 123 does the job, if you click on the Replace All button, ONLY !

              Remember that, due to the look-behind feature, this regex tries to match a def string, only if preceded by the string abc
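The abc / def example can be reproduced in Python as a quick check of what both forms are meant to do. This is only a sketch: Python's re supports the look-behind form directly, but not \K, so the second call mimics \K by capturing abc and writing it back.

```python
import re

text = 'abc\nabcdef\nabcdefghi\nabcdefghidefjkl\n'

# Look-behind: match "def" only when it is immediately preceded by "abc".
with_lookbehind = re.sub(r'(?<=abc)def', '123', text)

# Capture-and-reinsert: stands in for abc\Kdef, which Python's re lacks.
# \g<1> is used because \1 followed by the digits 123 would be ambiguous.
with_capture = re.sub(r'(abc)def', r'\g<1>123', text)

print(with_lookbehind)
```

Both calls leave the second def of the last line alone, because it follows ghi and not abc.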

              The Community of users of the Notepad++ text editor.