    Changing text between square brackets

    • Patrick Cain

      This question likely has a simple answer, but I know very little about Notepad. I have a number of large documents in .txt. I need to replace text (specifically names FIRST, LAST) appearing between square brackets, but no other text appearing between square brackets. As indicated, the names show as [FIRST, LAST] and sometimes [FIRST, LAST MI.] The text I wish to leave untouched appears uniformly as [WORD]. Thanks in advance for assisting a neophyte.

      • Scott Sumner @Patrick Cain

        @Patrick-Cain

        You don’t say what your replacement looks like, but as far as finding the matches you say you’re looking for, search this way:

        Find what: \[\w+, \w+( \w\.)?\]
        Search mode: Regular expression

        The basic interpretation of this is:

        • Find a [
        • Followed by one or more “word” characters (which are A-Z, a-z, 0-9, and _) – you don’t need 0-9 and _ but likely no harm in it
        • Followed by a comma and space
        • Followed by one or more “word” chars
        • Followed by an optional space+word-char+period
        • Followed by a ]
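
For anyone who wants to check the pattern outside Notepad++, here is a minimal Python sketch of the same search. It assumes Python's `re` module treats this particular expression the same way Notepad++ does; the function name `find_bracketed_names` is made up for illustration, and the optional group is written as non-capturing so that `findall` returns whole matches.

```python
import re

# Same pattern as the "Find what" box above; the optional middle-initial
# part is non-capturing here so findall() returns the full match text.
BRACKETED_NAME = re.compile(r"\[\w+, \w+(?: \w\.)?\]")


def find_bracketed_names(text):
    """Return every [FIRST, LAST] or [FIRST, LAST M.] found in text."""
    return BRACKETED_NAME.findall(text)


sample = "In [Kirby, Heidt M.] tellus [WORD] nunc [Jessie, Mulford] sagittis"
print(find_bracketed_names(sample))
```

A single-word bracket such as [WORD] is skipped because it contains no comma and space.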
        • Scott Sumner

          So, like all things, this can get a bit complicated…

          I was experimenting with named-groups with this for my own “fun” (trying to make regexes more readable) and came up with this possible replacement example:

          Find what: (?-i)\[(?<first>[A-Z][a-z]+),(?<flsep>\x20|\r\n)(?<last>[A-Z][a-z]+)(?:(?<lmisep>\x20|\r\n)(?<mi>[A-Z])\.)?\]
          Replace with: $+{last},$+{flsep}$+{first}(?{mi}$+{lmisep}$+{mi})
          Search mode: Regular expression

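          This will take [First, Last] or [First, Last M.] and convert it to Last, First or Last, First M respectively (the square brackets and the period after the middle initial are dropped). It also works when a Windows line-ending (\r\n) falls where a space would otherwise be, that is, right after the comma or right before the middle initial.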

          The key point for my “fun” was that a regex group in the “find” part can be named via the syntax (?<my_name>...), then used in the “replace” part via $+{my_name}, or tested for via (?{my_name}...). This “test-for” feature is used in the replace-with expression above to check whether the optional middle initial exists and, if it does, what to insert into the replacement text.
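
The same named-group idea can be sketched in Python, under a few assumptions: Python's `re` module spells a named group `(?P<name>...)`, it has no `(?{name}...)` test in the replacement string, so a small replacement function stands in for it, and `(?-i)` is left out because Python matches case-sensitively by default. The names `NAME_IN_BRACKETS`, `_swap` and `swap_names` are hypothetical.

```python
import re

# Named groups mirror the Notepad++ expression above:
# first, flsep, last, lmisep and mi.
NAME_IN_BRACKETS = re.compile(
    r"\[(?P<first>[A-Z][a-z]+),(?P<flsep>\x20|\r\n)(?P<last>[A-Z][a-z]+)"
    r"(?:(?P<lmisep>\x20|\r\n)(?P<mi>[A-Z])\.)?\]"
)


def _swap(match):
    """Build 'Last, First' and append the middle initial only if present."""
    result = f"{match['last']},{match['flsep']}{match['first']}"
    if match["mi"] is not None:  # plays the role of (?{mi}...)
        result += match["lmisep"] + match["mi"]
    return result


def swap_names(text):
    return NAME_IN_BRACKETS.sub(_swap, text)


print(swap_names("In [Kirby, Heidt M.] tellus [WORD] nunc [Vivan, Shurtliff]"))
```

A function is used for the replacement because it makes the "only if the middle initial matched" branch explicit.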

          Sample input data:

          Lorem ipsum dolor sit amet, [Vivan, Shurtliff] consectetur adipiscing elit.  Ut
          blandit viverra diam luctus luctus.  In [Kirby, Heidt M.] tellus nunc, dapibus id
          gravida vel, lacinia venenatis augue.  Nunc [Jessie, Mulford] sagittis rhoncus
          hendrerit.  Sed vel augue nisi, vel sagittis sem.  [Taren, Fish] Aenean ante
          diam, rutrum ut eleifend in, convallis sed est.  Class due anti [Rhett, Himes
          P.] Pellentesque eu tempor et interdum quis, molestie commodo tempor et interdum
          ante quis metus dictum feugiat.  Ut blandit volutpat [Harland, Hutzler] ante in
          commodo.  Duis quam lorem, lacinia nec tempus non, [Lino, Bureau] tristique sed
          turpis.  In id est mi.  Class aptent taciti [Ivana, Mechem Z.] sociosqu ad litora
          torquent per conubia nostra, per inceptos himenaeos.  [James, Mcbride F.] Nunc
          ipsum libero, tempor et interdum quis, molestie commodo mauris.  [Felecia,
          Menendez] Fusce tempor, felis vel pellentesque luctus, enim lacus sagittis
          arcu, [Bradly, Blackledge] at mollis tellus mauris in dui.  Nunc vel leo velit.
          [Obdulia, Ocana] Aliquam sit amet erat sit amet elit consequat tempor.
          

          Sample output data:

          Lorem ipsum dolor sit amet, Shurtliff, Vivan consectetur adipiscing elit.  Ut
          blandit viverra diam luctus luctus.  In Heidt, Kirby M tellus nunc, dapibus id
          gravida vel, lacinia venenatis augue.  Nunc Mulford, Jessie sagittis rhoncus
          hendrerit.  Sed vel augue nisi, vel sagittis sem.  Fish, Taren Aenean ante
          diam, rutrum ut eleifend in, convallis sed est.  Class due anti Himes, Rhett
          P Pellentesque eu tempor et interdum quis, molestie commodo tempor et interdum
          ante quis metus dictum feugiat.  Ut blandit volutpat Hutzler, Harland ante in
          commodo.  Duis quam lorem, lacinia nec tempus non, Bureau, Lino tristique sed
          turpis.  In id est mi.  Class aptent taciti Mechem, Ivana Z sociosqu ad litora
          torquent per conubia nostra, per inceptos himenaeos.  Mcbride, James F Nunc
          ipsum libero, tempor et interdum quis, molestie commodo mauris.  Menendez,
          Felecia Fusce tempor, felis vel pellentesque luctus, enim lacus sagittis
          arcu, Blackledge, Bradly at mollis tellus mauris in dui.  Nunc vel leo velit.
          Ocana, Obdulia Aliquam sit amet erat sit amet elit consequat tempor.
          

          If anyone is still reading you get internet-points for endurance.

          • guy038

            Hi, @scott-sumner and All,

            Ah, yes, Scott, using named capturing groups is one way to document regexes. But there is another nice way to get readable regexes, with a lot of comments !

            I tried to rewrite your S/R, which uses named groups, with the following template :

            SEARCH :

            (?x)
            (?-i)      # The search is NON-insensitive ( => Sensitive ! )
            \[         # A single opening square bracket ( ESCAPED as special char. )
            (          # Beginning of group 1 ( First Name )
            [A-Z]      # A single capital letter
            [a-z]+     # A NON-null range of lower-case letters
            )          # End of group 1
            ,          # A single comma character
            (          # Beginning of group 2 ( FL separator )
            \x20|\r\n  # A single space character OR the TWO Windows End of Line characters
            )          # End of group 2
            (          # Beginning of group 3 ( Last Name )
            [A-Z]      # A single capital letter
            [a-z]+     # A NON-null range of lower-case letters
            )          # End of group 3
            (?:        # Beginning of an OPTIONAL, non-capturing, group
            (          # Beginning of group 4 ( MI separator )
            \x20|\r\n  # A single space character OR the TWO Windows End of Line characters
            )          # End of group 4
            (          # Beginning of group 5 ( Middle Initial )
            [A-Z]      # A single capital letter
            )          # End of group 5
            \.         # A single dot character ( ESCAPED as special char. )
            )?         # End of the OPTIONAL, non-capturing, group
            \]         # A single ending square bracket ( ESCAPED as special char. )
            

            Unfortunately, this way of writing does NOT work in the replacement part :

            # The replacement part CANNOT be split in SEVERAL lines !!
            #
            # \3,      # Last name is written first, followed by a comma
            # \2       # Then, we add the FL separator
            # \1       # Then, the First name is written
            # ?5       # And if group 5 ( Middle Initial ) exists :
            # \4\5     #     We rewrite group 4 ( MI separator ), followed by group 5 ( Middle Initial )
            

            => REPLACEMENT :

            \3,\2\1?5\4\5
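
As a rough cross-check outside Notepad++, the commented layout can be sketched in Python with the `re.VERBOSE` flag. This is only an illustration under stated assumptions: `(?-i)` is left out because Python is case-sensitive by default, the `?5` conditional has no direct equivalent in a Python replacement template, and the sketch relies on Python 3.5+ replacing unmatched groups with an empty string. `BRACKETED_NAME` and `reorder` are hypothetical names.

```python
import re

BRACKETED_NAME = re.compile(
    r"""
    \[                # opening square bracket
    ([A-Z][a-z]+)     # group 1: first name
    ,                 # a single comma
    (\x20|\r\n)       # group 2: FL separator (space or Windows EOL)
    ([A-Z][a-z]+)     # group 3: last name
    (?:               # optional, non-capturing group
      (\x20|\r\n)     # group 4: MI separator
      ([A-Z])         # group 5: middle initial
      \.              # a literal dot
    )?
    \]                # closing square bracket
    """,
    re.VERBOSE,
)


def reorder(text):
    # Groups 4 and 5 expand to "" when the middle initial is absent.
    return BRACKETED_NAME.sub(r"\g<3>,\g<2>\g<1>\g<4>\g<5>", text)


print(reorder("Sed vel [Taren, Fish] Aenean [Ivana, Mechem Z.] sociosqu"))
```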
            

            Now :

            • Select all the lines of the SEARCH part, above, between (?x) and \]

            • Copy them, in the clipboard, with a Ctrl + C shortcut

            • Paste, first, this selection, in your current file, with a Ctrl + V shortcut

            • Re-select this text, representing the search part

            • Open the Replace dialog ( Ctrl + H )

            • Paste the correct replacement regex, above, in the Replace with: zone

            • Select the Regular expression search mode

            • Click on the Replace All button

            Et voilà !!


            Notes :

            • Once the search part is selected, DON’T copy this selection to the clipboard in order to paste it into the Find what: zone of the Replace dialog ! Simply open the Replace dialog :-) : the Find what: zone will be filled with the selection, automatically :-)

            • The (?x) syntax MUST come first, before the subsequent lines of the regex. This modifier starts the free-spacing mode, where whitespace is ignored and a # character begins a comment that runs to the end of the line

            • As, in this mode, the space character is simply ignored, if you search for a space character, you’ll have to use one of the three following syntaxes : a backslash followed by a space, [ ] or \x20
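
The three ways of writing a space can be tried quickly in Python, whose `re.VERBOSE` flag behaves like free-spacing mode on this point (an assumption about equivalence with Notepad++, not something tested there). The helper `matches_a_space_b` is hypothetical.

```python
import re


def matches_a_space_b(space_syntax):
    """True if 'a' + space_syntax + 'b', compiled verbosely, matches 'a b'."""
    pattern = re.compile("a" + space_syntax + "b", re.VERBOSE)
    return pattern.fullmatch("a b") is not None


# Escaped space, character class, hex escape, then a bare space.
for syntax in (r"\ ", "[ ]", r"\x20", " "):
    print(repr(syntax), matches_a_space_b(syntax))
```

Only the bare space fails, because free-spacing mode drops it and the pattern collapses to `ab`.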

            Best Regards,

            guy038
