Changing text between square brackets
-
This question likely has a simple answer, but I know very little about Notepad. I have a number of large documents in .txt. I need to replace text (specifically names FIRST, LAST) appearing between square brackets, but no other text appearing between square brackets. As indicated, the names show as [FIRST, LAST] and sometimes [FIRST, LAST MI.] The text I wish to leave untouched appears uniformly as [WORD]. Thanks in advance for assisting a neophyte.
-
You don’t say what your replacement looks like, but as far as finding the matches you say you’re looking for, search this way:
Find what:
\[\w+, \w+( \w\.)?\]
Search mode: Regular expression
The basic interpretation of this is:
- Find a [
- Followed by one or more “word” characters (which are A-Z, a-z, 0-9, and _) – you don’t need 0-9 and _ but likely no harm in it
- Followed by a comma and space
- Followed by one or more “word” chars
- Followed by an optional space+word-char+period
- Followed by a ]
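For anyone who wants to check the pattern outside the editor, here is a minimal sketch using Python's re module (an assumption on my part: this is not the regex engine the editor uses, but this particular pattern is written the same way in both). The sample sentence is made up for illustration.

```python
import re

# Same pattern as in the "Find what" box above.
NAME_IN_BRACKETS = re.compile(r"\[\w+, \w+( \w\.)?\]")

# Hypothetical sample text: one plain [WORD] and two bracketed names.
sample = "See [WORD] and [Kirby, Heidt M.] and [Taren, Fish] here."

# [WORD] has no comma-plus-space after the first word, so it never matches.
matches = [m.group(0) for m in NAME_IN_BRACKETS.finditer(sample)]
print(matches)
```

Only the two bracketed names are reported; [WORD] is left alone.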
-
So, like all things, this can get a bit complicated…
I was experimenting with named-groups with this for my own “fun” (trying to make regexes more readable) and came up with this possible replacement example:
Find what:
(?-i)\[(?<first>[A-Z][a-z]+),(?<flsep>\x20|\r\n)(?<last>[A-Z][a-z]+)(?:(?<lmisep>\x20|\r\n)(?<mi>[A-Z])\.)?\]
Replace with:
$+{last},$+{flsep}$+{first}(?{mi}$+{lmisep}$+{mi})
Search mode: Regular expression
This will take [First, Last] or [First, Last M.] and convert it to Last, First or Last, First M respectively. It will work if a Windows line-ending (\r\n) occurs at a reasonable place inside the find string.
The key point for my “fun” was that a regex grouping in the “find” part can be named via this syntax:
(?<my_name>...)
and can be used in the “replace” part via $+{my_name}, or tested for in this manner: (?{my_name}...)
(this “test-for” feature is used in the earlier replace-with expression to see if the optional middle initial exists…and if so, what to insert into the replacement text).
Sample input data:
Lorem ipsum dolor sit amet, [Vivan, Shurtliff] consectetur adipiscing elit. Ut blandit viverra diam luctus luctus. In [Kirby, Heidt M.] tellus nunc, dapibus id gravida vel, lacinia venenatis augue. Nunc [Jessie, Mulford] sagittis rhoncus hendrerit. Sed vel augue nisi, vel sagittis sem. [Taren, Fish] Aenean ante diam, rutrum ut eleifend in, convallis sed est. Class due anti [Rhett, Himes P.] Pellentesque eu tempor et interdum quis, molestie commodo tempor et interdum ante quis metus dictum feugiat. Ut blandit volutpat [Harland, Hutzler] ante in commodo. Duis quam lorem, lacinia nec tempus non, [Lino, Bureau] tristique sed turpis. In id est mi. Class aptent taciti [Ivana, Mechem Z.] sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. [James, Mcbride F.] Nunc ipsum libero, tempor et interdum quis, molestie commodo mauris. [Felecia, Menendez] Fusce tempor, felis vel pellentesque luctus, enim lacus sagittis arcu, [Bradly, Blackledge] at mollis tellus mauris in dui. Nunc vel leo velit. [Obdulia, Ocana] Aliquam sit amet erat sit amet elit consequat tempor.
Sample output data:
Lorem ipsum dolor sit amet, Shurtliff, Vivan consectetur adipiscing elit. Ut blandit viverra diam luctus luctus. In Heidt, Kirby M tellus nunc, dapibus id gravida vel, lacinia venenatis augue. Nunc Mulford, Jessie sagittis rhoncus hendrerit. Sed vel augue nisi, vel sagittis sem. Fish, Taren Aenean ante diam, rutrum ut eleifend in, convallis sed est. Class due anti Himes, Rhett P Pellentesque eu tempor et interdum quis, molestie commodo tempor et interdum ante quis metus dictum feugiat. Ut blandit volutpat Hutzler, Harland ante in commodo. Duis quam lorem, lacinia nec tempus non, Bureau, Lino tristique sed turpis. In id est mi. Class aptent taciti Mechem, Ivana Z sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Mcbride, James F Nunc ipsum libero, tempor et interdum quis, molestie commodo mauris. Menendez, Felecia Fusce tempor, felis vel pellentesque luctus, enim lacus sagittis arcu, Blackledge, Bradly at mollis tellus mauris in dui. Nunc vel leo velit. Ocana, Obdulia Aliquam sit amet erat sit amet elit consequat tempor.
If anyone is still reading you get internet-points for endurance.
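As a rough cross-check of the named-group idea outside the editor, here is a sketch in Python's re module. Two things differ from the syntax above and are assumptions of this sketch, not something the editor accepts: Python spells a named group (?P<name>...), and it has no conditional in the replacement string, so a small helper function (swap_names, a name invented here) does the "is the middle initial there?" test instead.

```python
import re

# Python spelling of the named-group search; matching is case-sensitive by default.
PATTERN = re.compile(
    r"\[(?P<first>[A-Z][a-z]+),(?P<flsep> |\r\n)(?P<last>[A-Z][a-z]+)"
    r"(?:(?P<lmisep> |\r\n)(?P<mi>[A-Z])\.)?\]"
)

def swap_names(match):
    """Return 'Last, First', plus the middle initial only if it was captured."""
    out = match["last"] + "," + match["flsep"] + match["first"]
    if match["mi"] is not None:  # plays the role of the (?{mi}...) test
        out += match["lmisep"] + match["mi"]
    return out

text = "In [Kirby, Heidt M.] tellus nunc. Nunc [Jessie, Mulford] sagittis. Keep [WORD]."
result = PATTERN.sub(swap_names, text)
print(result)
```

The two names are swapped and lose their brackets, while [WORD] is untouched because it has no lower-case letters after the capital.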
-
Hi, @scott-sumner and All,
Ah, yes, Scott, using named capturing groups is one way to get documented regexes. But there is another nice way to get correct regexes, with a lot of comments!
I tried to rewrite your named-group S/R using the following template:
SEARCH :
(?x)
(?-i)            # The search is NON-insensitive ( => Sensitive ! )
\[               # A single opening square bracket ( ESCAPED as special char. )
(                # Beginning of group 1 ( First Name )
  [A-Z]          # A single capital letter
  [a-z]+         # A NON-null range of lower-case letters
)                # End of group 1
,                # A single comma character
(                # Beginning of group 2 ( FL separator )
  \x20|\r\n      # A single space character OR the TWO Windows End of Line characters
)                # End of group 2
(                # Beginning of group 3 ( Last Name )
  [A-Z]          # A single capital letter
  [a-z]+         # A NON-null range of lower-case letters
)                # End of group 3
(?:              # Beginning of an OPTIONAL, non-capturing, group
  (              # Beginning of group 4 ( MI separator )
    \x20|\r\n    # A single space character OR the TWO Windows End of Line characters
  )              # End of group 4
  (              # Beginning of group 5 ( Middle Initial )
    [A-Z]        # A single capital letter
  )              # End of group 5
  \.             # A single dot character ( ESCAPED as special char. )
)?               # End of the OPTIONAL, non-capturing, group
\]               # A single ending square bracket ( ESCAPED as special char. )
Unfortunately, this way of writing does NOT work in the replacement part :
# The replacement part CANNOT be split in SEVERAL lines !!
#
# \3,     # Last name is written first, followed by a comma
# \2      # Then, we add the FL separator
# \1      # Then, the First name is written
# ?5      # And if group 5 ( Middle Initial ) exists :
# \4\5    # We rewrite group 4 ( MI separator ), followed by group 5 ( Middle Initial )
=>
REPLACEMENT :
\3,\2\1?5\4\5
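For comparison, the same commented, free-spacing layout can be sketched with Python's re.VERBOSE flag. This is only an illustration under stated assumptions: Python's re module is not the editor's engine, and since Python has no ?5 conditional in replacement strings, a helper function (replacement, a name made up here) stands in for \3,\2\1?5\4\5.

```python
import re

# Free-spacing search with numbered groups, mirroring the SEARCH part above.
SEARCH = re.compile(r"""
    \[                    # opening square bracket
    ( [A-Z][a-z]+ )       # group 1: first name
    ,                     # comma
    ( \x20 | \r\n )       # group 2: FL separator (space or Windows line ending)
    ( [A-Z][a-z]+ )       # group 3: last name
    (?:                   # optional, non-capturing group
        ( \x20 | \r\n )   # group 4: MI separator
        ( [A-Z] )         # group 5: middle initial
        \.                # literal dot
    )?
    \]                    # closing square bracket
""", re.VERBOSE)

def replacement(m):
    """Stand-in for \\3,\\2\\1 followed by the conditional ?5\\4\\5."""
    tail = (m.group(4) + m.group(5)) if m.group(5) is not None else ""
    return m.group(3) + "," + m.group(2) + m.group(1) + tail

swapped = SEARCH.sub(replacement, "[Rhett, Himes P.] and [Lino, Bureau]")
print(swapped)
```

The comments live only in the search pattern; the replacement logic has to be written compactly, which matches the limitation described above.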
Now :
- Select all the lines of the SEARCH part, above, between (?x) and \]
- Copy them to the clipboard with the Ctrl + C shortcut
- First, paste this selection into your current file with the Ctrl + V shortcut
- Re-select this text, representing the search part
- Open the Replace dialog ( Ctrl + H )
- Paste the correct replacement regex, above, in the Replace with: zone
- Select the Regular expression search mode
- Click on the Replace All button
Et voilà !!
Notes :
- Once the search part is selected, DON’T copy it to the clipboard in order to paste it into the Find what: zone of the Replace dialog! Simply open the Replace dialog :-) The selection is filled into the Find what: zone automatically :-)
- The (?x) syntax MUST come first, before the subsequent lines of the regex. This modifier starts a free-spacing and comment way of writing regexes, with a # character beginning each comment part
- As the space character is simply ignored in this mode, if you search for a space character, you’ll have to use one of the three following syntaxes: a backslash followed by a space, [ ] or \x20
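The point about spaces is easy to verify with a small sketch. Again this uses Python's re.VERBOSE flag as a stand-in for (?x), which is an assumption on my part about the two engines behaving alike here; the name in the test string is borrowed from the sample data.

```python
import re

text = "Lino Bureau"

# Four ways of writing the space between the two words in free-spacing mode.
variants = {
    "plain space": r"Lino Bureau",         # the space is ignored, so this means "LinoBureau"
    "backslash-space": r"Lino\ Bureau",    # escaped space is kept
    "bracketed space": r"Lino[ ]Bureau",   # space inside a character class is kept
    "hex escape": r"Lino\x20Bureau",       # \x20 is the space character
}

results = {name: bool(re.search(pattern, text, re.VERBOSE))
           for name, pattern in variants.items()}
print(results)
```

Only the plain, unescaped space fails to match; the three other spellings all find the text.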
Best Regards,
guy038
-