Add a new name before and after names that start with a capital letter
-
Hello
I have a very large text and I want to change the words that start with a capital letter. For example:
several games like Sonic, Mario, Spiderman, etc…
Games by excel!
Make it like this:
several games like Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, etc.
Myname1 Games Myname2 by excel!
Thanks for the help
-
@El-kohen-Amal said in Add a new name before and after names that start with a capital letter:
I want to make a change on the words that start with a capital letter
This is fairly simple. Using regular expressions (regex) in the Replace function, we have:
Find What: (\u\w+)
Replace With: Myname1 \1 Myname2
As it’s a regular expression, the search mode MUST be set to Regular expression.
As an explanation:
\u means find an upper case character, so from A to Z.
\w+ means find 1 (or more) word characters. As well as the lower case a to z, it also includes numbers and the underscore _. If you want to include ONLY a to z characters, use \u\l+ in the Find What field instead.
\1 refers to the group captured in the Find What field (inside the ( and ) parentheses).
I suggest you start reading some of the regex documentation listed under the FAQ section of this forum. Get to know the basics of regex. It doesn’t take much to learn this basic expression. There are many edits you can do with only a limited/basic knowledge of regex.
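For anyone who wants to try the same idea outside Notepad++, here is a minimal sketch in Python. It is only an approximation: Python's re module has no \u or \l class, so [A-Z] and [a-z] stand in for them (ASCII letters only), and the function name wrap_capitalised and the "Sonic_2" sample are made up for illustration.

```python
import re

def wrap_capitalised(text, lower_only=False):
    """Wrap every word that starts with an ASCII capital in Myname1 ... Myname2."""
    # [A-Z]\w+    roughly \u\w+ : a capital, then letters, digits or underscore
    # [A-Z][a-z]+ roughly \u\l+ : a capital, then lower-case letters only
    pattern = r"([A-Z][a-z]+)" if lower_only else r"([A-Z]\w+)"
    # \1 in the replacement is the text captured by the parentheses
    return re.sub(pattern, r"Myname1 \1 Myname2", text)

sample = "several games like Sonic, Mario, Spiderman, etc…\nGames by excel!"
print(wrap_capitalised(sample))

# The two patterns differ once digits or underscores follow the letters
print(wrap_capitalised("Sonic_2"))                   # whole token is wrapped
print(wrap_capitalised("Sonic_2", lower_only=True))  # only "Sonic" is wrapped
```

Python matches case-sensitively unless told otherwise, so this sketch does not depend on any dialog setting.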
Terry
PS I should add that \w includes upper and lower case characters, numbers and underscore. \l is ONLY lower case characters, so it is slightly different. I did assume that any word starting with a capital would ONLY have the first letter capitalised, so using \w isn’t a problem. Just bear that in mind.
-
Hi,
I used the code you sent me, and I got the following result on the text:
Myname1 several Myname2 Myname1 games Myname2 Myname1 like Myname2 Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, Myname1 etc Myname2…
Myname1 Games Myname2 Myname1 by Myname2 Myname1 excel Myname2!
I would like the change to be made only on words that begin with a capital letter, like:
several games like Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, etc.
Myname1 Games Myname2 by excel!
Thanks again
-
@El-kohen-Amal said in Add a new name before and after names that start with a capital letter:
I wish that the change will be made only on words that contains at the beginning a capital letter
That’s what @Terry-R’s regular expression will do… but only if you have “Match Case” enabled, or if you use (?-i) at the beginning of the regular expression:
- without match case = finds 10 matches
- with match case = finds 4 matches
- without match case, but with (?-i) prefixing = finds 4 matches
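The three counts above can be checked with a rough Python equivalent. This is a sketch under assumptions, not a reproduction of the Notepad++ engine: re.IGNORECASE plays the role of an unticked “Match case” box, [A-Z] stands in for \u, and count_matches is a hypothetical helper. Python's re only accepts the scoped form (?-i:...), so that form is used in place of a leading (?-i); it needs Python 3.6 or later.

```python
import re

SAMPLE = "several games like Sonic, Mario, Spiderman, etc…\nGames by excel!"

def count_matches(pattern, ignore_case):
    """Count matches in SAMPLE; ignore_case=True mimics an unticked "Match case"."""
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, SAMPLE, flags))

# Without match case: [A-Z] also matches a-z, so every word of 2+ letters counts
print(count_matches(r"[A-Z]\w+", ignore_case=True))

# With match case: only the capitalised words count
print(count_matches(r"[A-Z]\w+", ignore_case=False))

# Without match case, but with case sensitivity forced inside the pattern
print(count_matches(r"(?-i:[A-Z]\w+)", ignore_case=True))
```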
Thus, to ignore the status of the “match case” setting, I would recommend (?-i)(\u\w+) as the Find What search expression. (The replacement is the same as @Terry-R showed; I was just using the Find > Count to show number of matches quickly.)
-
Thank you Terry R and PeterJones for your support!
The codes are working well!
-
Thanks again
-
@El-kohen-Amal said in Add a new name before and after names that start with a capital letter:
I wish that the change will be made only on words that contains at the beginning a capital letter like :
Apologies. I had been playing with both the “match case” button and (?-i) on and off as I was getting some conflicting results. When I finally presented my solution I must have had the match case button still ticked and didn’t realise.
Thanks to @PeterJones for coming to the rescue.
Terry
Aside to @PeterJones and @guy038: This seems counterproductive if a meta character such as \u, which should mean upper case, ALSO needs the match case or (?-i) to make it work correctly. Am I missing an important concept or is this a known issue with the regex engine?
-
@Terry-R said in Add a new name before and after names that start with a capital letter:
Am I missing an important concept or is this a known issue with the regex engine?
My guess is it’s just a quirk of the regex engine. If you wrote [A-Z], you wouldn’t be surprised that case-insensitive match makes it also match a-z… but when you obscure it with \u, it might seem somehow different to an end-user, but to the developer of the engine, “uppercase is uppercase, whether it’s explicit or part of an escape”. Put another way, in my interpretation, \u is just a shorthand notation for something that really is effectively equivalent to [A-Z] (or its much larger Unicode equivalent) – but I don’t know whether the implementation makes it exactly equivalent to the described character set or not.
-
Hello, @terry-r, @peterjones and all,
Surprising! I’d never really noticed this peculiarity!
And, indeed, whichever of the valid Boost syntaxes below is used: [[:upper:]], [[:u:]], \p{upper}, \p{u}, \pu, \u, [A-Z], [\x41-\x5A], [\x{41}-\x{5A}] or [\x{0041}-\x{005A}], they all select one uppercase or lowercase letter, if you don’t tick the Match case option AND don’t have any (?-i) modifier in the regex!
I also tried these forms [@-\\[] and [\x40-\x5B], so one char before A till one char after Z. But, oddly, in that case, it just matches the @ and the [ characters!
Now, if you think about a consecutive range of characters, by code-point, you could suppose, at first sight, that the regex [A-Z] would match only upper-case letters, since they form a contiguous set of upper-case characters.
But what about the Unicode block Latin Extended-A, for instance? It contains all these characters:
ĀāĂ㥹ĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ
where the upper-case and lower-case letters are mixed. So the search for a contiguous range of letters, with the same case, seems impossible, as our N++ Boost library is not built with full Unicode support!
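The interleaving of cases in that block can be seen with a short Python sketch. Note the assumption: it relies on Python's own Unicode tables (str.isupper and str.islower), not on Boost or Notepad++, and longest_upper_run is a hypothetical helper written just for this comparison.

```python
def longest_upper_run(chars):
    """Length of the longest run of consecutive upper-case characters."""
    best = current = 0
    for ch in chars:
        current = current + 1 if ch.isupper() else 0
        best = max(best, current)
    return best

# Basic Latin capitals A..Z occupy U+0041..U+005A
ascii_capitals = [chr(cp) for cp in range(0x41, 0x5B)]
# Latin Extended-A occupies U+0100..U+017F
latin_ext_a = [chr(cp) for cp in range(0x0100, 0x0180)]

# In ASCII all 26 capitals sit side by side, so [A-Z] is one clean range
print(longest_upper_run(ascii_capitals))

# In Latin Extended-A each capital is followed almost at once by a
# lower-case letter, so no code-point range isolates a single case
print(latin_ext_a[0], latin_ext_a[0].isupper())
print(latin_ext_a[1], latin_ext_a[1].islower())
print(longest_upper_run(latin_ext_a))
```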
In conclusion, I would say that, whenever we need to distinguish the case of letters in regexes, either:
- Tick the Match case option, in dialogs
- Insert a (?-i) modifier within the regex
For instance, searching for 1 upper-case letter at a time inside the Latin Extended-A Unicode block above works, as expected, with the regex (?-i)\u
To my mind, Peter’s interpretation seems the most plausible!
Best Regards,
guy038
-
@guy038 said,
regex (?-i)\u
ooh, as I looked at that juxtaposition this time, I realized: if we move the character escape sequence inside the option-parentheses, then @Terry-R can have the “must be this case, no matter what options are elsewhere” expressions: use (?-i:\u) for must-be-upper-no-matter-what and (?-i:\l) for must-be-lower-no-matter-what. It’s more to write, and we’ll forget that just as often as we forget to use (?-is) at the beginning of all our regex help, but…
-
Hi, @terry-r, @peterjones and all,
Nice find Peter ;-)) Indeed, if we consider the contents of a non-capturing group (?-i:.............):
- Any regex literal letter, inside this non-capturing group, matches the letter itself, with its exact case
- Any regex \u syntax, inside this non-capturing group, matches one upper-case letter
- Any regex \l syntax, inside this non-capturing group, matches one lower-case letter
- Any regex [A-Z] syntax, inside this non-capturing group, matches one letter, between A and Z
- Any regex [a-z] syntax, inside this non-capturing group, matches one letter, between a and z
And this holds whatever i modifier is used earlier in the overall regex, and whether or not the Match case option is ticked!
BR
guy038
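The scoped group can also be tried in Python, whose re module supports the (?-i:...) syntax from version 3.6. This is a minimal sketch under assumptions: [A-Z] stands in for \u (which Python's re does not have), re.IGNORECASE imitates an unticked Match case option, and the TEXT sample and find helper are invented for the example.

```python
import re

TEXT = "sonic Sonic SONIC"

def find(pattern):
    """Search TEXT case-insensitively, as if "Match case" were unticked."""
    return re.findall(pattern, TEXT, re.IGNORECASE)

# Case is ignored everywhere, so all three words match
print(find(r"[A-Z]\w+"))

# Inside (?-i:...) the range is case-sensitive again: the first letter
# must really be upper case, while \w+ outside the group is unaffected
print(find(r"(?-i:[A-Z])\w+"))

# A literal letter inside the group keeps its exact case as well;
# "onic" outside the group is still matched case-insensitively
print(find(r"(?-i:S)onic"))
```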
-
@PeterJones said in Add a new name before and after names that start with a capital letter:
use (?-is) at the beginning
My rule of thumb is becoming to type that, first thing after calling up the Find window, when I know I’m doing regex.
I can always modify it later, but it is a good starting point.
The discussion in this thread about case bears that out.