Add a new name before and after names that start with a capital letter



  • Hello

    I have a very large text and I want to make a change on the words that start with a capital letter, for example:

    several games like Sonic, Mario, Spiderman, etc…
    Games by excel!

    Make it like this:

    several games like Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, etc.
    Myname1 Games Myname2 by excel!

    Thanks for the help



  • @El-kohen-Amal said in Add a new name before and after names that start with a capital letter:

    I want to make a change on the words that start with a capital letter

    This is fairly simple. Using regular expressions (regex) in the Replace dialog, we have:
    Find What: (\u\w+)
    Replace With: Myname1 \1 Myname2

    Because this is a regular expression, the Search Mode MUST be set to Regular expression.

    As an explanation:
    \u means find an upper case character, so from A to Z.
    \w+ means find 1 (or more) word characters. As well as the lower case a to z, word characters also include upper case letters, numbers and the underscore _. If you want ONLY lower case letters after the capital, use \u\l+ in the Find What field instead.
    \1 refers to the group captured in the Find What field (inside the ( )).

    I suggest you start reading some of the regex documentation listed under the FAQ section of this forum. Get to know the basics of regex. It doesn’t take much to learn this basic expression. There are many edits you can do with only a limited/basic knowledge of regex.

    Terry

    PS I should add that the \w includes upper and lower case characters, numbers and underscore. The \l is ONLY lower case characters so it is slightly different. I did assume that any word starting with a capital would ONLY have the first letter capitalised so using \w isn’t a problem. Just bear that in mind.
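
    For readers who want to try the same replacement outside Notepad++, here is a minimal sketch in Python. It is an approximation, not the Boost regex above: Python's re module has no \u shorthand, so [A-Z] stands in for it (ASCII capitals only), and the function name wrap_capitalised is made up for this illustration.

```python
import re

# Python's re has no \u shorthand; [A-Z] is an ASCII-only stand-in for it.
# Matching is case-sensitive by default, which corresponds to "Match case" ticked.
CAPITALISED_WORD = re.compile(r"([A-Z]\w+)")

def wrap_capitalised(text, before="Myname1", after="Myname2"):
    """Surround every word that starts with an ASCII capital with two names."""
    return CAPITALISED_WORD.sub(
        lambda m: f"{before} {m.group(1)} {after}", text
    )

sample = "several games like Sonic, Mario, Spiderman, etc…\nGames by excel!"
print(wrap_capitalised(sample))
```

    Like the original expression, this needs at least one word character after the capital, so a one-letter word such as "A" is left alone.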



  • Hi,

    I used the code you sent me, and I got the following result:

    Myname1 several Myname2 Myname1 games Myname2 Myname1 like Myname2 Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, Myname1 etc Myname2…
    Myname1 Games Myname2 Myname1 by Myname2 Myname1 excel Myname2!

    I would like the change to be made only on words that begin with a capital letter, like this:

    several games like Myname1 Sonic Myname2, Myname1 Mario Myname2, Myname1 Spiderman Myname2, etc.
    Myname1 Games Myname2 by excel!

    Thanks again



  • @El-kohen-Amal said in Add a new name before and after names that start with a capital letter:

    I would like the change to be made only on words that begin with a capital letter

    That’s what @Terry-R’s regular expression will do… but only if you have “Match Case” enabled, or if you use (?-i) at the beginning of the regular expression:

    • without match case = finds 10 matches
    • with match case = finds 4 matches
    • without match case, but with the (?-i) prefix = finds 4 matches

    Thus, to make the search independent of the “match case” setting, I would recommend (?-i)(\u\w+) as the Find What search expression. (The replacement is the same as @Terry-R showed; I was just using Find > Count to show the number of matches quickly.)
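
    The three counts can be reproduced in a rough Python analogue. This is only a sketch under stated assumptions: re.IGNORECASE models "Match case" unticked, [A-Z] replaces \u (which Python's re does not support), and the scoped form (?-i:...) is used because, as far as I know, Python does not accept a bare leading (?-i).

```python
import re

sample = "several games like Sonic, Mario, Spiderman, etc…\nGames by excel!"

# "Match case" unticked is modelled by re.IGNORECASE: [A-Z] then matches a-z too.
insensitive = re.findall(r"[A-Z]\w+", sample, flags=re.IGNORECASE)

# "Match case" ticked: Python's default matching is case-sensitive.
sensitive = re.findall(r"[A-Z]\w+", sample)

# Flag still on, but switched off locally by a scoped (?-i:...) group.
scoped = re.findall(r"(?-i:[A-Z]\w+)", sample, flags=re.IGNORECASE)

print(len(insensitive), len(sensitive), len(scoped))
```

    The point carried over from the screenshots is that an explicit upper-case class behaves just like the shorthand: it only means "upper case" once case-insensitive matching is off.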



  • Thank you Terry R and PeterJones for your support!
    The regexes are working well!



  • Thanks again



  • @El-kohen-Amal said in Add a new name before and after names that start with a capital letter:

    I would like the change to be made only on words that begin with a capital letter

    Apologies. I had been playing with both the “match case” button and (?-i) on and off as I was getting some conflicting results. When I finally presented my solution I must have had the match case button still ticked and didn’t realise.

    Thanks to @PeterJones for coming to the rescue.

    Terry

    Aside to @PeterJones and @guy038: it seems counterproductive that a meta character such as \u, which should mean upper case, ALSO needs match case or (?-i) to work correctly. Am I missing an important concept or is this a known issue with the regex engine?



  • @Terry-R said in Add a new name before and after names that start with a capital letter:

    Am I missing an important concept or is this a known issue with the regex engine?

    My guess is it’s just a quirk of the regex engine. If you wrote [A-Z], you wouldn’t be surprised that a case-insensitive match makes it also match a-z. When you obscure it with \u, it might seem somehow different to an end-user, but to the developer of the engine, “uppercase is uppercase, whether it’s explicit or part of an escape”. Put another way, in my interpretation, \u is just shorthand notation for something that is effectively equivalent to [A-Z] (or its much larger Unicode equivalent), but I don’t know whether the implementation makes it exactly equivalent to the described character set or not.



  • Hello, @terry-r, @peterjones and all,

    Surprising! I’d never really noticed this peculiarity!

    And indeed, whichever of the valid Boost syntaxes below you use:

    [[:upper:]], [[:u:]], \p{upper}, \p{u}, \pu, \u, [A-Z], [\x41-\x5A], [\x{41}-\x{5A}] or [\x{0041}-\x{005A}]

    they all match one letter of either case, upper or lower, if you don’t tick the Match case option AND don’t have a (?-i) modifier in the regex!

    I also tried the forms [@-\[] and [\x40-\x5B], which run from one character before A to one character after Z. Oddly, in that case, the regex matches just the @ and [ characters!


    Now, if you think of a consecutive range of characters, by code point, you could suppose, at first sight, that the regex [A-Z] would match only upper-case letters, since A to Z is a contiguous set of upper-case characters.

    But what about the Unicode block Latin Extended-A, for instance? It contains all these characters:

    ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ

    where the upper-case and lower-case letters are interleaved. So searching for a contiguous range of letters with the same case seems impossible, as our N++ Boost library is not built with full Unicode support!


    In conclusion, I would say that, whenever we need to distinguish the case of letters in regexes, we should either:

    • Tick the Match case option, in dialogs

    • Insert a (?-i) modifier within the regex

    For instance, searching for one upper-case letter at a time, inside the Latin Extended-A Unicode block above, works as expected with the regex (?-i)\u
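
    To see how the two cases interleave in that block, here is a small Python sketch. It relies on Python's own Unicode tables (str.isupper and str.islower), not on Boost, and the helper name split_by_case is an assumption made for this illustration.

```python
def split_by_case(first=0x0100, last=0x017F):
    """Sort the letters of a code-point range (default: Latin Extended-A) by case."""
    uppers, lowers = [], []
    for code_point in range(first, last + 1):
        char = chr(code_point)
        if char.isupper():
            uppers.append(char)
        elif char.islower():
            lowers.append(char)
    return uppers, lowers

uppers, lowers = split_by_case()
print("".join(uppers))
print("".join(lowers))

# For comparison, A..Z is a contiguous run containing upper-case letters only.
ascii_uppers, ascii_lowers = split_by_case(0x41, 0x5A)
print(len(ascii_uppers), len(ascii_lowers))
```

    In Latin Extended-A, the first lower-case letter comes long before the last upper-case one, so no single code-point range can select one case there.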

    To my mind, Peter’s interpretation seems the most plausible!

    Best Regards,

    guy038



  • @guy038 said,

    regex (?-i)\u

    Ooh, as I looked at that juxtaposition this time, I realized: if we move the character escape sequence inside the option-parentheses, then @Terry-R can have the “must be this case, no matter what options are elsewhere” expressions: use (?-i:\u) for must-be-upper-no-matter-what and (?-i:\l) for must-be-lower-no-matter-what. It’s more to write, and we’ll forget that just as often as we forget to use (?-is) at the beginning of all our regex help, but…



  • Hi, @terry-r, @peterjones and all,

    Nice find, Peter ;-)) Indeed, if we consider the contents of a non-capturing group (?-i:.............):

    • Any literal letter inside this non-capturing group matches that letter with its exact case

    • Any \u syntax inside this non-capturing group matches one upper-case letter

    • Any \l syntax inside this non-capturing group matches one lower-case letter

    • Any [A-Z] syntax inside this non-capturing group matches one letter between A and Z

    • Any [a-z] syntax inside this non-capturing group matches one letter between a and z

    And this holds regardless of any i modifier used earlier in the overall regex, and whether or not the Match case option is ticked!
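
    Python's re module offers the same scoped syntax, so the idea can be sketched there too. This is an analogue only, with [A-Z] and [a-z] standing in for \u and \l, which Python does not support; Boost may differ in detail.

```python
import re

text = "Sonic sonic SONIC"

# Whole pattern case-insensitive: the first-letter class matches either case.
loose = re.findall(r"[A-Z][a-z]+", text, flags=re.IGNORECASE)

# Only the first letter is pinned to upper case by the scoped (?-i:...) group;
# the rest of the pattern stays case-insensitive.
strict_first = re.findall(r"(?-i:[A-Z])[a-z]+", text, flags=re.IGNORECASE)

print(loose)
print(strict_first)
```

    With the scoped group, "sonic" is no longer matched, while "Sonic" and "SONIC" still are, because only the first letter is forced to be upper case.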

    BR

    guy038



  • @PeterJones said in Add a new name before and after names that start with a capital letter:

    use (?-is) at the beginning

    My rule of thumb is becoming: type that first thing after calling up the Find window, whenever I know I’m doing regex.

    I can always modify it later, but it is a good starting point.
    The discussion in this thread about case bears that out.

