Regex to Find Consecutive Uppercase Letters

Dick Adams 0

I’m trying to write a regular expression to detect consecutive upper case letters in
an HTML <title>, <h1> or <h2> tag. This is what I’ve come up with so far:

<(title|h\d)>.*?[[:upper:]]{2,}.*?</\1

But this regex returns a false positive when run against this string:

<title>A Very Nice Title</title>

As you can see, the title doesn’t contain consecutive upper case letters.

What am I doing wrong?

(FYI, the forum software won’t let me scroll down to the field where I would enter tags.)

PeterJones

@Dick-Adams-0 ,

As you can see, the title doesn’t contain consecutive upper case letters.

What’s the status of the ☑ Match Case checkbox? If it’s not set to match case, then [[:upper:]] will match either upper or lower case, just as [A-Z] would, in the same way that A matches either A or a when the search is case insensitive. If you want to truly restrict to [[:upper:]], you will have to use a case-sensitive match, either through the ☑ Match Case checkbox or by putting (?-i) in the regex to turn off case-insensitive matching.
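For anyone who wants to see the effect outside Notepad++, here is a minimal sketch in Python. It is only an analogy: Python's re module is a different engine from the Boost one Notepad++ uses, it has no [[:upper:]], so [A-Z] stands in for it, and it only accepts the scoped form (?-i:...) rather than a bare (?-i). The helper name matches is made up for this illustration.

```python
import re

# [A-Z] stands in for [[:upper:]], which Python's re module does not support.
PATTERN = r"<(title|h\d)>.*?[A-Z]{2,}.*?</\1"
# Same pattern, but the letter class is forced to be case sensitive
# even when the search as a whole ignores case.
SCOPED = r"<(title|h\d)>.*?(?-i:[A-Z]{2,}).*?</\1"


def matches(pattern, text, ignore_case):
    """Return True if the pattern is found anywhere in text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None


title = "<title>A Very Nice Title</title>"

# Case-insensitive search: [A-Z] also matches lowercase, so "Ve" counts.
print(matches(PATTERN, title, ignore_case=True))   # True (the false positive)
# Case-sensitive search: no two adjacent uppercase letters, so no match.
print(matches(PATTERN, title, ignore_case=False))  # False
# Case-insensitive search with the class scoped back to case sensitive.
print(matches(SCOPED, title, ignore_case=True))    # False
```

The first call mirrors a search with Match Case unticked, and the last one mirrors adding (?-i) to the regex.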

the forum software won’t let me scroll down to the field where I would enter tags

The “tag” feature on posts is rather irrelevant. I doubt anyone actually uses tags when reading the forum to answer, and I would be surprised if anyone bothered with them when looking for previous conversations here (in my eight years as a regular of this forum, I’ve never found the tags useful).

Besides, as far as I know, you don’t have to scroll. You just click where it says “Enter tags here, between 3 and 15 characters each”, and start typing, and that’s how you enter the tag.

guy038

Hello, @dick-adams-0, @peterjones and All,

@dick-adams-0, if you want to match all the words, in the title or the h[1-6] sections, which:

Are not an uppercase letter followed by lowercase letters (or a single uppercase letter), use the following regex search:
- SEARCH (?-is)(?:\b(\u\l+|\u|title|h[1-6])\b(*SKIP)(*F)|\b\w+\b)(?=.*</(title|h[1-6])>)

Are not an uppercase letter followed by lowercase letters (or a single uppercase letter) and are not all-lowercase words either, use:
- SEARCH (?-is)(?:\b(\u?\l+|\u|title|h[1-6])\b(*SKIP)(*F)|\b\w+\b)(?=.*</(title|h[1-6])>)

Test these regexes against the text below:

<title>This is A VEry niCE and Small TiTle</title> just a SMALL Test

<h2>An otHEr Title, too</h2>

<text> This is AN old pArt of tExT</text>

<h7>Just a TEST</h7>

<title>This is A VEry niCE and Small TiTle</title> just a SMALL Test

<text> This is AN old pArt of tExT</text>

<h2>An otHEr Title, too</h2>
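As a rough cross-check of which words the two searches should pick out, here is a sketch in Python. It does not use the same technique: Python's re has no (*SKIP)(*F), \u or \l, so this version first extracts the content of each title or h1 to h6 element and then filters the words one by one, with ASCII [A-Z] and [a-z] standing in for \u and \l. All names (odd_words, ELEMENT, WORD, CAPITALIZED, CAPITALIZED_OR_LOWER) are made up for this illustration.

```python
import re

# Content of a <title> or <h1>..<h6> element that sits on one line.
ELEMENT = re.compile(r"<(title|h[1-6])>(.*?)</\1>")
WORD = re.compile(r"\b\w+\b")

# Words the first search skips: "Title"-style words and a lone capital.
CAPITALIZED = re.compile(r"[A-Z][a-z]+|[A-Z]")
# Words the second search skips: the above plus all-lowercase words.
CAPITALIZED_OR_LOWER = re.compile(r"[A-Z]?[a-z]+|[A-Z]")

SAMPLE = "\n".join([
    "<title>This is A VEry niCE and Small TiTle</title> just a SMALL Test",
    "<h2>An otHEr Title, too</h2>",
    "<text> This is AN old pArt of tExT</text>",
    "<h7>Just a TEST</h7>",
])


def odd_words(text, allowed):
    """List words inside title/h1-h6 elements that 'allowed' does not fully match."""
    found = []
    for element in ELEMENT.finditer(text):
        for word in WORD.findall(element.group(2)):
            if not allowed.fullmatch(word):
                found.append(word)
    return found


print(odd_words(SAMPLE, CAPITALIZED))
# ['is', 'VEry', 'niCE', 'and', 'TiTle', 'otHEr', 'too']
print(odd_words(SAMPLE, CAPITALIZED_OR_LOWER))
# ['VEry', 'niCE', 'TiTle', 'otHEr']
```

Nothing is reported from the <text> and <h7> lines, or from the text after the closing </title>, because only title and h1 to h6 content is examined.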

Best Regards

guy038

Dick Adams 0

@PeterJones Well, I’ll be hornswoggled!

Adding (?-i) was all it took. I didn’t realize the case sensitivity flags had that effect. Never too old to learn something new!

Thanks for the assist!

Alan Kilborn

@Dick-Adams-0 said in Regex to Find Consecutive Uppercase Letters:

I didn’t realize the case sensitivity flags had that effect

A number of people have argued that for regex the Match case checkbox shouldn’t apply, and, unless the (?i) flag is used in the regex, that all regexes should be case sensitive.