Hello, @dave-dyet, @terry-r, @peterjones and All,
Here is a 3rd possible solution :
SEARCH (?s-i).+?(\u.+?)(?=<)|(?s).+
REPLACE ?1\1\r\n ( OR ?1\1\n for Unix files )
Notes :
The remainder of the text, at the very end of the file, is simply wiped out. Indeed, when the second alternative (?s).+ matches, group 1 does not exist. So, because of the conditional replacement ?1..., that match is replaced with nothing.
I used the \u syntax which, when the search is case-sensitive, matches any uppercase letter of any Western Unicode script ( Latin, Greek, Cyrillic,… ). It’s probably unnecessary, as, in English, no country name begins with an accented character, anyway ! However, in this specific case, writing (?-i)\u is as easy as writing (?-i)[A-Z] ! Refer to the list of sovereign states, below :
https://en.wikipedia.org/wiki/List_of_sovereign_states
And we get the text, below :
Africa
Angola
Argentina
Armenia
Asia
Australia
Australia > New South Wales
Australia > Northern Territory
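For anyone who wants to try the logic of this 3rd version outside Notepad++, here is a minimal Python sketch. It is only an approximation : Python's re module has no \u class and no ?1 conditional replacement, so [A-Z] stands in for \u and a small callback emulates ?1\1\n. The function name extract_items_v3 and the SAMPLE text are hypothetical, made up for the illustration, and are not the original poster's data.

```python
import re

# Python approximation of the Notepad++ search:  (?s-i).+?(\u.+?)(?=<)|(?s).+
# Assumption: [A-Z] replaces Boost's \u class, which Python's re does not provide.
# re.DOTALL plays the role of (?s); Python searches are case-sensitive by default.
V3_PATTERN = re.compile(r".+?([A-Z].+?)(?=<)|.+", re.DOTALL)


def extract_items_v3(text, eol="\n"):
    """Keep each captured item on its own line and drop everything else."""

    def replace(match):
        # Emulates the conditional replacement: group 1 + line-break if it
        # exists, otherwise an empty string (the match is simply deleted).
        item = match.group(1)
        return item + eol if item is not None else ""

    return V3_PATTERN.sub(replace, text)


# Hypothetical sample, NOT the original poster's file.
SAMPLE = (
    '<select><option value="1">Africa</option>'
    '<option value="2">Angola</option></select>'
)

print(extract_items_v3(SAMPLE), end="")
```

Note that, as with the original regex, any uppercase letter located inside a tag or an attribute would be taken for the start of an item, so the sketch assumes lowercase markup.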
Peter, from your solution, I built a new version, which can do the whole job, in one go ;-)) So, here is the 4th version :
SEARCH (?-s)<.+?>|^\h*\R?|(.+?)(?=<)
REPLACE ?1\1\r\n ( OR ?1\1\n for Unix files )
Notes :
This regex also allows the relevant items to begin with a lowercase letter !
If group 1 does not exist, then the <.....> blocks OR any leading blank chars, followed by an optional line-break, are deleted
If group 1 does exist, then the different items of the drop-down list are listed, as usual, one per line
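The two cases described in these notes can be sketched in Python, too. Again, this is an approximation under stated assumptions : [ \t] stands in for \h, (?:\r\n|\r|\n) stands in for \R, and a callback emulates the ?1 conditional replacement, none of which Python's re module offers natively. The name extract_items_v4 and the SAMPLE text are hypothetical.

```python
import re

# Python approximation of the Notepad++ search:  (?-s)<.+?>|^\h*\R?|(.+?)(?=<)
# Assumptions: [ \t] replaces \h and (?:\r\n|\r|\n) replaces \R.
# re.MULTILINE lets ^ match at each line start; without re.DOTALL the dot
# does not match a newline, which mirrors (?-s).
V4_PATTERN = re.compile(
    r"<.+?>|^[ \t]*(?:\r\n|\r|\n)?|(.+?)(?=<)",
    re.MULTILINE,
)


def extract_items_v4(text, eol="\n"):
    """Delete tags and leading blanks; keep each item on its own line."""

    def replace(match):
        # Group 1 missing: a tag or leading blank chars matched, so delete it.
        # Group 1 present: an item matched, so rewrite it with a line-break.
        item = match.group(1)
        return item + eol if item is not None else ""

    return V4_PATTERN.sub(replace, text)


# Hypothetical sample, NOT the original poster's file: a blank line, some
# leading spaces, then a one-line drop-down list with a lowercase item.
SAMPLE = (
    "  \n"
    "  <select><option>Africa</option>"
    "<option>asia minor</option></select>"
)

print(extract_items_v4(SAMPLE), end="")
```

On this sample, the lowercase item asia minor is kept, which the 3rd version would have missed.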
Best Regards,
guy038