Hi, All,
Regex updated and Notes section added on 08/22/2017, 10:45 am ( French time zone )
I just realized that my previous regex can be extended to reorganize more than two lists !
For instance, let's suppose I duplicate the complete lists of family names and given names from my previous post, prefixing the copies with the new symbols = and _, in order to get the text below :
#Smith
#Jones
#Taylor
#Brown
#Williams
#Wilson
#Johnson
#Davies
#Robinson
#Wright
#Thompson
#Evans
#Walker
#White
#Roberts
#Green
#Hall
#Wood
#Jackson
#Clarke
@Oliver
@Amelia
@Jack
@Olivia
@Harry
@Emily
@George
@Isla
@Charlie
@Ava
@Jacob
@Jessica
@Thomas
@Ella
@Noah
@Isabella
@William
@Poppy
@Oscar
@Mia
=Smith
=Jones
=Taylor
=Brown
=Williams
=Wilson
=Johnson
=Davies
=Robinson
=Wright
=Thompson
=Evans
=Walker
=White
=Roberts
=Green
=Hall
=Wood
=Jackson
=Clarke
_Oliver
_Amelia
_Jack
_Olivia
_Harry
_Emily
_George
_Isla
_Charlie
_Ava
_Jacob
_Jessica
_Thomas
_Ella
_Noah
_Isabella
_William
_Poppy
_Oscar
_Mia
Then, the regex S/R :
SEARCH (?-s)^#(.+)\R((?s).*?)@(.+)\R((?s).*?)=(.+)\R((?s).*?)_(.+\R?)
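REPLACE \1 \3 \5 \7\2\4\6 ( with a space character after \1, \3 and \5 )
would return, after 20 hits on the ALT + A shortcut ( Replace All ), the single shortened list below. Each Replace All pass makes one replacement only, because a match runs from the first # line down to the first _ line and no # line remains after it, hence one hit per name :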
Smith Oliver Smith Oliver
Jones Amelia Jones Amelia
Taylor Jack Taylor Jack
Brown Olivia Brown Olivia
Williams Harry Williams Harry
Wilson Emily Wilson Emily
Johnson George Johnson George
Davies Isla Davies Isla
Robinson Charlie Robinson Charlie
Wright Ava Wright Ava
Thompson Jacob Thompson Jacob
Evans Jessica Evans Jessica
Walker Thomas Walker Thomas
White Ella White Ella
Roberts Noah Roberts Noah
Green Isabella Green Isabella
Hall William Hall William
Wood Poppy Wood Poppy
Jackson Oscar Jackson Oscar
Clarke Mia Clarke Mia
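For anyone who wants to try the idea outside Notepad++, here is a minimal Python sketch of the same S/R, run on a two-name sample. It is only an adaptation, under a few assumptions : Python's re module has no \R, so \n is used instead ( Unix line endings assumed ), the inline (?s) switches are written as scoped (?s:...) groups, and the merge_lists helper and the sample text are purely illustrative names, not part of the original regex. The loop applies one replacement per pass, which mimics the repeated hits on Replace All :

```python
import re

# Python adaptation of the search regex (an assumption, not the Notepad++ syntax):
# \R becomes \n, and each inline (?s) becomes a scoped group (?s:...),
# because Python's re module has no \R and rejects flags set mid-pattern.
SEARCH = re.compile(
    r"^#(.+)\n((?s:.*?))@(.+)\n((?s:.*?))=(.+)\n((?s:.*?))_(.+\n?)",
    re.MULTILINE,
)
REPLACE = r"\1 \3 \5 \7\2\4\6"


def merge_lists(text):
    """Hypothetical helper: replace one match per pass until nothing matches."""
    passes = 0
    while True:
        text, hits = SEARCH.subn(REPLACE, text, count=1)
        if hits == 0:
            return text, passes
        passes += 1


# Illustrative two-name sample, built like the four lists of the post.
sample = "#Smith\n#Jones\n@Oliver\n@Amelia\n=Smith\n=Jones\n_Oliver\n_Amelia\n"
merged, passes = merge_lists(sample)
print(merged, end="")
print("passes:", passes)
```

On this small sample, two passes are needed, one per name, just as the full text needs one hit per name.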
Notes :
The first part, (?-s), means that, by default, the dot meta-character matches a single standard character only, never an End of Line character
Then, the part ^#(.+)\R represents the first complete line beginning with the # symbol, followed by its End of Line character(s), with the part after the # symbol stored as group 1
Any part of the form ((?s).*?) is the smallest multi-line range of characters ( standard or EOL ones ) up to the next User-symbol ( @, = or _ ), stored as group 2, 4 or 6. The (?s) modifier, placed inside the parentheses, lets the dot match EOL characters within that group only
The parts @(.+)\R and =(.+)\R represent the first complete line beginning with the @ or = symbol, followed by its End of Line character(s), with the part after the symbol stored as group 3 or 5, respectively
The last part, _(.+\R?), stands for the first complete line beginning with the _ symbol, followed by optional End of Line character(s) ; the part after the _ symbol, together with these EOL character(s), is stored as group 7
In the replacement, the first part, \1 \3 \5 \7, rewrites the four lines, without their initial User-symbols and separated by space characters, as a single line, ended by the End of Line character(s) captured in group 7
Then, the remainder of the four lists, \2\4\6, is simply rewritten, without any change !
The table below marks the beginning of each of the seven defined groups :
--------------1-----2---------3-----4---------5-----6---------7------
SEARCH (?-s)^#(.+)\R((?s).*?)@(.+)\R((?s).*?)=(.+)\R((?s).*?)_(.+\R?)
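To see concretely what each of the seven groups holds on the first match, here is a small Python sketch. Again, this is an adaptation and not the Notepad++ syntax : \n replaces \R ( Unix line endings assumed ), the inline (?s) switches are written as scoped (?s:...) groups, and the two-name sample is just an illustrative text :

```python
import re

# Python adaptation of the search regex (assumption: \n instead of \R,
# scoped (?s:...) groups instead of inline (?s) switches).
SEARCH = re.compile(
    r"^#(.+)\n((?s:.*?))@(.+)\n((?s:.*?))=(.+)\n((?s:.*?))_(.+\n?)",
    re.MULTILINE,
)

# Illustrative two-name sample, built like the four lists of the post.
sample = "#Smith\n#Jones\n@Oliver\n@Amelia\n=Smith\n=Jones\n_Oliver\n_Amelia\n"
match = SEARCH.search(sample)

# Print every group with repr() so that the EOL characters stay visible.
for number, value in enumerate(match.groups(), start=1):
    print(number, repr(value))
```

Groups 1, 3, 5 and 7 hold the first name of each list, while groups 2, 4 and 6 hold whatever remains of the first three lists, which is exactly what the replacement writes back unchanged.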
Cheers,
guy038