Regex Macro: Creating Macro to Replace variable text with text determined by text on next line
- 
I would use the macro twice - once for 1841Census, then again for 1861Census (unless you can create a macro that will replace all occurrences iteratively!)
 - 
Hello, @john-slee and All,
You said :
e.g. In the following text I want to Replace all occurrences of X382 by 1841Census and all occurrences of X391 by 1861Census
If so, the correct regex syntax which will process all values in one go, is :
SEARCH
(?-i)\bX3((82)|91)\b
REPLACE
18(?{2}4:6)1Census
Notes :
- The part (?-i) forces a case-sensitive search
- The part X3 looks for the string X3, with that exact case
- The part ((82)|91) means that the string X3 must be followed by either the number 82 or the number 91
- The inner parentheses represent group 2. So, if it matches the string X391, then group 2 is not defined
- The two \b assertions force the string X… to be surrounded by non-word chars for matching
- In replacement :
  - It first writes the string 18
  - According to group 2, the part (?{2}4:6) writes digit 4 if group 2 is defined, or digit 6 otherwise
  - It finally writes the string 1Census
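For readers who want to try the same logic outside Notepad++, here is a minimal sketch in plain Python. It is an approximation, not the S/R above: Python's `re` module has no conditional replacement like (?{2}4:6), so a replacement function (the hypothetical `census_label`) stands in for it, and the sample string is made up for illustration.

```python
import re

# Same search as above; re is case-sensitive by default, so (?-i) is not needed.
PATTERN = re.compile(r"\bX3((82)|91)\b")

def census_label(match):
    # Group 2 is set only when "82" matched; otherwise "91" matched.
    decade_digit = "4" if match.group(2) is not None else "6"
    return "18" + decade_digit + "1Census"

# Made-up sample: the last two tokens must stay untouched
# (word boundary and case-sensitivity respectively).
sample = "X382 and X391, but not X3820 or x382"
result = PATTERN.sub(census_label, sample)
print(result)
```

The function does what the conditional does in the replacement field: it checks whether group 2 took part in the match and picks the digit accordingly.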
 
 - 
 
Of course, if you need to run this regex S/R very often, it would be sensible to record this S/R in a macro, and use it with a shortcut ;-))
Best Regards,
guy038
 - 
 - 
@guy038 Thanks for your suggestion. However, I should have made it clearer that the text to be replaced can vary and is not predictable. I therefore need to replace any text between ===<span id=' and '> with the relevant text of the form nnnnCensus, with nnnn being taken from the first four digits of the following line.
 - 
Hello, @john-slee and All,
Ah… I’m sorry ! I should have examined your text more carefully. No trouble, there’s a solution, anyway !
If you use the Replace All button, exclusively, here is the right regex S/R, which may also be used in a macro :
SEARCH
(?-si)===<span id='\K.+?(?='.+\R(\d+))
REPLACE
\1Census
Notes :
- The (?-si) in-line modifiers force the regex engine :
  - To consider any dot . symbol as matching a single standard char, and not any EOL character
  - To process the S/R in a case-sensitive way
- Then, the part ===<span id=' matches the identical string ===<span id='
- The special \K syntax discards everything matched so far, so that the final match begins at the current position of the regex engine
- Therefore, the part .+? matches the shortest range of standard characters ( our string X### )
- With the condition, due to the look-ahead structure (?=.........), that it must be followed by :
  - A single quote and some standard characters '.+
  - Followed by the line-break \R of the current line
  - Followed by some digit characters, stored as group 1, due to the embedded parentheses (\d+)
- In the replacement \1Census, the match ( string X### ) is replaced with the number located at the beginning of the next line, followed by the string Census
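The same idea can be tried in plain Python, with two substitutions that are assumptions of this sketch rather than part of the S/R above: Python's `re` supports neither \K nor \R, so a look-behind stands in for \K and \r?\n stands in for \R. The sample text is the two-line example from this thread.

```python
import re

# Look-behind (?<=...) replaces \K: the prefix must precede the match
# but is not part of it. \r?\n replaces \R.
PATTERN = re.compile(r"(?<====<span id=').+?(?='.+\r?\n(\d+))")

sample = (
    "===<span id='X382'>X382</span>===\n"
    "1841 England - Census transcript - John COOMBE - Household\n"
)

# Group 1 is captured inside the look-ahead, from the start of the next line.
result = PATTERN.sub(r"\g<1>Census", sample)
print(result)
```

Only the text between the quotes is rewritten; the X382 after the `>` is left alone, as with the S/R described above.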
Now, if you want to see, at once, the result of each step by step replacement, use this alternate syntax :
SEARCH
(?-si)(===<span id=').+?(?='.+\R(\d+))
REPLACE
\1\2Census
Notes :
- The \K syntax is not present. So, the literal string ===<span id=' is itself embedded in parentheses, as group 1 (===<span id='), and will be re-used in replacement. And (\d+) represents group 2
- In replacement, it first rewrites the beginning of the current line \1, followed by the number \2 found at the beginning of the 2nd line, then the string Census
Cheers,
guy038
 - 
 - 
@guy038 Thanks again. However (again - I hope I’m not pushing my luck by asking once more!) this only replaces the first instance of the string. I need it to replace every occurrence of each string in the document.
i.e. In this document, every time X382 occurs it should be replaced by 1841Census and every X391 should be replaced by 1861Census.
Is this possible?
 - 
You have more “logic” in your problem statement than a regular expression can handle, I’m afraid.
In such cases you should probably turn to a scripting plugin, e.g. Pythonscript, that can work with regular expression data, but can also incorporate more logic into it.
 - 
Maybe something like this:
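# Collect (old id, year) pairs from each definition line and the line after it
search_repl_pairs_list = []
editor.research(r"(?-s)===<span id='(\D\d\d\d).+\R(\d{4})", lambda m: search_repl_pairs_list.append((m.group(1), m.group(2))))
# Replace every occurrence of each old id with its nnnnCensus label
for tup in search_repl_pairs_list:
    editor.replace(tup[0], tup[1] + 'Census')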
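The two-pass idea of the script above (first collect the pairs, then replace everywhere) can also be tried on an ordinary string, without Notepad++. The following is only a sketch under assumptions: the function name `relabel_census_ids` is hypothetical, \r?\n stands in for \R, and the first and last lines of the sample text are invented to show a reference that comes before its definition.

```python
import re

def relabel_census_ids(text):
    # Pass 1: collect (old id, year) pairs from each definition line
    # and the four digits that start the following line.
    pairs = re.findall(r"===<span id='(\D\d\d\d).+\r?\n(\d{4})", text)
    # Pass 2: replace every occurrence of each old id, wherever it appears.
    for old_id, year in pairs:
        text = text.replace(old_id, year + "Census")
    return text

sample = (
    "See X391 for the later household.\n"
    "===<span id='X382'>X382</span>===\n"
    "1841 England - Census transcript - John COOMBE - Household\n"
    "===<span id='X391'>X391</span>===\n"
    "1861 England - Census transcript\n"
)
print(relabel_census_ids(sample))
```

Because all the pairs are known before any replacement starts, a reference placed above its definition is handled the same way as one placed below it.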
Hi, @john-slee and All,
No problem. Let’s give it a new try !
From the initial lines :
===<span id='X382'>X382</span>===
1841 England - Census transcript - John COOMBE - Household
The regex S/R, described in my previous post, changed it as below :
===<span id='1841Census'>X382</span>===
1841 England - Census transcript - John COOMBE - Household
But maybe you would expect the following result, where the string X382 is changed both inside the single quotes and outside :
===<span id='1841Census'>1841Census</span>===
1841 England - Census transcript - John COOMBE - Household
If so, use this new regex S/R, below :
SEARCH
(?-si)(===<span id=).+?(?=</span>.+\R(\d+))
REPLACE
\1'\2Census'>\2Census
Which can be used, indifferently, with the Replace or the Replace All buttons !
Cheers,
guy038
P.S. :
Of course, Alan’s Python script is more powerful !
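For anyone wishing to check the last S/R of this post outside Notepad++, here is a rough equivalent in plain Python. One assumption is needed: Python's `re` does not support \R, so \r?\n is used instead. The sample is the two-line example quoted in this post.

```python
import re

# Group 1 keeps the line prefix; group 2 is captured inside the look-ahead
# from the digits that start the following line.
PATTERN = re.compile(r"(===<span id=).+?(?=</span>.+\r?\n(\d+))")

sample = (
    "===<span id='X382'>X382</span>===\n"
    "1841 England - Census transcript - John COOMBE - Household\n"
)

# Rewrite both the quoted id and the text between the tags in one replacement.
result = PATTERN.sub(r"\g<1>'\g<2>Census'>\g<2>Census", sample)
print(result)
```

The \g<1> and \g<2> forms are Python's unambiguous spelling of \1 and \2, useful here because the group reference is directly followed by other characters.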
 - 
@guy038 said in Regex Macro: Creating Macro to Replace variable text with text determined by text on next line:
Alan’s python script is more powerful !
Indeed, especially when one notices in the data that some of the references come before the definitions!
 - 
@Alan-Kilborn Thank you so much, Alan. It’s years since I did any proper programming, though that was a previous occupation. Guess I’m going to have to teach myself to use Python!