Swapping Blocks of Code with Regex
-
Fellow Notepad++ Users,
Could you please help me with the following search-and-replace problem I am having?
I am trying to swap two blocks of code. The first block (Capturing Group 1) is the first 3 lines below, and the second block (Capturing Group 2) is the last 4 lines below.
Here is the data I currently have (“before” data):
<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
<link rel="stylesheet" href="../../../../../css/hymn.css">
<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
Here is how I would like that data to look (“after” data):
<link rel="stylesheet" href="../../../../../css/hymn.css">
<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
To accomplish this, I have tried using the following Find/Replace expressions and settings:
Find What = `(<link rel="prev" href=.+?>\R<link rel="next" href=.+?>\R<link rel="up" href=.+?>\R)(?=(<link rel="stylesheet".+?myscripts.js"></script>\R))`
Replace With = `\2\1`
Search Mode = REGULAR EXPRESSION
Dot Matches Newline = CHECKED
My intent was to swap Capturing Group 1 with Capturing Group 2.
But instead, the output was as if my “Replace With” expression was “\2\1\2”.
That is, I ended up with an additional copy of Capturing Group 2 at the end. Obviously this was not the output I desired, and I’m not sure why.
Could you please help me understand what went wrong and help me find the solution? Thank you.
-
A visual of the difference might help:
-
@Dick-Adams-0 said in Swapping Blocks of Code with Regex:
That is, I ended up with an additional copy of Capturing Group 2 at the end.
Obviously this was not the output I desired, and I’m not sure why.
The `(?=…)` lookahead does not consume any of those characters. So everything after your `\R` is still in the document and is not part of the replacement: the replace swaps everything from the start of the match through the `\R` for `\2\1`, and then the text that was in the lookahead (essentially `\2`) still remains after the replacement is done. Just remove the lookahead wrapper, and it should do what you want.
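The same effect can be reproduced outside Notepad++. Here is a minimal sketch in Python, not the Notepad++ (Boost) engine itself: Python's `re` module has no `\R`, so `\r?\n` is used as a stand-in, and the dot in `myscripts.js` is escaped. The variable names (`block1`, `block2`, `wrong`, `right`) are only illustrative.

```python
import re

# The two blocks from the "before" data.
block1 = (
    '<link rel="prev" href="../d/myfile.htm">\n'
    '<link rel="next" href="../../../l/a/n/ilandsas.htm">\n'
    '<link rel="up" href="../../../../../ttl/ttl-i.htm">\n'
)
block2 = (
    '<link rel="stylesheet" href="../../../../../css/hymn.css">\n'
    '<script src="../../../../../js/jquery.js"></script>\n'
    '<script src="../../../../../js/base.js"></script>\n'
    '<script src="../../../../../js/myscripts.js"></script>\n'
)
before = block1 + block2

NL = r'\r?\n'  # assumption: stand-in for \R, which Python's re lacks
group1 = (r'(<link rel="prev" href=.+?>' + NL
          + r'<link rel="next" href=.+?>' + NL
          + r'<link rel="up" href=.+?>' + NL + ')')
group2 = r'(<link rel="stylesheet".+?myscripts\.js"></script>' + NL + ')'

# Group 2 inside a lookahead: it is captured, but not consumed by the match.
with_lookahead = re.compile(group1 + '(?=' + group2 + ')', re.DOTALL)
# Group 2 as an ordinary group: it is captured and consumed.
without_lookahead = re.compile(group1 + group2, re.DOTALL)

wrong = with_lookahead.sub(r'\2\1', before)
right = without_lookahead.sub(r'\2\1', before)

print(wrong == block2 + block1 + block2)  # the extra copy of group 2 is left behind
print(right == block2 + block1)           # the intended swap
```

With the lookahead, only the text of group 1 is replaced, so the original group 2 text is still sitting after the replacement.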
-
I tried removing the “lookahead wrapper” like this:
(<link rel="prev" href=.+?>\R<link rel="next" href=.+?>\R<link rel="up" href=.+?>\R)(?=<link rel="stylesheet".+?hymn.js"></script>\R)
but now it doesn’t replace anything.
So I tried removing the look ahead wrapper a different way:
?=(<link rel="stylesheet".+?myscripts.js"></script>\R)
That gave me an “invalid regex” error.
-
@Dick-Adams-0 said in Swapping Blocks of Code with Regex:
tried removing "the look ahead wrapper like this:
In short: the “lookahead wrapper” portion is everything but `XXX` in `(?=XXX)`. So I was trying to tell you to remove the `(?=` and the `)` but leave the `XXX` in your regex.
So your original regex was
(<link rel="prev" href=.+?>\R<link rel="next" href=.+?>\R<link rel="up" href=.+?>\R)(?=(<link rel="stylesheet".+?myscripts.js"></script>\R))
If I just remove the `(?=` which started the lookahead and the `)` which ended it, but leave capture group #2, which used to be inside, I end up with this regex:
(<link rel="prev" href=.+?>\R<link rel="next" href=.+?>\R<link rel="up" href=.+?>\R)(<link rel="stylesheet".+?myscripts.js"></script>\R)
Using that regex, when I start with “before data”:
<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
<link rel="stylesheet" href="../../../../../css/hymn.css">
<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
and run that regex, I end up with “after data”:
<link rel="stylesheet" href="../../../../../css/hymn.css">
<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
This matches your “after data”.
-
Hello, @dick-adams-0, @alan-kilborn, @peterjones and All,
Here is my solution, which can be applied in all cases of text swapping ;-))
- First, you add a specific character, NOT YET used in your current file, in front of each concerned block of text
Thus, if I choose the ― character (Unicode char HORIZONTAL BAR, `\x{2015}`), your INPUT text is temporarily changed to:
―<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
―<link rel="stylesheet" href="../../../../../css/hymn.css">
―<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
―
- Then, use the regex S/R below:
    - SEARCH `(?xs) ^ ― ( .+? ) ― ( .+? ) ― ( .+? ) ― \R`
    - REPLACE `\2\3\1`
You’ll get your expected OUTPUT text :
<link rel="stylesheet" href="../../../../../css/hymn.css">
<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
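For anyone who wants to check the marker method outside Notepad++, here is a rough Python sketch of the same three-block S/R. It assumes `\r?\n` as a substitute for `\R` (which Python's `re` does not support), and the names `links`, `stylesheet`, `scripts` and `result` are made up for the illustration.

```python
import re

MARK = "\u2015"  # HORIZONTAL BAR, the marker chosen above

links = (
    '<link rel="prev" href="../d/myfile.htm">\n'
    '<link rel="next" href="../../../l/a/n/ilandsas.htm">\n'
    '<link rel="up" href="../../../../../ttl/ttl-i.htm">\n'
)
stylesheet = '<link rel="stylesheet" href="../../../../../css/hymn.css">\n'
scripts = (
    '<script src="../../../../../js/jquery.js"></script>\n'
    '<script src="../../../../../js/base.js"></script>\n'
    '<script src="../../../../../js/myscripts.js"></script>\n'
)

# One marker in front of each block, plus a closing marker on its own line.
marked = MARK + links + MARK + stylesheet + MARK + scripts + MARK + "\n"

# (?x) ignores the spaces in the pattern, (?s) lets the dot match line breaks.
pattern = re.compile(
    rf"(?xs) ^ {MARK} ( .+? ) {MARK} ( .+? ) {MARK} ( .+? ) {MARK} \r?\n"
)

# \2\3\1 : stylesheet, then scripts, then the three navigation links.
result = pattern.sub(r"\2\3\1", marked)
print(result)
```

The markers are consumed by the match and are not part of any group, so they disappear from the output without a separate clean-up step.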
Notes:
- You may choose any kind of character to begin a block of text which is to be moved! For instance, in my previous example, I could have chosen the BULLET char (`\x{2022}`) as shown below:
•<link rel="prev" href="../d/myfile.htm">
<link rel="next" href="../../../l/a/n/ilandsas.htm">
<link rel="up" href="../../../../../ttl/ttl-i.htm">
•<link rel="stylesheet" href="../../../../../css/hymn.css">
•<script src="../../../../../js/jquery.js"></script>
<script src="../../../../../js/base.js"></script>
<script src="../../../../../js/myscripts.js"></script>
•
- This method allows you to select very large multi-line blocks of text as well as simple single-line blocks!
Here is a general example which deals with 6 blocks of text:
- Group 1 contains 3 lines
- Group 2 contains 3 lines
- Group 3 contains 4 lines
- Group 4 contains 1 line
- Group 5 contains 2 lines
- Group 6 contains 5 lines
So, let’s begin this text with a `¤` character in front of lines A, D, G, K, L and N:
¤line A Group 1
Line B Group 1
Line C Group 1
¤Line D Group 2
Line E Group 2
Line F Group 2
¤Line G Group 3
Line H Group 3
Line I Group 3
Line J Group 3
¤Line K Group 4
¤Line L Group 5
Line M Group 5
¤Line N Group 6
Line O Group 6
Line P Group 6
Line Q Group 6
Line R Group 6
¤
Then, the following regex S/R:
- SEARCH `(?xs) ^ ¤ ( .+? ) ¤ ( .+? ) ¤ ( .+? ) ¤ ( .+? ) ¤ ( .+? ) ¤ ( .+? ) ¤ \R`
- REPLACE `\3\4\6\2\5\1`
would produce this expected OUTPUT text :
Line G Group 3
Line H Group 3
Line I Group 3
Line J Group 3
Line K Group 4
Line N Group 6
Line O Group 6
Line P Group 6
Line Q Group 6
Line R Group 6
Line D Group 2
Line E Group 2
Line F Group 2
Line L Group 5
Line M Group 5
line A Group 1
Line B Group 1
Line C Group 1
Notes:
- As you can verify, the resulting order respects the replacement numbering of the groups!
- The single-line block (Group 4) was moved upwards
- The two-line block (Group 5) keeps its fifth position in the block order
As a conclusion, if we don’t use the free-spacing mode (`(?x)`), there are just TWO rules to remember:
- Between the leading part `(?s)^` and the trailing part `¤\R`, just add, in the search regex, as many `¤(.+?)` syntaxes as the number of blocks concerned by the replacement
- In the replacement regex, just re-organize your blocks, from 1 to n, however you want! You could even duplicate some blocks using, from the above example, the replacement syntax `\3\4\3\6\2\5\2\1\6` ;-))
Best Regards,
guy038
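The two rules can also be written as a small helper that builds the search and replacement strings for any number of blocks. This is only a sketch in Python, under the same assumption as before (`\r?\n` instead of `\R`); the function name `reorder_blocks` and its parameters are hypothetical and not part of Notepad++.

```python
import re
from typing import Sequence


def reorder_blocks(text: str, n_blocks: int, order: Sequence[int],
                   mark: str = "¤") -> str:
    """Reorder marker-delimited blocks; `order` lists 1-based block numbers."""
    m = re.escape(mark)
    # Rule 1: between "(?s)^" and the trailing marker + line break,
    # repeat "marker(.+?)" once per block. (?m) lets ^ match at a line start.
    pattern = "(?ms)^" + (m + "(.+?)") * n_blocks + m + r"\r?\n"
    # Rule 2: the replacement is just the group references in the new order.
    replacement = "".join(f"\\g<{i}>" for i in order)
    return re.sub(pattern, replacement, text, count=1)


groups = [
    "line A Group 1\nLine B Group 1\nLine C Group 1\n",
    "Line D Group 2\nLine E Group 2\nLine F Group 2\n",
    "Line G Group 3\nLine H Group 3\nLine I Group 3\nLine J Group 3\n",
    "Line K Group 4\n",
    "Line L Group 5\nLine M Group 5\n",
    "Line N Group 6\nLine O Group 6\nLine P Group 6\nLine Q Group 6\nLine R Group 6\n",
]
# A marker in front of each block, plus a closing marker on its own line.
marked = "".join("¤" + g for g in groups) + "¤\n"

swapped = reorder_blocks(marked, 6, [3, 4, 6, 2, 5, 1])
duplicated = reorder_blocks(marked, 6, [3, 4, 3, 6, 2, 5, 2, 1, 6])
print(swapped)
```

`\g<n>` is used in the replacement instead of `\n` so that a group number is never misread when it is followed by another digit.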