Replace string, but maintain substring (convert Markdown to HTML image)
-
I have 500 Markdown files where I need to replace a complete line with another line. This surely can be done using Regex, but I have no clue how.
I want to replace this line (last words of line differ in each file):
![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"} + Variable text
with:
<img src="/uploads/1kln-2hdw-4h50/image1.jpeg" height="300">
Note that the ‘1kln-2hdw-4h50’ is reused in the new line.
-
Hello, @jeroen-borgman, and All,
Again, a regex S/R is the solution !
I assumed that :
- The image names may be different
- The part 1kln-2hdw-4h50 may be different
So, given the input sample text, below :
![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"} bla bla blah
![](..\pregit\uploads\9ftu-7abc-10h27/media/My Image.jpeg){width="9.2333in" height="3.555556in"} Variable text
![](..\pregit\uploads\5gya-0hgz-2h36/media/image45678.png){width="10.0in" height="5.0in"} This a test : ( .../media/... ) bla blah !
If you use the following regex S/R :
SEARCH
(?x-is) \Q![](..\pregit\uploads\\E (.+?) /media (/.+?) \) .+
REPLACE
<img src="/uploads/\1\2" height="300">
You’ll get the modified text :
<img src="/uploads/1kln-2hdw-4h50/image1.jpeg" height="300">
<img src="/uploads/9ftu-7abc-10h27/My Image.jpeg" height="300">
<img src="/uploads/5gya-0hgz-2h36/image45678.png" height="300">
Hope that it’s your expected output text ;-))
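For readers who would rather script the conversion than run it from the Find dialog, the same search and replace can be sketched in Python. This is only an illustration under assumptions, not part of the Notepad++ solution: Python's re module has no \Q...\E syntax, so re.escape builds the literal prefix instead, and the helper name convert_line is hypothetical.

```python
import re

# Literal prefix of the Markdown image line. re.escape stands in for \Q...\E,
# which Python's re module does not support.
PREFIX = re.escape("![](..\\pregit\\uploads\\")

# Same structure as the search regex above: group 1 is the folder id,
# group 2 is "/" plus the image name, and .+ consumes the rest of the line.
PATTERN = re.compile(PREFIX + r"(.+?)/media(/.+?)\).+")

def convert_line(line: str) -> str:
    """Hypothetical helper: rewrite one Markdown image line as an <img> tag."""
    return PATTERN.sub(r'<img src="/uploads/\1\2" height="300">', line)

sample = ('![](..\\pregit\\uploads\\1kln-2hdw-4h50/media/image1.jpeg)'
          '{width="5.1722222222222225in" height="2.5868055555555554in"} bla bla blah')
print(convert_line(sample))
```

Lines that do not start with the Markdown image prefix are returned unchanged, because re.sub only touches the text that matches.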
Notes :
- First, the in-line modifiers (?x-is) mean :
  - Any space char in the regex is ignored by the regex engine and just helps the user to better identify the different sections of the search regex ( (?x) ). In this mode, if you need to search for a space character, use either :
    - \x20 ( the escaped form )
    - [ ] ( a space char, in a character class )
    - A \ symbol, followed with a space char
  - Any dot regex char ( . ) matches a single standard character only, and not an EOL one ( (?-s) )
  - The search is carried out in a case-sensitive way ( (?-i) )
- Then, the part \Q![](..\pregit\uploads\\E simply delimits a literal string, between the two \Q and \E syntaxes, to be matched with that exact case
- Then, the part (.+?) matches the shortest string of any characters before the /media string, stored as group 1 because of the enclosing parentheses
- Now, the part /media matches the literal string /media, with that exact case
- And the following part (/.+?) looks for a slash symbol / followed with the shortest string of any characters before a closing parenthesis \) , stored as group 2 because of the enclosing parentheses
- Then, the part \) matches a literal closing parenthesis
- And, finally, the part .+ matches all the remaining standard characters of the current line
- In replacement, the whole contents of the current line are replaced with :
  - The part <img src="/uploads/ , which rewrites this exact expression, first
  - The part \1\2 , which rewrites the contents of groups 1, then 2
  - The part " height="300"> , which rewrites this exact expression
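The effect of the (?x-is) modifiers described in the notes can be checked with a small Python sketch. It rests on two assumptions: Python does not accept the (?x-is) spelling at the start of a pattern, so the re.VERBOSE flag stands in for (?x), and Python's default flags already behave like (?-is). The variable names are illustrative only.

```python
import re

text = "My Image.jpeg"

# Verbose mode ignores the unescaped space, so the pattern is read as "MyImage".
ignored = re.search(r"My Image", text, re.VERBOSE)

# Each of the three forms listed in the notes matches a real space in verbose mode.
space_forms = [r"My\x20Image", r"My[ ]Image", r"My\ Image"]
matches = [bool(re.search(p, text, re.VERBOSE)) for p in space_forms]

# Default flags behave like (?-s): the dot stops at the line break.
one_line = re.search(r"a.+", "abc\ndef").group()

# Default flags behave like (?-i): "/media" does not match "/MEDIA".
case_hit = re.search(r"/media", "/MEDIA/")

print(ignored, matches, repr(one_line), case_hit)
```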
Best Regards,
guy038
P.S. :
You must be aware of a fundamental difference in regex syntaxes containing variable quantifiers, like * , + , ? , {n,} and {n,m} :
- You may use the quantifier by itself ( greedy behaviour : the longest possible match )
- You may add the ? symbol right after the quantifier ( lazy behaviour : the shortest possible match )
For instance, the regex abc.+xyz may not match the same expressions as the regex abc.+?xyz does ! Against the text abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz :
- The regex abc.+xyz would match the string abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz, i.e. the longest string between the abc and the xyz strings
- Whereas the regex abc.+?xyz would match the string abcdefghijklmnopqrstuvwxyz, i.e. the shortest string between the abc and the xyz strings
Jeroen, just try removing one ? symbol, or both, in the search regex above. As you can see, the third line of the sample text is then wrongly replaced, because a greedy quantifier runs on to the later ( .../media/... ) part of that line :-((
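The greedy versus lazy difference can be reproduced with a short Python sketch, first on the alphabet example and then on the third sample line. As an assumption, re.escape replaces the \Q...\E part, which Python's re module does not support, and both ? symbols are removed in the greedy variant.

```python
import re

text = "abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz"

greedy = re.search(r"abc.+xyz", text).group()   # longest possible match
lazy = re.search(r"abc.+?xyz", text).group()    # shortest possible match
print(greedy)
print(lazy)

# Third sample line: its trailing "( .../media/... )" misleads greedy groups.
prefix = re.escape("![](..\\pregit\\uploads\\")
line = ('![](..\\pregit\\uploads\\5gya-0hgz-2h36/media/image45678.png)'
        '{width="10.0in" height="5.0in"} This a test : ( .../media/... ) bla blah !')
replacement = r'<img src="/uploads/\1\2" height="300">'

lazy_result = re.sub(prefix + r"(.+?)/media(/.+?)\).+", replacement, line)
greedy_result = re.sub(prefix + r"(.+)/media(/.+)\).+", replacement, line)
print(lazy_result)
print(greedy_result)
```

With the lazy groups, group 1 stops at the first /media and group 2 at the first closing parenthesis. With the greedy groups, both stretch towards the last candidates on the line, so the src attribute picks up unrelated text.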
-
@guy038 said in Replace string, but maintain substring (convert Markdown to HTML image):
<img src="/uploads/\1\2" height="300">
WoW! pure magic!
Thanks for this, Guy. I not only like the solution, but also the explanation.
Hi, @jeroen-borgman, and All,
Thanks for your comment !
The Free Spacing regex mode also allows you to place the different parts of your regex on consecutive lines, with optional comments after a # character, as below :
(?x)                          # FREE SPACING regex mode
(?-is)                        # DOT regex char = 1 STANDARD char and search SENSITIVE to case
^                             # START of CURRENT line boundary ( Added to be more RIGOROUS ! )
\Q![](..\pregit\uploads\\E    # LITERAL string ![](..\pregit\uploads\
(.+?)                         # Part BETWEEN uploads\ and /media ( Group 1 )
/media                        # LITERAL string /media
(/.+?)                        # Image NAME ( Group 2 )
\)                            # LITERAL string ) The ESCAPED form is necessary as PARENTHESES are REGEX chars !
.+                            # REMAINING chars of CURRENT line scanned
Just select all these lines and paste them in the Find what: field of the Find dialog ;-))
Note that if your regex must contain a # char, just use the escaped syntax \# or the character class [#]
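The same multi-line layout carries over to Python through the re.VERBOSE flag, as in the sketch below. It is an adaptation under assumptions rather than the Notepad++ regex itself: Python has no \Q...\E, so the literal prefix is escaped by hand, and re.MULTILINE is added so that ^ matches at the start of every line.

```python
import re

# Free-spacing version of the search regex, with one comment per part.
PATTERN = re.compile(r"""
    ^                                # start of the current line
    !\[\]\(\.\.\\pregit\\uploads\\   # literal Markdown image prefix
    (.+?)                            # folder id before /media (group 1)
    /media                           # literal string /media
    (/.+?)                           # slash plus image name (group 2)
    \)                               # literal closing parenthesis
    .+                               # remaining characters of the line
""", re.VERBOSE | re.MULTILINE)

sample = "\n".join([
    r'![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"} bla bla blah',
    r'![](..\pregit\uploads\9ftu-7abc-10h27/media/My Image.jpeg){width="9.2333in" height="3.555556in"} Variable text',
])
result = PATTERN.sub(r'<img src="/uploads/\1\2" height="300">', sample)
print(result)

# A literal "#" must be escaped or put in a class, or it would start a comment.
HASH_ESCAPED = re.compile(r"image \# \d+", re.VERBOSE)
HASH_CLASS = re.compile(r"image [#] \d+", re.VERBOSE)
```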
Cheers,
guy038